DEFENDING THE TRANSRELATIONAL™ MODEL
by Ken O’Flaherty

 

 

As someone familiar with the TransRelational™ Model (TRM), including as yet unpublished aspects, I feel compelled to respond to Josh Hewitt’s misguided critique of TRM, The TransRelational Model: Performance Concerns, posted on 11 November 2004, in the Google newsgroup "comp.databases.theory” [Ed. Note: There was an exchange in response to the critique, but both the critique and exchange were deleted by persons unknown. Any inquiries into the critique will be forwarded to Hewitt].

 

Mr. Hewitt's central difficulty stems from his lack of knowledge of those aspects of TRM that are most critical to his analysis. The centerpiece of Mr. Hewitt's critique rests on his contention that TRM likely performs updates and disk access poorly. Yet his knowledge of TRM is derived from sources that intentionally exclude coverage of how TRM operates efficiently in a dynamic disk-based environment - notably Appendix A of C. J. Date’s AN INTRODUCTION TO DATABASE SYSTEMS, 8th Ed., and published patents. Efficient update and disk operations represent crucial aspects of TRM. But information regarding efficient dynamic disk-based implementations of TRM is still available only under non-disclosure agreement, in works such as C. J. Date’s as yet unpublished book manuscript GO FASTER! THE TRANSRELATIONAL APPROACH TO DBMS IMPLEMENTATION [Ed. Note: See More on the “Final NULL in the Coffin” for a list of some of the advantages of TRM].

 

Given Mr. Hewitt's lack of access to such material, how did he arrive at the conclusion that TRM cannot perform updates and disk operations efficiently? He created his own hypothetical dynamic disk-based implementation of TRM, and proceeded to assess its performance. Upon finding that his imagined implementation performed poorly, he concluded that TRM was ill-suited for implementing a truly relational DBMS. In effect, he erected his own straw man, and then proceeded to knock it down.

 

How did Mr. Hewitt create his straw man? He took the data structures and algorithms he found in C.J. Date’s Appendix A, and applied them directly to a dynamic disk-based RDBMS, ignoring Dr. Date's explicit warning that these methods were designed only for use in a static main-memory environment.

 

C.J. Date's Appendix A is tutorial in nature. In order to get across some of the most basic concepts of TRM in a clear and concise manner, Date deliberately ignores updates and secondary storage considerations. He explains that Appendix A “adopts the fiction that the database is (a) read-only and (b) in main memory.” But he is careful to point out that the full TRM includes additional methods for handling disk access and updates, and warns readers: “Please do not conclude that TR[M] is good for read-only, main-memory databases only, however - it is not.” Unfortunately, Mr. Hewitt did just that. [Ed. Note: See Date’s own reaction to Hewitt, A Note on the TransRelational™ Model].

 

It is important here to draw a clear line of distinction between a model and an implementation. TRM, as described by Date in Appendix A, is a model - in fact, it is a subset of the full TRM. As with any model, there are many possible implementations of TRM. Mr. Hewitt’s imagined implementation is one of these - a very poor one, by his own analysis.

 

In making the false assumptions that TRM uses the same algorithms on disk as it uses in memory, and that it provides no new methods for handling updates, Mr. Hewitt ignores the possibility of inventions beyond those already published i.e. further complementary ideas that produce major improvements in disk and update performance. This brings to mind the error made by Rolls Royce rocket engineers in the late 1940’s, who proved to themselves conclusively that it was impossible to launch a craft into space using a rocket engine, and thus that spaceflight was impossible. Although their calculations were correct, their conclusion was obviously wrong. Their mistake was to not conceive the possibility of a companion invention—in that case the idea of a multi-stage rocket—which neatly solved the problem.

 

Mr. Hewitt’s analysis of TRM is analogous to the Rolls Royce analysis of a single-stage rocket. He was clearly handicapped by his lack of access to unpublished materials. Yet his reluctance to accept the possibility of companion inventions is especially odd in light of C.J. Date's explicit pronouncements. And it would not have been unreasonable for Mr. Hewitt to have asked whether an implementation of TRM had ever been built, and if so, if it had been performance-tested. From some of C.J. Date's statements in Appendix A (e.g. "the solutions have been implemented", and “Performance in TR[M] is orders of magnitude better than it is with a direct-image system”), one might assume that C.J. Date had seen a TRM implementation and had been able to assess its performance as, in fact, he had.

 

Not only had Date been exposed to a working TRM implementation—a prototype built by Required Technologies that included update and disk operations—but so have other highly respected database researchers and implementers. Moreover, several potential customers ran their own benchmarks against this prototype using their own real-world data and their own live complex queries. The results were extraordinary. In every case, TRM delivered orders-of-magnitude performance improvements over existing RDBMSs, in a large dynamic disk-based environment. These results can be demonstrated to anyone seriously interested in TRM.

 

While drawing attention to TRM's ability to provide blindingly fast performance in a dynamic disk-based environment, Date was not at liberty to disclose the complementary inventions that enabled this outcome. Given the existing non-disclosure agreements, the most that can currently be said is that the full TRM includes a variety of novel algorithms and data structures for handling dynamic disk-resident data, and that these typically result in database operations executing at or near main memory speeds, in a large dynamic disk-based environment, a fact fully consistent with the existing TRM benchmarks.

 

Unlike the TRM prototype, Mr. Hewitt’s imagined implementation runs into serious problems relating to disk splaying and slow binary searches. Date specifically references these issues in Appendix A, where he writes: “In a disk-based system, will not the zigzagging mean a lot of random access and terrible performance? And can binary search be made to work efficiently on the disk?” Constrained from disclosing confidential information, C.J. Date continues: “Such questions are indeed serious ones, but (sadly) this is not the place to address them; suffice it to say that they can be and have been addressed satisfactorily, and the solutions have been implemented.” I can only confirm that I too have been exposed to methods that neatly solve the splaying and binary search problems referenced by Mr. Hewitt. The result is performance far faster than B-trees, without the negative consequences B-trees entail, as can be verified by the TRM benchmarks.

 

Mr. Hewitt’s critique is inherently flawed. His arguments are only applicable to his incorrectly imagined implementation, which indeed performs poorly. Unfortunately, this leads him to the terribly erroneous conclusion that (in his words) “In its current form…TRM is not a feasible candidate for implementing a relational DBMS.”

 

Compare this conclusion to C.J. Date's. Unlike Mr. Hewitt, Dr. Date had the benefit of a detailed knowledge of the full TRM, including the algorithms and data structures that TRM uses to handle updates and disk access. And Date was also able to witness the performance of a TRM prototype, which confirmed the effectiveness of the TRM design.  Date’s conclusion was that TRM “is likely to prove the most significant development in this field since Codd gave us the relational model” and that TRM “allows us to build DBMSs that - at last! - truly deliver on the full promise of the relational model.”

 

However much I would love to reveal how TRM algorithms and data structures operate in a large dynamic disk-resident system, I regret I cannot do so at the present time. These methods are the intellectual property of Required Technologies, and, due to their extremely high value, can only be disclosed with adequate safeguards in place. Yet, I have some limited good news I can share.

 

Not only does the prototype implementation of TRM (referenced above) still exist, but also a full-blown commercial disk-based updatable RDBMS based on TRM (with standard SQL, ODBC, JDBC, and third-party tool interfaces, plus all standard subsystems) is nearly complete. But for certain business difficulties, this product would already be available. Preliminary performance testing has been most impressive. And steps are currently being taken to resolve the business difficulties, to enable completion of the product - and hopefully the publication of C.J. Date’s manuscript. I greatly hope that this comes about in the relatively near future. All will then be able to verify to their own satisfaction just how greatly TRM outperforms current-generation RDBMSs in a fully updatable disk-based environment.

 

 

Posted 1/28/05