Sunday, December 13, 2015

Moving in Circles: RDBMS-SQL Conflation & Logical-Physical Confusion

In my last post I demonstrated how disregard for the scientific foundation and history of a field, here, database management, leads to Moving in Circles. The piece I debunked was by the CTO of VoltDB, one in the "VVV" series of products by Michael Stonebraker (MS). I've recently come across The Traditional RDBMS Wisdom is All Wrong, a presentation by the man himself, that reinforces my point.

First, MS attaches the "old wisdom" to "traditional RDBMS's", of which there are none. There are traditional SQL DBMS's, characterized by poor relational fidelity. It's unfortunate that he reinforces rather than dispels the erroneous conflation of SQL DBMS with RDBMS. SQL is, more than anything else, responsible for the lack of true implementations of the RDM. And who should know this better than a participant in the design of a DBMS with certain advantages over SQL that lost to SQL? Remember Ingres' QUEL?

Second, here are the highlights of the "old wisdom" (paraphrased from this summary):

  • Row-based storage does not satisfy DWH;
  • Wrong data storage designs;
  • Buffer pools are not necessary (obviated by cheap memory);
  • Multithreading (due to latching overhead);
  • No dynamic locking;
  • Active-passive logging in clusters;
  • No use of anti-caching (when in-memory format matches the disk format).

I dare the reader to specify what these have to do with the R in RDBMS. Whether they are better or not (it usually depends on circumstances), connecting them in any way with RDBMS reinforces, rather than dispels, not only the conflation of the RDM with SQL implementations, but also the good old logical-physical confusion (LPC) that refuses to go away. Too many coders, too few thinkers.

There is hardly any writing in which I or other relational proponents have not reiterated the core relational principle of data independence in general and physical data independence (PDI) in particular. PDI gives RDBMS designers complete freedom to deploy any physical means at their disposal and change them at will to optimize performance, without disrupting applications and users.
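To make the principle concrete, here is a minimal toy sketch in Python. It is an illustration under stated assumptions, not how any actual DBMS is built: the names `Relation`, `RowStore`, `ColumnStore`, `restructure` and `high_earners` are all hypothetical, and the "relation" is deliberately simplified (no keys, types, or duplicate elimination). The only point is that the application function is written against the logical interface and keeps working, unchanged, when the physical representation underneath is swapped.

```python
from typing import Callable, Dict, Iterable, Iterator, List


class RowStore:
    """Hypothetical physical layout: one record per row."""

    def __init__(self) -> None:
        self._rows: List[Dict] = []

    def insert(self, row: Dict) -> None:
        self._rows.append(dict(row))

    def scan(self) -> Iterator[Dict]:
        return (dict(r) for r in self._rows)


class ColumnStore:
    """Hypothetical physical layout: one list of values per attribute."""

    def __init__(self, attributes: Iterable[str]) -> None:
        self._cols: Dict[str, List] = {a: [] for a in attributes}

    def insert(self, row: Dict) -> None:
        for name, column in self._cols.items():
            column.append(row[name])

    def scan(self) -> Iterator[Dict]:
        names = list(self._cols)
        # Reassemble logical rows from the per-attribute lists.
        for values in zip(*self._cols.values()):
            yield dict(zip(names, values))


class Relation:
    """Logical interface: the only thing applications are allowed to see."""

    def __init__(self, attributes: Iterable[str], storage) -> None:
        self.attributes = tuple(attributes)
        self._storage = storage

    def insert(self, **row) -> None:
        if set(row) != set(self.attributes):
            raise ValueError("row does not match the relation's attributes")
        self._storage.insert(row)

    def restrict(self, predicate: Callable[[Dict], bool]) -> List[Dict]:
        return [r for r in self._storage.scan() if predicate(r)]

    def restructure(self, new_storage) -> None:
        """Change the physical representation; the logical content is untouched."""
        for r in self._storage.scan():
            new_storage.insert(r)
        self._storage = new_storage


def high_earners(employees: Relation) -> List[str]:
    """'Application' code: knows attribute names, nothing about storage."""
    return sorted(r["name"] for r in employees.restrict(lambda r: r["salary"] > 50000))


attrs = ("name", "salary")
employees = Relation(attrs, RowStore())
employees.insert(name="Ann", salary=70000)
employees.insert(name="Bob", salary=40000)
employees.insert(name="Cy", salary=90000)

before = high_earners(employees)
employees.restructure(ColumnStore(attrs))  # physical change only
after = high_earners(employees)
```

Here `before` and `after` are the same answer from the same application code; only the storage changed. In this toy the swap is triggered by hand, whereas in a DBMS with PDI it would be the system's or the DBA's decision, invisible to applications.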

Note: Ironically, the RDM is criticized for PDI, e.g., "what good is the RDM if it 'does not deal' with physical implementation?", which is upside down and backwards.

Had DBMS implementers ever acquired a sufficient grasp of the RDM and truly implemented it instead of SQL, we would have had products adaptable not only to new physical hardware and software innovations (and changing costs), but also to multiple/varying use cases. Instead of several single-storage DBMS's a la MS's VVV series, we could have had RDBMS's that each implement several storage methods and deploy them on demand--that was the vision and is the full import of the RDM. This will probably be dismissed as utopia and it will remain so as long as foundation knowledge and the way the industry operates continue to deteriorate.

46 years after Codd's introduction of the RDM, PDI becomes "new wisdom"--if this is not moving in circles, I don't know what is.
