Friday, August 7, 2020

OBG: Data Models and Physical Independence

Note: To appreciate the stability of a sound foundation vs the industry's fad-driven cookbook practice, I am re-publishing some of the articles and reader exchanges from the old site (2000-06), giving you the opportunity to judge for yourself how well my claims/arguments hold up and whether the industry has progressed at all. I am adding comments on re-publication where necessary. Long pieces are broken into smaller parts for fast reading.

From "Little Relationship to Relational" originally posted on March 29, 2001.

“E.F. ("Ted") Codd conceived of his relational model for databases while working at IBM in 1969. Codd's approach took a clue from first-order predicate logic, the basis of a large number of other mathematical systems and presented itself [sic] in terms of set theory, leaving the physical definition of the data undefined and implementation dependent. In June of 1970, Codd laid down much of his extensive groundwork for the model in his article, "A Relational Model of Data for Large Shared Data Banks" published in the Communications of the ACM, a highly regarded professional journal published by the Association for Computing Machinery. Buoyed by an intense reaction against the ad hoc data models offered by the physically oriented mainframe databases, Codd's rigid separation of the logical model, with its rigorous mathematical underpinnings, from the less elegant realities of hardware engineering was revolutionary in its day. Codd and his relational ideas blazed across the academic computing landscape over the next few years.”

A better expression would be 'leaving physical representation and access implementer-defined'. It is very important in this context, however, to dispel a prevalent misconception. There is a tendency in the industry to criticize the relational model for "ignoring the physical level" -- after all, practitioners say, the model must be physically implemented somehow and "it is not practical" to ignore that. What they fail to understand, though, is that physical independence does not mean that implementation details are ignored. Rather, it means that logical models are insulated from such details. Implementers and users are then free to employ any physical techniques they deem necessary for performance, and even change them at will, without affecting what applications and users see at the logical level. If physical implementation details are exposed to users and applications (which tends to be the case with non-relational approaches, including object orientation in particular), then not only is their view of the data contaminated by logically irrelevant aspects, but users are sucked into complex physical considerations for which they have no affinity; and, of course, when physical changes are necessary for performance reasons, applications will be affected and will need to be modified. So it's not a question of the "model being more elegant than hardware realities", but rather of keeping the former independent of the latter.
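The point about insulation can be illustrated with a minimal sketch. It assumes Python's built-in sqlite3 module purely for convenience; SQL DBMSs only approximate the relational model, and the employee table, its columns, and the index name are hypothetical. The same logical request is issued before and after a purely physical change (adding, then dropping, an index), and the application code and its result stay the same.

```python
import sqlite3


def employees_in_dept(conn, dept_no):
    """The logical request: names of employees in a department, in name order.

    Nothing here mentions indexes, storage, or access paths.
    """
    rows = conn.execute(
        "SELECT name FROM employee WHERE dept_no = ? ORDER BY name",
        (dept_no,),
    ).fetchall()
    return [name for (name,) in rows]


# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " emp_no INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " dept_no INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Ann", 10), (2, "Bob", 20), (3, "Cy", 10)],
)

before = employees_in_dept(conn, 10)

# Physical change only: add an access path for performance.
conn.execute("CREATE INDEX employee_dept_ix ON employee (dept_no)")
after_index = employees_in_dept(conn, 10)

# Another physical change: remove it again.
conn.execute("DROP INDEX employee_dept_ix")
after_drop = employees_in_dept(conn, 10)

# The logical view is unaffected by either physical change.
assert before == after_index == after_drop
print(before)  # ['Ann', 'Cy']
```

The function employees_in_dept never had to be modified. Had the query been forced to name the index or navigate a storage structure, each physical change would have required an application change as well.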
Comment on re-publication
  • The RDM is grounded in that part of simple set theory (SST) that is expressible in first-order predicate logic (FOPL).
  • A data model as defined by Codd is a theory-based combination of data structure/integrity and data manipulation that is used to formalize conceptual models of reality as logical models for database representation. The models preceding the RDM -- hierarchic and network -- were not ad hoc: they have a theoretical foundation (directed graph theory), but they have not been fully formalized to date. They could not support physical independence precisely because they require higher logic than FOPL.
