Friday, August 7, 2020

Oldies But Goodies: Data Models and Physical Independence

Note: To appreciate the stability of a sound foundation vs the industry's fad-driven cookbook practice, I am re-publishing some of the articles and reader exchanges from the old DBDebunk (2000-06), giving you the opportunity to judge for yourself how well my claims/arguments hold up and whether the industry has progressed at all. I am adding comments on re-publication where necessary. Long pieces are broken into smaller parts for fast reading.

From "Little Relationship to Relational" originally posted on March 29, 2001.

“E.F. ("Ted") Codd conceived of his relational model for databases while working at IBM in 1969. Codd's approach took a clue from first-order predicate logic, the basis of a large number of other mathematical systems and presented itself [sic] in terms of set theory, leaving the physical definition of the data undefined and implementation dependent. In June of 1970, Codd laid down much of his extensive groundwork for the model in his article, "A Relational Model of Data for Large Shared Data Banks" published in the Communications of the ACM, a highly regarded professional journal published by the Association for Computing Machinery. Buoyed by an intense reaction against the ad hoc data models offered by the physically oriented mainframe databases, Codd's rigid separation of the logical model, with its rigorous mathematical underpinnings, from the less elegant realities of hardware engineering was revolutionary in its day. Codd and his relational ideas blazed across the academic computing landscape over the next few years.”

Monday, July 20, 2020

Oldies But Goodies: Data Independence and "Physical Denormalization"

Note: I am re-publishing some of the articles and reader exchanges from the old DBDebunk (2000-06). How well do they hold up -- have industry knowledge and practice progressed? Judge for yourself and appreciate the difference between a sound foundation and the fad-driven cookbook approach.

January 2, 2001

ML: ... one of the "4 great lies" is "I denormalize for performance." You state that normalization is a logical concept and, since performance is a physical concept, denormalization for performance reasons is impossible (i.e., it doesn't make sense). What term would you use to describe changing the physical database design to be different from the logical design to enhance performance? Because normalization is a logical concept, you imply that this is not called denormalization.

Friday, July 10, 2020

Oldies But Goodies: Skyscrapers with Shack Foundations

Ed. Note: I am re-publishing some of the posts (slightly refined) from the old dbdebunk (2000-06) to demonstrate how well they hold up to this day, and how little has improved in the industry -- quite the opposite. The following is the first editorial with which I started dbdebunk in 2000.

June 4, 2000

“Well, it's really a judgment call and I think a lot of experience comes into it. It's a little bit like building a shack. Say you want to build a skyscraper, and you started out building a shack and you just keep trying to add onto it. After a while you have this severe structural problem ... So there is a fallacy to the build-upon-a-simple structure approach. Sometimes you get up to three stories and you have to do some major structural changes, and I just accept that.”
--Wayne Ratliff, developer of dBase

“Client Servers were a tremendous mistake. And we are sorry that we sold it to you. Instead of applications running on the desktop and data sitting on the server, everything will be Internet based. The only things running on the desktop will be a browser and a word processor. What people want is simple, inexpensive hardware that functions as a window on to the Net. The PC was ludicrously complex with stacks of manuals, helplines and IT support needed to make it function. Client server was supposed to alleviate this problem, but it was a step in the wrong direction. We are paying through the nose to be ignorant.”
--Larry Ellison, CEO, Oracle Corp.

Sunday, June 28, 2020

TYFK: Misconceptions About the Relational Model

“The most popular data model in DBMS is the Relational Model. It is more scientific a model than others. This model is based on first-order predicate logic and defines a table as an n-ary relation. The main highlights of this model are:
  • Data is stored in tables called relations.
  • Relations can be normalized, [in which case] values saved are atomic values.
  • Each row in a relation contains a unique value.
  • Each column in a relation contains values from a same domain.”

Each "Test Your Foundation Knowledge" post presents one or more misconceptions about data fundamentals. To test your knowledge, first try to detect them, then proceed to read our debunking, which is based on the current understanding of the RDM, distinct from whatever has passed for it in the industry to date. If there isn't a match, you can acquire the knowledge by checking out our POSTS, BOOKS, PAPERS, LINKS (or, better, organize one of our on-site SEMINARS, which can be customized to specific needs).

Friday, June 12, 2020

Semantics and the Relational Model

“The RDM is semantically weak ... struggles with consistent granularity and has limitations at the property level... it has no concept of data flow ... it is an incomplete theory. Great for its time but needs something better now ... it uses ill defined and linguistically suspect labels ... it has no rules for semantic accuracy ... this just makes the RDM 1% of the truth ... the RDM should have solved this all by now ... but it has clearly not. You fail to see the reality of the failure of RDM in the real world ... this is your choice. I understand why you cling to it ... it is a most excellent theory that I respect greatly ... [but o]pen minds make progress...” 
Thus in a LinkedIn exchange. Criticism of the RDM almost always reflects poor foundation knowledge and lack of familiarity with the history of the field, and as we shall see, this one is no different. It is often triggered by what I call the "fad-to-fad cookbook approach", one of the latest fads being the industry's revelational "discovery" of semantics.

Thursday, May 28, 2020

No Such Thing As "Current Relational Data Models"

“... the concept of a state group is indeed a missing modeling concept in relational/current data models...”

Thus in a LinkedIn exchange. I don't know what a "state group" is, but I have spent almost six decades debunking the misuses of data model in general and the abuses of the RDM in particular, and I smell them from miles away. While the time when lack of foundation knowledge shocked me is long gone, practitioners' total unawareness of and indifference to it, and poor reasoning in a field founded on logic, never cease to amaze me.

What exactly are "relational/current data models"?

Sunday, May 10, 2020

TYFK: What Is A Database Relationship?

Note: This is a re-write of an earlier post. About TYFK posts (Test Your Foundation Knowledge) see the post insert below.

“Here two or more table[s] are related with each other. This is Database relationship. Database relationship is used a lot ... [in] relational database management systems ... shortly called RDBMS. Here is Join_data [sic] table and Interview_data table. For creating a relational database management system both of the table[s] must have a common field. Here Employee_ID is a common field ... Database relationship types: One-To-One relation, One-To-many relation, Many-to-many relation. Minimum one common field is essential in all the tables. The data type of common field and field size will be same in all the tables.”
First try to detect the misconceptions, then check against our debunking. If there isn't a match, you can acquire the necessary foundation knowledge in our POSTS, BOOKS, PAPERS, LINKS or, better, organize one of our on-site SEMINARS, which can be customized to specific needs.
