Saturday, November 10, 2018

Conceptual Modeling Is Not Data Modeling

“Ok, now that we have those two (Parts 3 and 4 of your series) 'on the table' so to speak, perhaps you would address these questions...
1. Would it be safe to say that facts expressed in a Conceptual model should be verifiable in reality?

2. Are the following facts logically equivalent or are they different:

a) The car with license number 62-JZK-6 has the color aquamarine blue
b) De auto met kenteken 62-JZK-6 heeft de kleur aquamarijnblauw

3. If a previously true fact is found in reality to be verifiably false, would that mean the Conceptual model is wrong or the Logical model, or reality?”

“I'm going to add another:

4. How does RDM handle temporal changes to the 'truth' of statement 2a) when:

a) The owner of the car paints it black.
b) The owner of the license plate legally transfers it to a truck.
c) The owner of the car replaces every single part except the chassis.”

John O'Gorman asked me these questions in a LinkedIn exchange[1] in response to my comments in another exchange on modeling[2], where I alerted to the confusion of levels of representation common in the industry, particularly conceptual-logical conflation(CLC)[3]: calling conceptual modeling data modeling both reflects and induces it.

Online exchanges are not a proper vehicle for learning, particularly foundation knowledge. Which is why I publish free blog posts, and papers and books, to which to refer interested serious data professionals. It just so happened that my just posted four-part series covers the subject at hand[4], so I referred to it, as well as other writings (the answers are already there if one cares to read them). I will not discuss the whole exchanges -- read them and judge for yourself -- but I promised to answer the questions here, where I can do them justice.

John raises primarily conceptual, not data model issues -- the latter are subservient to decisions in the former -- but then asks "how does RDM handle..." From experience, I recognize implicit doubts that the RDM can. As far as we know there is no formal data model[5] that is a superior alternative to the RDM with respect to "handling" conceptual issues (in fact, there is no other formal data model -- i.e., that satisfies Codd's definition -- period).

Since most of the issues involved are covered by McGoveran's work in progress[6] (in which my multi-part series is rooted), to ensure consistency with it I passed the questions by him. As he too pointed out, "Answers that work in all situations require highly complicated discussions and lots of time, and trying to teach someone without proper experience and educational background would be very cumbersome, or an oversimplication via online exchanges." 

Here's what's possible within the constraints of a blog post -- the serious reader is referred to our writings.


I have been using the proceeds from my monthly blog @AllAnalytics to maintain DBDebunk and keep it free. Unfortunately, AllAnalytics has been discontinued. I appeal to my readers, particularly regular ones: If you deem this site worthy of continuing, please support its upkeep. A regular monthly contribution will ensure this unique material unavailable anywhere else will continue to be free. A generous reader has offered to match all contributions, so let's take advantage of his generosity. Purchasing my papers and books will also help. Thank you. 


NEW: The Key to Relational Keys: A New Perspective


I deleted my Facebook account. You can follow me on Twitter:

  • @dbdebunk: will contain links to new posts to this site, as well as To Laugh or Cry? and What's Wrong with This Picture, which I am bringing back.
  • @ThePostWest: will contain evidence for, and my take on the spike in Anti-semitism that usually accompanies existential crises. The current one is due to the decadent decline of the West and the corresponding breakdown of the world order.


  • To work around Blogger limitations, the labels are mostly abbreviations or acronyms of the terms listed on the FUNDAMENTALS page. For detailed instructions on how to understand and use the labels in conjunction with the FUNDAMENTALS page, see the ABOUT page. The 2017 and 2016 posts, including earlier posts rewritten in 2017 are relabeled. As other older posts are rewritten, they will also be relabeled, but in the meantime, use Blogger search for them. 
  • Following the discontinuation of AllAnalytics, the links to my columns there no longer work. I moved the 2017 columns to dbdebunk and, time permitting, may gradually move all of them. Within the columns, only the links to sources external to AllAnalytics work. 

1. Yes, provided the subject of the conceptual model is not an abstract model, theory, simulation, or some such, but the real world.

2. Assuming it is not just a philosophical question, it depends on a number of factors. If: 
  • The context of the statements is the same; and,
  • The transliteration reflects that context (a linguistic issue).
we may be able to infer that they have the same semantic content, and so should formalize as logically equivalent.

3. "Reality" -- whether yours, mine, or some agreed upon reality -- is what it is -- it cannot be "wrong". But perceptions thereof, on which both the creation and verification of a conceptual model are based, can be wrong, and either can render the model wrong.

If an authorized user (1) inserts in the database a tuple representing a false fact that he wrongly perceives as true, and (2) the tuple happens to satisfy the integrity constraints declared to the DBMS, the tuple is accepted in the database and the logical model is logically valid (i.e., consistent with the constraints), but semantically incorrect (i.e., inconsistent with reality)[7].

4. The world is dynamic -- a fact can be true at one time and become false at another -- that is why databases are updatable. A RDBMS can only enforce consistency, not truth, which is the responsibility of trusted authorized users [8], who must update the database when truth changes.

The fact would become false, requiring an update. But a specific color is an extrinsic, not intrinsic property of the entity (car) -- it can change without changing the identity of the entity.

The fact would become false, requiring an update. But not only is a license plate not a proper identifier/name of the entity (the car, or the truck), but not even a property of a vehicle[9] -- it is an identifier of an independent "license plate entity" that is associated with the proper vehicle identifier (VIN) (the schema should have been designed to reflect the possible associations between VINs and license plates).

2c) The conceptual vehicle is perceived as an assembly, the identity and perhaps properties of which depend on its components. What constitutes stable object identification, and finding stable identifiers
when entities consist of components that can change or evolve (i.e., morph through various processes) can be very complicated, and  mistakes can be -- and usually are -- made.
  1. If the chassis/VIN remains the same, usually so does the identity of the vehicle, but its properties and associated components as recorded in the database change. The fact would become false requiring an update.
  2. If the changes are more radical, a new identifier (e.g., VIN) may be required to recognize the entity as entirely new (e.g., for registration of a custom vehicle). The fact would become false, requiring an update consisting of (a) a delete of the old tuple, and (b) an insert of a new tuple.
For such scenarios, to maintain and recover history requires either a temporal RDBMS, or the creation and maintenance of history relations that log all changes and record when tuples were inserted, deleted, or modified
and the datetimes thereof. This is a complex topic outside the scope of the present discussion.


[1] Software Wasteland: How the Application-Centric Mindset is Hobbling our Enterprises.

[2] Fact-Based Modeling and Data Vault.

[3] Pascal, F., The Conceptual-Logical Conflation and the Logical-Physical Confusion.

[4] Pascal, F., Understanding Conceptual vs. Data Modeling, Parts 1-4.

[5] Codd, E.F., Data Models in Database Management.

[6] McGoveran, D., LOGIC FOR SERIOUS DATABASE FOLK, forthcoming (draft chapters).

[7] Pascal, F., Logical Validity and Semantic Correctness.

[8] Pascal, F., Don't Confuse/Conflate Database Consistency with Truth.

[9] Pascal, F., The Key to Relational Keys: A New Understanding.

Note: I will not publish or respond to anonymous comments. If you have something to say, stand behind it. Otherwise don't bother, it'll be ignored.

No comments:

Post a Comment