Sunday, May 16, 2021

TYFK: Data Model, Logical Model and Schema

Note: Each "Test Your Foundation Knowledge" post presents one or more misconceptions about data fundamentals. To test your knowledge, first try to detect them, then proceed to read our debunking, reflecting the current understanding of the RDM, distinct from whatever has passed for it in the industry to date. If there isn't a match, you can review references -- reflecting the current understanding of the RDM, distinct from whatever has passed for it in the industry to date -- which explain and correct the misconceptions. You can acquire further knowledge by checking out our POSTS, BOOKS, PAPERS, LINKS (or, better, organize one of our on-site SEMINARS, which can be customized to specific needs).

“Doesn't the data model come before the schema? I was tasked to build a data model and one of the resources was a schema. Isn't the schema made from the data model?”

“A data model can be different things. A schema can be a data model. Before that, there's a conceptual model, derived from the problem domain, then a logical model, capturing the entities, attributes, and relationships. After that, a schema is implemented based on those two models.”

“Yes, but if the system evolved, in practice you will have the schema (the structure of physical tables) as the ground truth, and you need to extract the logical model from it. In teaching environment of we tend to begin with the logical model and then create tables based on that.”

“this makes a little more sense to me. i thought a default data model would be out there but i can't find one. so i'm basically "recreating" one from the schema. then i assume i'll be adding on to it with third party products.”



DBDebunk was maintained and kept free with the proceeds from my @AllAnalitics column. The site was discontinued in 2018. The content here is not available anywhere else, so if you deem it useful, particularly if you are a regular reader, please help upkeep it by purchasing publications, or donating. On-site seminars and consulting are available.Thank you.

-05/09/21 Re-posted the 
FUNDAMENTALS page, the content of which had mysteriously disappeared.

-03/15/21: Pruned the POSTS page

-12/26/20: Added “Mathematics, machine learning and Wittgenstein to LINKS page

- 08/19 Logical Symmetric Access, Data Sub-language, Kinds of Relations, Database Redundancy and Consistency, paper #2 in the new UNDERSTANDING THE REAL RDM series.
- 02/18 The Key to Relational Keys: A New Understanding, a new edition of paper #4 in the PRACTICAL DATABASE FOUNDATIONS series.
- 04/17 Interpretation and Representation of Database Relations, paper #1 in the new UNDERSTANDING THE REAL RDM series.
- 10/16 THE DBDEBUNK GUIDE TO MISCONCEPTIONS ABOUT DATA FUNDAMENTALS, my latest book (reviewed by Craig Mullins, Todd Everett, Toon Koppelaars, Davide Mauri).

- To work around Blogger limitations, the labels are mostly abbreviations or acronyms of the terms listed on the
FUNDAMENTALS page. For detailed instructions on how to understand and use the labels in conjunction with the that page, see the ABOUT page. The 2017 and 2016 posts, including earlier posts rewritten in 2017 were relabeled accordingly. As other older posts are rewritten, they will also be relabeled. For all other older posts use Blogger search.
- The links to my columns there no longer work. I moved only the 2017 columns to dbdebunk, within which only links to sources external to AllAnalytics may work or not.

I deleted my Facebook account. You can follow me:
- @DBDdebunk on Twitter: will link to new posts to this site, as well as To Laugh or Cry? and What's Wrong with This Picture? posts, and my exchanges on LinkedIn.
- The PostWest blog for monthly samples of global Antisemitism – the only universally acceptable hatred left – as the (traditional) response to the existential crisis of decadence and decline of Western  civilization (including the US).
- @ThePostWest on Twitter where I comment on global #Antisemitism/#AntiZionism and the Arab-Israeli conflict.



  • A data model:

- is not "different things";
- has no "default" and is not built or "recreated with third party products".

  • There are no attributes in a conceptual model;
  • A schema:

- is not and does not "come from" a data model;
- is not the structure of, and not extracted from physical tables.

  • A logical model is not extracted from the schema.


There are three types of model at three levels of representation in database management: conceptual, logical and physical. The four types of model -- conceptual, logical, physical and data -- are routinely confused and the terms used interchangeably in the industry.

While data model is used in the industry to mean "different things", it was introduced by Codd as a formal theory of data that defines a data structure, integrity and manipulation (e.g., the RDM: semantically constrained database relations and the relatonal algebra). A data model is used to formalize conceptual models of reality consisting of entity groups, properties and relationships as logical models consisting of constrained relations with attributes forming a multigroup for database representation, that can be manipulated by the relational algebra to make inferences about reality.

Data model is incorrectly used in the exchange to mean logical model and there is no such thing as a "default logical model" to be "recreated with third party products" (whatever that means). At best we might be able to say that a RDBMS implements an "abstract logical model" that can be used to  produce concrete logical models that represent multiple conceptual models formally in databases.

A logical model captures the entities and groups thereof, their properties and relationships. Relations with attributes represent formally at the logical level the entity groups and entity properties at the conceptual level (i.e., at the conceptual level there are properties represented by attributes at the logical level).

n. A plan, outline, or model.
n. A diagram, or graphical representation, of certain relations of a system of things, without any pretense to the correct representation of them in other respects; in the Kantian philosophy, a product of the imagination intermediate between an image and a concept. being intuitive, and so capable of being observed, like the former, and general or quasi-general, like the latter.

Although a schema is used in the industry to mean either a logical model or the graphical representation thereof, it is important/useful in database management to keep them distinct in one's mind, because the latter does not usually represent all the features of the former. Avoiding the model confusion in the exchange, a logical model "comes from" the data model only in the sense that the former is used to produce the latter. The schema is just a (partial) image thereof (like the map of a geographical area). It is a visualization on a physical medium, but of a logical model -- it certainly is not extracted from the physical model. Incidentally, R-tables are also just visualizations of relations that play no part in RDM; even data in SQL tables -- which are not relations -- are not physically stored as tables.

Further Reading

Levels of Representation Conceptual Modeling, Logical Design and Physical Implementation

What Is a Data Model, and What It Is Not

Models, Models Everywhere, Nor Any Time to Think

Data Model: Neither Business, Nor Logical, Nor Physical Model

What Is a Relational Schema

No comments:

Post a Comment

View My Stats