“Fabian - With respect, maybe it's time to 'shake the formal foundations' of data management, especially given the rising costs and increasing segregation of silos.”
“John, if I were to say what I really think, I would be accused of insulting, so I won't. You don't need to respect me, but you better respect formal foundations. Since they are what gives SOUNDNESS to data management practice, what you are really saying is that you don't care about soundness -- do you really intend to take this position? I would not be surprised, because the industry has long "shook" the formal foundations and lack of soundness is precisely what characterizes it. But because there is no longer proper education, practitioners are totally unaware of the relationship between formal foundations and soundness, everything is ad-hoc and arbitrary, yet they fail to recognize the consequences.”[1]
--LinkedIn.com
Thus an exchange with John Gorman on LinkedIn, in which he posed several questions (which I answered in last week's post[2]). The subject was the importance of not confusing levels of representation and, more specifically, of avoiding conceptual-logical conflation (CLC)[3].
Somebody posted a link to my answers on LinkedIn, and in a comment on it John linked to a Richard Feynman YouTube lecture on "the general differences between the interests and customs of the mathematicians and the physicists". I responded that this is my very point: just as physics is not the mathematics used to describe it (a central issue in quantum mechanics), conceptual modeling is not data modeling. The latter is the representation of the former in the database; the two are distinct[4]. This brought to mind some older columns I published on the All Analytics website, which no longer exists, so this series is a revision thereof.
Mathematical relations are abstractions (i.e., devoid of any real world meaning), and, thus, can contain arbitrary data, and we can arbitrarily apply any operation of the relational algebra (RA) to them. For example, given the two relations A and B:
A:
100 26150 ...
110 38170 ...
120 37950 ...
130 33800 ...
140 35420 ...
150 30280 ...
160 27250 ...
290 15340 ...
310 15900 ...
B:
... 100 06-19-1980 ...
... 110 05-16-1958 ...
... 120 12-05-1963 ...
... 130 07-28-1971 ...
... 140 12-15-1976 ...
... 150 02-12-1972 ...
... 160 10-11-1977 ...
... 290 05-30-1980 ...
... 310 09-12-1964 ...
...the Cartesian product of A with the projection of B on the second attribute yields a relation, the attributes of which are those of A and the second attribute of B, and the tuples of which are all the possible combinations of each tuple of A with every tuple of the projection of B. In mathematical relation theory the result is meaningless with respect to the real world.
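The Cartesian product of A with a projection of B can be sketched in Python by treating each relation as a set of tuples. This is only an illustration under stated assumptions: the elided attributes ("...") are left out, only the first three tuples of each relation are used, the projection of B is assumed to be on the date values shown, and the helper names project and cartesian_product are hypothetical.

```python
from itertools import product

# Only the attribute values visible in the text; the elided ("...")
# attributes are omitted. First three tuples of each relation.
A = {(100, 26150), (110, 38170), (120, 37950)}
B = {(100, "06-19-1980"), (110, "05-16-1958"), (120, "12-05-1963")}


def project(relation, positions):
    """Keep only the attributes at the given positions.

    The result is a set, so duplicate tuples collapse.
    """
    return {tuple(t[i] for i in positions) for t in relation}


def cartesian_product(r, s):
    """Concatenate every tuple of r with every tuple of s."""
    return {t + u for t, u in product(r, s)}


# Assumption: the projection is taken on the date values of B.
b_dates = project(B, [1])
result = cartesian_product(A, b_dates)
```

With three tuples in each operand, result holds nine tuples, one for each pairing of a tuple of A with a date from B. Nothing in the operation itself says whether any of those pairings means anything.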
But the RDM is applied relation theory: simple set theory (SST) expressible in first-order predicate logic (FOPL), adjusted for applicability to database management. Database relations preserve the mathematical properties but, unlike mathematical relations, are not abstract. They represent in the database sets of facts about perceived real world entities identified during conceptual modeling:
- Tuples of base relations represent axioms (facts assumed to be true);
- Tuples of derived relations represent theorems (i.e., logical conclusions inferred from the axioms);
- A DBMS and database constitute a logical inference (i.e., deduction) engine that derives theorems from axioms.
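The three points above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the predicate attached to the base relation, the 30000 threshold, and the helper names restrict and project are hypothetical, and a real DBMS does far more than this.

```python
# Base relation: each tuple is an axiom, i.e., a fact asserted to be true.
# Assumed predicate (hypothetical): "Employee EMP# earns SALARY."
compensations = {(100, 26150), (110, 38170), (120, 37950)}


def restrict(relation, condition):
    """Keep only the tuples for which the condition holds."""
    return {t for t in relation if condition(t)}


def project(relation, positions):
    """Keep only the attributes at the given positions."""
    return {tuple(t[i] for i in positions) for t in relation}


# Derived relation: each tuple is a theorem inferred from the axioms.
# Derived predicate: "Employee EMP# earns more than 30000."
high_earners = project(
    restrict(compensations, lambda t: t[1] > 30000),
    [0],
)
```

The tuples of high_earners are not stored facts. They follow logically from the tuples of compensations, which is the sense in which a query result is a set of theorems.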
The data must be consistent with the conceptual model of reality as perceived by the modeler, which means that (1) neither the data content of, (2) nor the RA operations applicable to, database relations can be arbitrary -- both are constrained by conceptual modeling. If A and B were database relations representing facts about employee compensations and project assignments:
COMPENSATIONS {EMP#,SALARY,...}
ASSIGNMENTS {...,EMP#,START_DATE,...}
the result of the above Cartesian product (combining each salary with every start date) wouldn't have a "sensible meaning", as a reader put it (i.e., the operation would not correspond to a meaningful query). As another commented, "Most of the real work in any query is planning out what you are asking, how you are asking it, and the meanings assigned." This is another way of saying that, in order to query the database meaningfully, users must understand the semantics (meaning) of the data, as specified in the conceptual model by the modeler/database designer.
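To make the contrast concrete, here is a hedged Python sketch that sets the Cartesian product of salaries and start dates beside a join on EMP#. It models only the EMP#, SALARY and START_DATE attributes, uses two sample tuples per relation, and assumes the conceptual model reads "employee EMP# is paid SALARY" and "employee EMP# started an assignment on START_DATE". The class and variable names are hypothetical.

```python
from typing import NamedTuple


class Compensation(NamedTuple):
    emp_no: int   # EMP#
    salary: int   # SALARY


class Assignment(NamedTuple):
    emp_no: int      # EMP#
    start_date: str  # START_DATE


COMPENSATIONS = {Compensation(100, 26150), Compensation(110, 38170)}
ASSIGNMENTS = {Assignment(100, "06-19-1980"), Assignment(110, "05-16-1958")}

# Cartesian product of salaries with start dates: every salary is paired
# with every date, regardless of which employee each belongs to.
salary_x_date = {
    (c.salary, a.start_date) for c in COMPENSATIONS for a in ASSIGNMENTS
}

# Join on EMP#: a salary is paired only with a start date of the same
# employee, so each result tuple corresponds to the assumed predicates.
joined = {
    (c.emp_no, c.salary, a.start_date)
    for c in COMPENSATIONS
    for a in ASSIGNMENTS
    if c.emp_no == a.emp_no
}
```

Both expressions are equally valid as algebra. Only knowledge of what the attributes mean tells a user that the tuples in joined state facts about employees, while a tuple such as (26150, "05-16-1958") in salary_x_date pairs one employee's salary with another employee's start date.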
While it may be clear in this simple example that the operation makes no sense, this is often not the case in practice, as we shall demonstrate in Part 2.
References
[1] Software Wasteland: How the Application-Centric Mindset is Hobbling our Enterprises.
[2] Pascal, F., Conceptual Modeling Is Not Data Modeling.
[3] Pascal, F., The Conceptual-Logical Conflation and the Logical-Physical Confusion.
[4] Pascal, F., Conceptual Modeling Is Not Data Modeling.
[5] Pascal, F., What Relations Really Are and Why They Are Important.
[6] Pascal, F., What Meaning Means: Business Rules, Predicates, Integrity Constraints and Database Consistency.
[7] Pascal, F., Levels of Representation: Conceptual Modeling, Logical Design and Physical Implementation.