Thursday, November 25, 2021

Nobody Understands the Relational Model: Semantics, Closure and Database Correctness Part 3

(Title inspired by Richard Feynman)

In Parts 1 and 2 we provided some clarifications following a discussion on LinkedIn about our contention that, conventional wisdom notwithstanding, database relations -- distinct from mathematical relations -- are by definition not just in 1NF, but also in 5NF, as a consequence of which the relational algebra (RA), as currently defined for 1NF closure, produces update anomalies and, thus, is not a proper algebra. In this third part we will use that information to debunk some leftover misunderstandings in the discussion.

Friday, November 19, 2021

OBG: The Fate of Fads -- XML DBMS

Note: To demonstrate the correctness and stability due to a sound theoretical foundation relative to the industry's fad-driven "cookbook" practices, I am re-publishing as "Oldies But Goodies" material from the old DBDebunk.com (2000-06), so that you can judge for yourself how well my arguments hold up and whether the industry has progressed beyond the misconceptions those arguments were intended to dispel. I may revise, break into parts, and/or add comments and/or references.

Remember XML DBMS? At one point it was the fad of the day, similar to today's NoSQL or the old new "knowledge graph" -- "the future that you ignored at the peril of being left behind". As I predicted, it went the way of all fads (ODBMS, Associative DBMS, you name them) together with their "data models" that were nothing of the sort. My prediction was grounded in the same sound foundations I rely on today -- unlike the industry we are progressing it -- that fads lack and which were and still are dismissed, evidence be damned.

Here's a typical example (comments on republication in square brackets).

Thursday, November 11, 2021

Nobody Understands the Relational Model: Semantics, Relational Closure and Database Correctness Part 2

 with David McGoveran 

(Title inspired by Richard Feynman)

In Part 1 we explained that all database relations are, mathematically, relations, but not all relations are database relations, which are in both 1NF and 5NF and we agreed with a statement in a LinkedIn discussion ending as follows: "Update anomalies are not as big of a problem as an algebra where relations aren't closed under join". Unfortunately, update anomalies, closure, and how relational operators were defined are all interrelated and represent an even "bigger problem". Update anomalies are not "bugs", let alone irrelevant, but actually a reflection of  that much bigger problem.

In this second part we delve into that problem.

Friday, November 5, 2021

OBG: Database Consistency and Physical Truth

Note: To demonstrate the correctness and stability due to a sound theoretical foundation relative to the industry's fad-driven "cookbook" practices, I am re-publishing as "Oldies But Goodies" material from the old DBDebunk.com (2000-06), so that you can judge for yourself how well my arguments hold up and whether the industry has progressed beyond the misconceptions those arguments were intended to dispel. I may slightly revise, break into parts, and/or add comments and/or references.

This is an email exchange with a reader responding to my third book.
(Originally posted on 06/21/2001)

“I'm presently reading your book PRACTICAL ISSUES IN DATABASE MANAGEMENT and there are a couple of points that I find a little confusing. I'll start first by saying that I have no formal database oriented education, and I'm attempting to familiarize myself with some of the underlying theories and practices, so that I can further my personal education and career prospects (but aren't we all!). My questions may sound a little bit ignorant, but that would be because I am! (Please note ignorant, not stupid!) I'll quote you directly from the book for this (possibly I'm taking you out of context or missing something important)

Chapter 3, A Matter of Identity: Keys, pg. 75: "Databases represent assertions of fact - propositions - about entities of interest in the real world. The representation must be correct - only true propositions (facts) must be represented."

Now, correct me if I'm wrong with a basic assumption here, but isn't a database simply a model of a "real world" data collection? I would've thought that the intention of a database would be to model real life effectively (and accurately) enough to provide useful data for interpretation. Now obviously this is not an easy process with complex data types, but would it even be possible to have a 100% true proposition with only atomic data types? (i.e. can a simplified model contain only facts?) In my understanding of modeling, any model that fits real life closely enough to be a good statistical representation is a usable model. e.g. Newton's Laws are accurate enough when applied on a local scale, but we need to use Einstein's model of space-time across larger scales. Wouldn't recording only "facts" (which I would presume you mean to be statements that are provable in the objective sense i.e. no interpretation, only investigation or calculation) possibly eliminate the utility of some aspects of the database? Or do we account for the interpretative aspect in the metadata or in some other way?

Essentially, I can see what you're saying, but not necessarily how you've reached the conclusion. Admittedly in an ideal world we should be able to record only facts in a database, but this is not an ideal world. As an example, in surveys we see such questions as "Are you happy with this product?" followed by a rating system of 1-5, or 'completely unhappy to completely happy'. This is an artificial enforcement of a quantitative measure on a qualitative property. How do we account for the fact that this is interpreted data and not calculated or measured?

My questions may have little relevance to database theory in general, but the concept fascinates me!”
View My Stats