Sunday, January 28, 2018

This Week

1. Database truth of the week

"Relvars introduce a concept of assignment, which has no counterpart in either FOPL or set theory. If you add it to those formalisms you introduce computational completeness, which destroys both decidability (the existence of a general algorithm by which you can determine if an expression is or is not logically valid) and the guarantee that there exists a (query) evaluation procedure that will halt (the existence of a general algorithm by which you can evaluate the truth or falsity of every instantiated predicate expression given those instantiations from any given database). Therefore we must forbid relvars." --David McGoveran

2. What's wrong with this database picture?

"Many data and information modelers talk about all kinds of keys (or identifiers. I'll forego the distinction for now). I hear them talk about primary keys, alternate keys, surrogate keys, technical keys, functional keys, intelligent keys, business keys (for a Data Vault), human keys, natural keys, artificial keys, composite keys, warehouse keys or Dimensional Keys (or Data Warehousing) and whatnot. Then a debate rises on the use (and misuse) of all these keys ... The foremost question we should actually ask ourselves: can we formally disambiguate kinds of keys (at all)? Of all kinds of key, the primary key and the surrogate key gained the most discussion."
"If we take a look at the relational model we only see [sets] of one or more attributes that are unique for each tuple in a relation -- no other formal distinction is possible. When we talk about different kinds of keys we base our nomenclature on properties and behavior of the candidate keys. We formally do not have a primary key, it is a choice we make and as such we might treat this key slightly different from all other available keys in a relation. The discussion around primary keys stems more from SQL NULL problems, foreign key constraints and implementing surrogate keys." --Martijn Evers, Kinds of Keys: On the Nature of Key Classifications

I have been using the proceeds from my monthly blog @AllAnalytics to maintain DBDebunk and keep it free. Unfortunately, AllAnalytics has been discontinued. I appeal to my readers, particularly regular ones: If you deem this site worthy of continuing, please support its upkeep. A regular monthly contribution will ensure that this unique material, unavailable anywhere else, will continue to be free. A generous reader has offered to match all contributions, so please take advantage of his generosity. Thanks.

3. To Laugh or Cry?

"The database world isn’t packaged with mind-bending announcements on a weekly basis, but over the course of a year it never fails to surprise me how many new things we do see, and how unrelenting the progression is. 2017 was no exception, so I want to reflect on some of the interesting new releases including a transactional graph database, a geo-replicated multi-model database, and a new high performance key/value store." --Peter Cooper, A Look at Ten New Database Systems Released in 2017

4. Publications

5. Housekeeping

To work around Blogger limitations, the labels are mostly abbreviations or acronyms of the terms listed on the FUNDAMENTALS page. For detailed instructions on how to understand and use the labels in conjunction with the FUNDAMENTALS page, see the ABOUT page. The 2017 and 2016 posts, including earlier posts rewritten in 2017, are relabeled. As other older posts are rewritten, they will also be relabeled, but in the meantime, use Blogger search for them.

6. Oldie But Goodie

DataBase Debunking clarifications

And Now for Something Completely Different:  The PostWest - The Decadence Stage

Must Read of the Week



Pinch Me

Upside Down and Backwards

Book of the Week


Video of the Week

Nothing to Hide

Site of the Week

Open Markets Institute

Fuck the Joos: The 2000 Years Old Universally Acceptable Only Hatred Left

Nice People - Let's Give them a State: The Myth of the Palestinian Nation

Note: I will not publish or respond to anonymous comments. If you have something to say, stand behind it. Otherwise don't bother, it'll be ignored.


  1. Regarding the confusing blogspot statement "...We formally do not have a primary key, it is a choice we make and as such we might treat this key slightly different from all other available keys in a relation....":

    The word "formally" is misused and suggests a limited perspective which by itself is almost useless in practice. A data design is the formal specification of the input to a formal system, or formal program if you like.

    No discussion of primary keys is complete without dealing with how they are used by a system.

    For example, in Appendix B of his 1990 book, Codd gave this exercise, among others:

    "The notation F -> G means that the DBMS must support F if it is to provide full support for G. ... 3. Let F be primary keys and G be view updatability. Show that F -> G."

    People who write about the use of primary keys need to answer this exercise. A clue is the concept of functions as used in Codd 1970 and the meaning of an inverse function.

    (I agree with McGoveran that the term "view updatability" should be deprecated and replaced with something like relation or database updatability.)

    The writer is correct that we do make choices. One that is rarely written about is choosing primary keys to be a fundamental data design starting point or concept. This is convenient but is theoretically not required by relational algebra, which could provide equivalent system behavior through derived relations. I don't advocate such an implementation, only this second exercise, which puts the concept in a more concrete light.

    1. Wait for my next 2 posts on the subject and my forthcoming paper.

  2. Regarding relvars and assignment, the 1985 book Structure and Interpretation of Computer Programs, which I believe is still available online at the MIT website, gave a more concrete explanation, in terms of an application language, of the computational problems introduced by language assignment.

    Relvars can be a system implementation device, which means they only need to be forbidden at the user language level. But more needs to be said, because some relvar supporters claim to have produced truly relational DBMSs whose languages allow user access to relvars and treat the specification (in Codd's terms, the expression) of a subset of a relation as having the same predicate as the relation. Not only does this duplicate the error SQL makes with rows and tables, but it is actually taught at the college level.

    The subset is a relation in its own right and its expression is, as Codd put it, equivalent to a class of wffs. In practice it is generally not the case that that class is the same class as the expression of the original relation. The implications are far-reaching but generally unrecognized by the database industry.
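
The point about a subset carrying its own predicate can be made concrete with a rough sketch. This is an illustration only, not something taken from Codd or from any DBMS: the `Relation` class, its `restrict` method, the two membership functions and the sample employee data are all hypothetical, and a Python function is only a crude stand-in for a predicate. The sketch assumes that restricting a relation yields a relation whose predicate is the original predicate conjoined with the restriction condition.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Tuple

# Hypothetical tuple shape for illustration: (employee name, department).
Row = Tuple[str, str]


@dataclass(frozen=True)
class Relation:
    """A toy relation: a set of tuples plus a stand-in for its predicate."""
    body: FrozenSet[Row]
    predicate: Callable[[Row], bool]

    def restrict(self, condition: Callable[[Row], bool]) -> "Relation":
        """Return the subset satisfying `condition`, with its own predicate."""
        original = self.predicate
        return Relation(
            body=frozenset(row for row in self.body if condition(row)),
            # The subset's predicate is the conjunction, not the original.
            predicate=lambda row: original(row) and condition(row),
        )


def is_employee(row: Row) -> bool:
    """Assumed predicate of the original relation."""
    return row[1] in {"Sales", "Ops"}


def in_sales(row: Row) -> bool:
    """Assumed restriction condition defining the subset."""
    return row[1] == "Sales"


employees = Relation(frozenset({("Ann", "Sales"), ("Bob", "Ops")}), is_employee)
sales = employees.restrict(in_sales)

bob = ("Bob", "Ops")
print(employees.predicate(bob))  # True: satisfies the original predicate
print(sales.predicate(bob))      # False: does not satisfy the subset's predicate
```

Treating `sales` as if it had the predicate of `employees` would wrongly admit the tuple for Bob, which is the error described above.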