Sunday, June 16, 2013

Foreign Keys Part 3: Mandatory or Optional?

In the last two posts I commented on an exchange about foreign keys. In this third and last installments I will address one last issue underlying that exchange: Are FK's mandatory or optional?
... [we] wish to make a point. There is something which is bad design/good design/mandatory/optional. Please stop insisting that Primary and Foreign keys are mandatory. They are good design habits but by no means mandatory.
We, relational proponents, believe in and advocate precision and clarity, which are particularly important in database management. What do the two terms mean in the context of FK's?

FK's are integrity constraints, specifically referential constraints. One would like to hope that when applied to constraints, the term 'optional' does not mean that integrity of data is a matter of mere preference. Rather, what is usually meant--and can be inferred from the exchange--is the locus of enforcement: not by the DBMS in the database (mandatory), but by users outside it (optional).

In the last installment in this series and many other writings I listed multiple reasons for DBMS-enforced integrity, but other than much arm-waving I have yet to see a solid justification for opting out of it. One that is mentioned most often is performance, but I mentioned in one of my recent site updates Stonebraker's lecture that validates my insistence that (physical) implementation and not the (logical) data model is the correct and effective target for maximizing performance.What is more, as long as constraints are enforced, there is no reason to expect better performance external to the DBMS--quite the contrary (why?).

Integrity constraints are an essential component of the data model (by which we mean neither a conceptual, nor a logical model of a specific enterprise). Constraints are the formalization of the informal business rules of any conceptual model. As such, they are not "good" design, they are design (see Business Modeling for Database Design). That's why one of Codd's 12 rules (and relational advantages) is integrity independence.

I cannot help but suspect that underlying the argument is, as usual, poor foundation knowledge. In my above mentioned paper I demonstrate what constraint formulation would entail in the absence of DBMS shorthands, such as FK's and why vendors should offer more than what SQL product do (see also Database Design, Relational Fidelity and SQL,Parts 1-3). It is very unlikely that anybody with a basic grasp of logic and the relational model would not be disabused of user enforced integrity as a good idea and would not resent its poor support by products.

Note: The difference between user-enforced and application-enforced integrity is that the former does not have to be the latter: constraints can be declared and enforced in a layer between the DBMS and applications, but it would still be external to the DBMS and make sense only if the DBMS fails to support shorthands, or the expression of certain constraints, as SQL products do).
However, life is much more complex than a Normalized DB structure. This includes tables serving as event logs; tables, serving as User maintained materialized query tables, tables, serving as supporting structures, reflecting state of complex transactional databases; persistent tables serving as Result Set or Session keepers.
One of my most frequent arguments is that it is precisely because reality is complex that (a) we should not pile up on it tool and practice complexities of their own (b) users cannot be expected to do a better job in formulating and enforcing constraints than the DBMS, particularly in the absence of a  background in logic. That is why I frown upon the trite argument "different DBMS's for different jobs". Codd's genius was an example of abiding by Einstein's famous principle: "everything should be as simple as possible, but not simpler!".

By the way, are we to conclude that tables of the kinds enumerated are not important enough to merit integrity?

It is ironic that who doubt the importance and advantages of full normalization, also question the necessity of DBMS-enforced integrity, because it is the former that minimizes the integrity burden on users (see The Costly Illusion: Normalization, Integrity and Performance).
And I personally believe that if they were truly mandatory, Sybase, Oracle, SQL Server, Ingres, DB2, etc. would require them. Oh, sorry, forgot the SQL standard itself. This is not the relational model we're talking about. These are commercially available RDBMSs which, not surprisingly, DO tend to listen to their customers. If they didn't, they wouldn't be in business!! Since Sybrand is unlikely to get FKs required by the SQL standard or the major RDBMS vendors, it seems that mandatory means that his answer to the question "Do I have to use foreign keys?" is "You would if you worked in my shop!". I'm inclined to agree with that.
Excellent example of the damage that poor foundation knowledge can do, complete with internally inconsistent arguments.

The implied market efficiency requires users and vendors with perfect information. It does not obtain by chance when that assumption does not hold. In the absence of independent criteria for assessing the functionality and quality of tools, vendors can listen to their customers, but if neither are sufficiently educated and informed, what they ask for and produced is not necessarily the most cost-effective.  If users assume that products are by definition the best, what exactly would take the vendors out of business if they fail?

There is hardly a better proof of point than the initial failure by the SQL standard to fully support the relational model in general and integrity in particular. Instead of insisting that integrity is not mandatory, users should deplore it.

Do you like this post? Please link back to this article by copying one of the codes below.

URL: HTML link code: BB (forum) link code:

No comments:

Post a Comment