ON DK/NF NORMAL FORM
with Chris Date

 

 

 

From: PV

To: Editor

 

I have trouble in understanding the domain-key normal form (DK/NF). Some books that I've read say that there are six normal forms, namely:

 

·         1NF

·         2NF

·         3NF (more specifically, BCNF)

·         4NF (no multivalued dependencies)

·         5NF (project-join normal form)

·         6NF (DK/NF)

 

Of all the normal forms given, DK/NF is the one I find difficult to understand.

 

A relation is said to be in DK/NF if every constraint on the relation is a logical consequence of the definitions of keys and domains. It seems that the definition is too general. What does this mean?

It's also been said that there are no normal forms higher than DK/NF. If a relation has an atomic primary key and is in 3NF, is it automatically in DK/NF?

I've been searching for examples of relations in DK/NF, but it seems that there isn't any.

Can you give me an example of a relation that is in 5NF, but not yet in DK/NF? And the resulting relations when converted into DK/NF?

 

 

Chris Date Responds: Valdez has "trouble understanding ... domain-key normal form (DK/NF)."  I don't blame him; there's certainly been some serious nonsense published on this topic in the trade press and elsewhere.  Let me see if I can clarify matters. 

 

DK/NF is best thought of as a straw man (sorry, straw person).  It was introduced by Ron Fagin in his paper "A Normal Form for Relational Databases that Is Based on Domains and Keys," ACM TODS 6, No. 3 (September 1981).  As Valdez says (more or less), Fagin defines a relvar R to be in DK/NF if and only if every constraint on R is a logical consequence of what he (Fagin) calls the domain constraints and key constraints on R. Here:

 

A domain constraint--better called an attribute constraint--is simply a constraint to the effect a given attribute A of R takes its values from some given domain D.  A key constraint is simply a constraint to the effect that a given set A, B, ..., C of R constitutes a key for R.

 

Thus, if R is in DK/NF, then it is sufficient to enforce the domain and key constraints for R, and all constraints on R will be enforced automatically.  And enforcing those domain and key constraints is, of course, very simple (most DBMS products do it already).  To be specific, enforcing domain constraints just means checking that attribute values are always values from the applicable domain (i.e., values of the right type); enforcing key constraints just means checking that key values are unique.

 

The trouble is, lots of relvars aren't in DK/NF in the first place.  For example, suppose there's a constraint on R to the effect that R must contain at least ten tuples.  Then that constraint is certainly not a consequence of the domain and key constraints that apply to R, and so R isn't in DK/NF. The sad fact is, not all relvars can be reduced to DK/NF; nor do we know the answer to the question "Exactly when can a relvar be so reduced?"

 

Now, it's true that Fagin proves in his paper that if relvar R is in DK/NF, then R is automatically in 5NF (and hence 4NF, BCNF, etc.) as well.  However, it's wrong to think of DK/NF as another step in the progression from 1NF to 2NF to ... to 5NF, because 5NF is always achievable, but DK/NF is not.

 

It's also wrong to say that there are "no normal forms higher than DK/NF."  In recent work of my own--documented in the book TEMPORAL DATA AND THE RELATIONAL MODEL, by myself with Hugh Darwen and Nikos Lorentzos--my coworkers and I have come up with a new sixth normal form, 6NF.  6NF is higher than 5NF (all 6NF relvars are in 5NF, but the converse isn't true); moreover, 6NF is always achievable, but it isn't implied by DK/NF.  In other words, there are relvars in DK/NF that aren't in 6NF.

 

A trivial example is:

 

EMP {EMP#,DEPT#,SALARY} KEY {EMP#}

 

(with the obvious semantics).

 

Valdez also asks:  "If a [relvar] has an atomic primary key and is in 3NF, is it automatically in DK/NF?"  No.  If the EMP relvar just shown is subject to the constraint that there must be at least ten employees, then EMP is in 3NF (and in fact 5NF) but not DK/NF.  (Incidentally, this example also answers another of Valdez's questions:  "Can [we] give "an example of a [relvar] that's in 5NF but not ... in DK/NF?")

 

Note:  I'm assuming here that the term "atomic key" means what would more correctly be called a simple key.  I'm also assuming that the relvar in question has just one key, which we might harmlessly regard as the "primary" key.  If either of these assumptions is invalid, the answer to the original question is probably "no" even more strongly!

 

The net of all of the above is that DK/NF is (at least at the time of writing) a concept that's of some considerable theoretical interest but not yet of much practical ditto. The reason is that, while it would be nice if all relvars in the database were in DK/NF, we know that goal is impossible to achieve in general, nor do we know when it is possible.  For practical purposes, stick to 5NF (and 6NF).  Hope this helps!

 

 

 

[ABOUT] [QUOTES] [LINKS]