From: PV
To: Editor
I have trouble in understanding the domain-key normal form
(DK/NF). Some books that I've read say that there are six normal forms, namely:
·
1NF
·
2NF
·
3NF (more specifically, BCNF)
·
4NF (no multivalued dependencies)
·
5NF (project-join normal form)
·
6NF (DK/NF)
Of all the normal forms given, DK/NF is the one I find
difficult to understand.
A relation is said to be in DK/NF if every constraint on the
relation is a logical consequence of the definitions of keys and domains. It
seems that the definition is too general. What does this mean?
It's also been said that there are no normal forms higher than DK/NF. If a
relation has an atomic primary key and is in 3NF, is it automatically in DK/NF?
I've been searching for examples of relations in DK/NF, but it seems that there
isn't any.
Can you give me an example of a relation that is in 5NF, but not yet in DK/NF?
And the resulting relations when converted into DK/NF?
Chris Date Responds: Valdez has "trouble
understanding ... domain-key normal form (DK/NF)." I don't blame him; there's certainly been
some serious nonsense published on this topic in the trade press and
elsewhere. Let me see if I can clarify
matters.
DK/NF is best thought of as a straw man (sorry, straw
person). It was introduced by Ron Fagin
in his paper "A Normal Form for Relational Databases that Is Based on
Domains and Keys," ACM TODS 6, No. 3 (September 1981). As Valdez says (more or
less), Fagin defines
a relvar R to be in DK/NF if and only if every constraint on R is
a logical consequence of what he (Fagin) calls the domain constraints
and key constraints on R. Here:
A domain constraint--better called an attribute
constraint--is simply a constraint to the effect a given attribute A of R takes
its values from some given domain D. A
key constraint is simply a constraint to the effect that a given set A, B, ...,
C of R constitutes a key for R.
Thus, if R is in DK/NF, then it is sufficient to
enforce the domain and key constraints for R, and all constraints
on R will be enforced automatically.
And enforcing those domain and key constraints is, of course, very
simple (most DBMS products do it already).
To be specific, enforcing domain constraints just means checking that
attribute values are always values from the applicable domain (i.e., values of
the right type); enforcing key constraints just means checking that key values
are unique.
The trouble is, lots of relvars aren't in DK/NF in the first
place. For example, suppose there's a
constraint on R to the effect that R must contain at least ten
tuples. Then that constraint is
certainly not a consequence of the domain and key constraints that apply to R,
and so R isn't in DK/NF. The sad fact is, not all relvars can be reduced
to DK/NF; nor do we know the answer to the question "Exactly when can
a relvar be so reduced?"
Now, it's true that Fagin proves in his paper that if relvar R
is in DK/NF, then R is automatically in 5NF (and hence 4NF, BCNF, etc.)
as well. However, it's wrong to think
of DK/NF as another step in the progression from 1NF to 2NF to ... to 5NF,
because 5NF is always achievable, but DK/NF is not.
It's also wrong to say that there are "no normal forms
higher than DK/NF." In recent work
of my own--documented in the book TEMPORAL DATA AND THE RELATIONAL
MODEL,
by myself with Hugh Darwen and Nikos Lorentzos--my coworkers and I have come up
with a new sixth normal form, 6NF.
6NF is higher than 5NF (all 6NF relvars are in 5NF, but the converse
isn't true); moreover, 6NF is always achievable, but it isn't implied by
DK/NF. In other words, there are
relvars in DK/NF that aren't in 6NF.
A trivial example is:
EMP {EMP#,DEPT#,SALARY} KEY {EMP#}
(with the obvious semantics).
Valdez also asks:
"If a [relvar] has an atomic primary key and is in 3NF, is it
automatically in DK/NF?" No. If the EMP relvar just shown is subject to
the constraint that there must be at least ten employees, then EMP is in 3NF
(and in fact 5NF) but not DK/NF.
(Incidentally, this example also answers another of Valdez's
questions: "Can [we] give "an
example of a [relvar] that's in 5NF but not ... in DK/NF?")
Note: I'm assuming
here that the term "atomic key" means what would more correctly be
called a simple key. I'm also
assuming that the relvar in question has just one key, which we might
harmlessly regard as the "primary" key. If either of these assumptions is invalid, the
answer to the
original question is probably "no" even more strongly!
The net of all of the above is that DK/NF is (at least at the
time of writing) a concept that's of some considerable theoretical interest but
not yet of much practical ditto. The reason is that, while it would be nice if
all relvars in the database were in DK/NF, we know that goal is impossible to
achieve in general, nor do we know when it is possible. For practical purposes, stick to 5NF
(and
6NF). Hope this helps!
[ABOUT]
[QUOTES]
[LINKS]