MORE ON NORMALIZATION
with Fabian Pascal

 

 

 

From: JM

To: Editorial

Date: 24 Feb 2004

 

As I attempt to educate myself on the intricacies of normalization I have stumbled in the dark on points I trust you can illuminate:

 

1. 1NF and repeating groups.

 

Consider the relation: R{StoreCode,EmployeeID1,EmployeeID2,...,EmployeeIDN}

 

Now, I was too once prone to error of thinking the repeated attributes were the same as repeating groups and therefore the above would fail to meet 1NF requirements.

 

However, after reading your article More On Repeated Groups and Normalization, I see that the relation does indeed meet 1NF requirements.

 

Can we not say that it may meet the letter but violate the spirit?  Isn't the above simply a misguided attempt to implement repeated groups, and on those grounds it fails 1NF?

 

How can I use the normal forms to illustrate that the above relation is a poor design choice?  Or should it not be a poor choice, and my thinking is limited by my tools?

 

---

2. 2NF and nonsensical relations.

 

I always see 2NF illustrated in a relation with a key comprised of two (or more) attributes.

 

However, I ask you to consider the following nonsensical relation:

 

R{StoreCode,EmployeeShirtSize}

 

StoreCode is the key here. Assuming that EmployeeShirtSize is indeed not functionally dependent on StoreCode--and therefore not fully functionally dependent--then the above relation is not in 2NF. It's nonsense, I know, but there's a lot of nonsense out there and I don't want to perpetuate any.

 

Doesn't this refute the somewhat common assertion that "entities of one or two attributes cannot contain any of the normal form errors" and therefore "are in the highest normal form"?  (here I quote The Folklore of Normalization by Robert Buelow, but variations of the assertion abound.)

 

Or does 2NF cover nonsensical relations, and for simplicity's sake we ignore them for assertions like the above?

 

3. 3NF and calculated fields.

 

Consider the relation: R{Item,Cost,Qty,Total}, where Total is the product of Cost and Qty.

 

So, we have a calculated field and my understanding is this violates 3NF because of a transitive dependency: Item --> (Cost, Qty) --> Total

 

I haven't found a place where this is explicitly noted, so I turn to you for validation or demolition of my reasoning.

 

 

Fabian Pascal Responds: Re 1NF, I suggest you read our DATABASE FOUNDATIONS papers #1, What First Normal Form Really Means, and #2, What First Normal Form Means Not. Previous understanding of 1NF was incorrect. We redefine and apply it in the two papers.

 

Re 2NF-5NF, I also suggest you read our just published DATABASE FOUNDATIONS paper #6, The Costly Illusion: Normalization, Integrity and Performance.

 

To your specific questions:

 

1. This kind of design is usually a bad idea, but for reasons other than unnormalization (I discussed them in my second book, UNDERSTANDING RELATIONAL DATABASES).  It is probably due to repeating group thinking (see On Normalization and Repeating Groups).

 

2. If EmployeeShirtSize is not functionally dependent on StoreCode, then StoreCode is not the key!!! By definition, a key is an attribute (or combination of attributes) on which all other attributes are functionally dependent. In other words, it is not the relation that is nonsensical, but your premise that is internally inconsistent.

 

3. You are correct in principle, however calculated/derived attributes are a form of redundancy and, therefore, it is not a good idea to store them in the database anyway. For different kinds of redundancy, some unrelated to normalization, see the chapter Not Worth Repeating in my PRACTICAL ISSUES IN DATABASE MANAGEMENT).

 

 

 

Posted 04/23/04