ON CODD'S AND NORMALIZATION RULES
with Fabian Pascal

 

 

 

From: Steve Hartley 

Date: 1 Aug 2005

 

Just wondered if you might be prepared to divulge definitive versions of Codd's "12 rules" and rules of normalisation. I see them everywhere but they are all worded differently.

 

 

From: Fabian Pascal

 

I would not pay too much attention to Codd's rules at this point, they have been deprecated. Codd formulated them as a quick and dirty way to counter all the nonsense that was floating at that early time, they are not orthogonal or systematic, and our understanding of RM has progressed considerably since then.

 

For the current understanding of what the relational model is see (a) Date and Darwen's view, expressed in THE THIRD MANIFESTO; and my (and David McGoveran's) view in the re-launched PRACTICAL DATABASE FOUNDATIONS papers (they are not identical, but possibly compatible) [the first of which is Truly Relational: What It Really Means].

 

Regarding normalization, see (a) Date's AN INTRODUCTION TO DATABASE SYSTEMS 8th. ed., or his latest book DATABASE IN DEPTH – REL. THEORY FOR PRACTITIONERS (b) my paper, The Costly Illusion: Normalization, Integrity and Performance.

 

 

From: Steve Hartley

 

Many thanks for your quick response. I have been struggling for some time to find some accurate guidance. I have previously designed and implemented an insurance system utilising a "multi-value" database (Pick). I am now trying to attain a BCS CITP accreditation which requires me to account for my recent experience and achievements and the underpinning theory behind them. I think that my approach to database design is best summed up by a response of yours which I found on the internet earlier, namely:

 

"The best way to think of this is, informally, that in a fully normalized database every entity-type in the conceptual model maps to exactly one table in the logical database. It is only then that all non-key attributes in every relvar are about the key, the whole key and nothing but the key."

 

The system that I designed fulfilled this criterion and resulted in exceptional ease of data access and maintainability, with major new modules fitting seamlessly into the existing structure.

 

I realise that databases like Pick are often dismissed as unworthy, but I wondered if you could direct me towards an accurate assessment of how far they can be deemed to be fully normalized (ref your response above), and how I could define my design as following specific database rules, in order to allow me to achieve my CITP accreditation. Additionally I have heard Pick databases being described as multivalued, post-relational etc. Could you point me towards any article or book that addresses any of these issues in an objective way?

 

Once again, I would be grateful for your advice.

 

 

From: Fabian Pascal

 

Multivalued databases do not fulfill the most basic of relational requirements: their files are NOT relations, because they are not in 1NF.

 

See 2-part paper which deals exactly with this issue:

 

WHAT FIRST NORMAL FORM REALLY MEANS

 

WHAT FIRST NORMAL FORM MEANS NOT

 

 

From: Steve Hartley

 

hope you don't mind another response though! It is difficult to get an objective view about multivalued databases and people are often passionately for or against them. I'm guessing that you are in the latter camp. Nevertheless, they are databases, and your site is about dispelling persistent prevalent database management fallacies, without apparently discriminating against specific database types. As I mentioned earlier, I am looking for a way to give an "academic" justification for what I have developed. I wasn't in a position at the time to tell my employers that they had got the wrong type of database and should throw it out. I worked with it and applied what I felt was good practice to it. It appears that people are willing to debate the various data storage methods that led to the advent of relational databases, and relational databases themselves, yet dismiss multivalued data storage as if it never existed. I believe that there is a possibility for objective analysis and an examination of how far good practice can make a success of multivalued data storage in the business world, where reality sometimes dictates that you have to work with what you have been given. I used your quote "in a fully normalized database every entity-type in the conceptual model maps to exactly one table in the logical database" in my last email. I think that it demonstrates that certain rules of data management can be common to both relational and multivalued databases.

 

I am not a mathematician, but I am very keen to learn as much as possible about database management. I have come to ask your advice as a respected figure because no one will talk objectively about the type of database I have used. I am intent on learning about true RDBMS structure as well but do still wish to be able to provide a structured and technical analysis of the design of my database. I have this morning bought "An Introduction to Database Systems" as you recommended, and intend to continue my self-education in that respect, but I would dearly love to be pointed in the right direction to help me "compare and contrast" the two types of database.

 

 

From: Fabian Pascal

 

Sorry, but you're going the wrong way about it and this will never work properly. You first developed a database without proper knowledge of fundamentals, using a nonrelational product, and now you want to postfit a scientific rationalization for it. That is why you don't know how to give it the correct justification, and why you cannot see through the arguments for and against.

The only correct way to proceed correctly is to first learn the fundamentals, then select the methods and products you work with. No way around that.

IOW, you came to be too late for your developed project, hope your future ones will be better.

 

 

From: Steve Hartley

 

I didn't mean to upset you. One of the central points that I have made is that it is very difficult to get anyone to rationally discuss the points for and against. I am trying to be objective, not rude, but certain realities just won't go away. Multi-valued databases do exist. They are databases. In my case I did apply rules which appear to be as relevant for relational as for multi-valued databases, as justified by your own quotation which I supplied earlier. My project was also a great success, and is in use nationally at this moment. As I stated earlier, I realise that there is a great deal of passion between the relational and multivalued database camps, but I do not want to take part in that. I communicated with you to see if you could give me advice about whether anyone has ever undertaken a scientific comparison of the two types from a purely objective standpoint.

 

If that is the case then I would be very grateful if you could direct me to it.

 

 

From: Fabian Pascal

 

I am not upset at all. I don't think you understood me.

The two papers I pointed to explain why MV files and products are not relational. With all due respect to your efforts, there is nothing much you can do if the databases and DBMS are not relational and violate the most basics of relational principles.

There is no way to postfit relational benefits to MV, and no proper comparison can do that. What is more, you will not be able to judge whether what you are reading is correct or not, whether it's passionate or not. The only way to avoid passion is via knowledge and reason.

To repeat, without learning the fundamentals, you're flying blind, and there is NO way around it. Just relying on one of my quote is not enough.

 

 

Posted 10/21/05