ON SURROGATE VS. NATURAL KEYS
with Fabian Pascal

 

 

 

From: PR

To: Editor

Date: Oct 6, 2003

 

I just came across your website DATABASE DEBUNKINGS, and am drooling over the content of some of your seminars. Unfortunately, I don't see any scheduled for anywhere near me for the foreseeable future (Washington, D.C. area).

 

I found your website using the Google query "surrogate keys vs. natural keys", which is a topic in one of your seminars. I have also found a good number of other articles about it, all of which fail to address what I believe is the high cost of loss of relational integrity if surrogate keys are used for child tables, as well as for primary tables, and I was hoping you could point me to anything either of you have written, or if you have a short answer to the following:

 

The argument to use surrogate keys for primary entities/tables, rather than natural identifiers, even if presently unique, I find persuasive, unless there are intrinsic reasons to trust that the identifier will never change. (I have copied below Joe Celko's rules for when he will use natural keys). We are now facing in my work domain the idiocy of multiple patient/person identifiers, intended to have been unique but abused, and now there are competing identifiers and continuing confusion, involving how to uniquely identify a person/patient in the military health system (the largest health community in the world, I imagine).

 

Virtually every discussion I have been able to find advocates using surrogate keys, because of the danger that natural keys will change, or will not be unique. No discussions I have been able to find address the cost of using surrogate keys for child entities/tables. If one does, then the primary key of the parent migrates as a foreign key, but not as part of the primary key. This destroys the relational integrity, but the advocates of surrogate keys don't seem to notice, or care-at least, they don't discuss the issue. Am I right that such a practice, which I certainly have observed a lot of, vitiates one of the major purposes and advantages of a relational model?  

 

So, I would be grateful for whatever attention you can give to my quest for information on this topic. Thank you in advance, and I look forward some time to being able to attend a seminar.

 

 

From: Fabian Pascal

To: PR

 

1.      Stay away from Celko!!!!

 

2.      Do not confuse logical and physical levels!!! Performance is determined exclusively at the physical level and has nothing to do with choice of keys, which is a purely logical issue. Now, if you get worse performance with a certain type of key rather than another, that is a product implementation issue and should not be confused with the key choice.

 

3.      We do not recommend the use of surrogate keys as a rule, but only in some rare circumstances and for very specific reasons. Chapter 3 in my book gives ah explanation of the criteria for choosing keys, which also help determining when those special circumstances apply.

 

4.      Too bad you missed my seminar in DC at the beginning of this year. If you can find interested parties in organizing one, I will be glad to do it.

 

 

Posted 01/09/04

 

 

 

[ABOUT] [QUOTES] [LINKS]