ON AUTO-INCREMENTING SURROGATE KEY GENERATION
with Fabian Pascal and Chris Date

 

 

 

From: PV

To: Editor

 

Is it all right in general to use auto-incrementing fields as primary keys?

 

Many DBMS products support system-generated auto-incrementing integer fields. That's why when I create table of, say, persons, I always create an auto-incrementing field and then make it the primary key, instead of making the combination of the last name, first name, and middle name fields the primary key. I find it convenient to use a single field rather than a combination of fields as the primary key. Furthermore, system-generated keys are guaranteed to be unique by the DBMS.

 

But the problem is, each DBMS implementation seems to have its own way of declaring auto-incrementing fields.

 

In MySQL, you have to issue the statement

 

CREATE TABLE persons

(personid INT PRIMARY KEY

 AUTO_INCREMENT, ...)

 

while in SQL Server, you have to issue

 

CREATE TABLE persons

(personid INT PRIMARY KEY

 IDENTITY, ...)

 

In PostgreSQL, you have to do

 

CREATE TABLE persons

(personid SERIAL PRIMARY KEY,

 ...)

 

and in Interbase, you must issue a sequence of statements like this:

 

CREATE TABLE persons

(personid INT  NOT NULL PRIMARY KEY,

 ...)

CREATE GENERATOR personid_gen

CREATE TRIGGER gen_personid FOR persons

 BEFORE INSERT POSITION 0

 AS BEGIN

     NEW.personid = GEN_ID(personid_gen,1);

    END

 

Why is the declaration of auto-incrementing fields not standard among different DBMS products? It seems that it's not even in the SQL standard.

 

Because of this, I have difficulty in porting my SQL scripts from SQL Server into Interbase. Must a true RDBMS implementation support the use of auto-incrementing fields?

 

 

From: Fabian Pascal

To: PV

 

We may respond more thoroughly to your message on the site, but that may take a while.

 

The short answer is that the relational model requires each table to have a key, but key selection is a pragmatic, not theoretical issue. The chapter on keys in PRACTICAL ISSUES IN DATABASE MANAGEMENT discusses the four pragmatic criteria for selecting keys and explains that often they are in conflict with one another and the optimal tradeoff between them varies by database. In the case of composite keys, familiarity can sometimes be traded off for simplicity and, therefore system-generated surrogate keys may be appropriate.

 

There is no reason why DBMSs should not support auto-incrementing key generation, or why that is not part of the standard, this is a commercial implementation, not relational issue.

 

 

Ed. Note: Here is a quote from What First Normal Form is Not, Paper #2 in our new DATABASE FOUNDATIONS SERIES. It refers to the system-generated ITEM-IDs by multivalue database systems, but it applies equally to relational keys: “As far as we can tell, they do not guarantee logical uniqueness: It may be possible to insert two logically identical ITEMs in a MV file--the MV file manager will generate two ITEM-IDs for them and, thus, consider them unique (this is one reason we frown on system-generated surrogate keys for every table in relational databases).”

 

 

Chris Date Responds: I think you have summarized the pros and cons very well yourself.  In principle, what you call “auto-incrementing [key] fields” are a nice idea; in fact, Hugh Darwen and I very strongly suggest support for such a feature in our book THE THIRD MANIFESTO. The problem is, however (as you observe), that no such feature appears in the SQL standard, and market support for it is either proprietary, or nonexistent. So it’s your call: portability vs. convenience—what is more important to you? (See also my chapter Composite Keys in RELATIONAL DATABASE WRITINGS 1989-1991.)

 

Posted 06/20/03

 

 

 

[ABOUT] [QUOTES] [LINKS]