ON SINGLE-ROW CONSTRAINTS AND DATA TYPES
with Chris Date

 

 

 

From: TH

To: Editor

 

I am aware of Chris Date's classification of constraints into those applying to types, attributes, relations, and databases.  I am interested in commentary on a further classification, that which I call tuple constraints. A tuple constraint is a constraint that applies either to a solitary tuple, or to each tuple in a relation without reference to other tuples in the relation.

 

I have been thinking of this category in connection with a related issue: How do I know when a set of attributes in a relation really should be treated as the components of a type, and not merely as attributes of the relation?  As an example, consider a relation listing the hire date of an employee:

 

VAR hired RELATION

{emp# emp#,

 hire_year INTEGER,

 HIRE_MONTH INTEGER,

 HIRE_DAY INTEGER}

KEY {EMP#}

 

I could conceivably formulate a rather extensive constraint that requires HIRE_YEAR, HIRE_MONTH, and HIRE_DAY to form a valid Gregorian date.  For instance, I could require that, if HIRE_MONTH is 2 (February), HIRE_DAY be less than 29, unless HIRE_YEAR is a leap year.  Such a constraint applies to each tuple in the relation without reference to any other tuples in the relation: it is a tuple constraint.

 

However, I know intuitively that the three attributes HIRE_YEAR, HIRE_MONTH, and HIRE_DAY are actually three components of a possible representation of type DATE (specifically, the Gregorian representation).

 

My questions, therefore, are:

 

1.      By what formal criteria may I assert that these three attributes should appear in this relation as a single attribute of type DATE?

2.      Is a classification of constraints as tuple constraints a legitimate and useful classification, or might such a classification rather serve to signal that a set of attributes in a relation should be treated as components of a type?

 

 

Chris Date Responds: If I understand you correctly, I think what you're trying to do is come up with some guidelines, if not formal principles, to help answer the question:  When I'm doing logical database design, what domains--I prefer the term types myself, as I think you do too--do I need? (Presumably you're talking about a system in which we're not limited to built-in types but can define our own.)

 

I also understand you to be suggesting that if we find that a certain "tuple constraint" interrelates certain attributes A1, A2, ..., An of some relvar in some way, then we might be better off replacing that set of attributes by a single attribute defined over some type T that has a possible representation ("possrep") whose components map one-to-one to those attributes A1, A2,..., An.

 

I find myself in some sympathy with this suggestion.  I also have to say that, sadly, I'm not aware of any formal work that has been done on this issue.  Deciding types has always seemed as difficult to me as deciding entities!  (I hasten to add, probably unnecessarily, that the two issues are certainly not one and the same.)  As far as I'm concerned, therefore, the field is wide open for new contributions...

 

Regarding tuple constraints more generally:  As a matter of fact, I used to include such constraints in my own classification scheme, before settling on the scheme that I now support (which you're familiar with).  It's interesting to note that the SQL standard used to call out tuple constraints as a special case too (they were the only ones it supported, apart from primary and foreign key constraints, prior to SQL:1992).  It's also interesting to note that various products, including earlier releases of (I believe) both Ingres and DB2, adopted very much the same position.  However, I believe the standard and products were originally that way for implementation, not model, reasons (it's clear that tuple constraints are much easier to implement than constraints in their full generality).  From a logical point of view, however, I see tuple constraints as just a simple special case of relvar constraints in general.  It might be pragmatically useful from the point of view of the user and/or the DBA and/or the DBMS to call out that special case and perhaps give it some special syntactic and/or semantic treatment, but (to say it again) I don't see such considerations as a model issue.  Furthermore, if your hunch is right that the existence of certain tuple constraints suggests that we might have done the design wrong (i.e., we should be thinking about a new type instead), then presumably the pragmatic usefulness of such constraints will be diminished (they'll become type constraints instead).

 

So I guess the outstanding question is: How can we precisely characterize those "tuple constraints" that mean we should be thinking about a new type instead?  That looks like a research question to me.

 

 

Ed. Comment: Because the choice of entity-types and domains in conceptual models are based on perceptions of the segments of reality to be represented in the database, I suspect that formalization may not be possible: there are no theoretical justifications for preferring one perception to the other, although there may be pragmatic grounds for such choices. This has proved lately to be also true for primary keys too (see PRACTICAL ISSUES IN DATABASE MANAGEMENT for the pragmatic criteria for choosing a PK from among candidate keys; see also the domain discussion in the first two papers in the DATABASE FOUNDATION SERIES, forthcoming in June.)

 

Posted: 07/11/03

 

 

 

[ABOUT] [QUOTES] [LINKS]