From: TK
To: Editor
Date: 10 Jan 2005
In the book WHAT
NOT HOW - THE BUSINESS RULES APPROACH TO APPLICATION DEVELOPMENT, you
provide the following examples to illustrate domain and table constraints,
respectively:
Domain constraint example
TYPE QTY
POSSREP { Q INTEGER WHERE
Q > 0 AND Q < 5001 } ;
Table constraint example (shorthand version)
S.STATUS > 0 AND S.STATUS <= 100
Since each of these ultimately constrains values for the
columns upon which they are defined in a similar way, what is the criteria one
would use in order to decide which type of constraint to use?
My own thinking is that if a particular numeric column is one
to which mathematical operations would commonly be applied, the column type
should be the numeric type and the valid range constraint should be expressed
as a table constraint. On the other hand, if the column is one to which
computations would not normally be applied, a domain should be defined (to
include the valid range) and the column should be constrained to be of that
type. Note that this is exactly opposite the examples from your book above. My
reasoning is that since a domain requires an "extra" operation (THE_...
operator) to get to the underlying value, one is less likely to attempt
inappropriate computations on it.
C. J. Date Responds: Your question, slightly
rephrased, is: What are the criteria
for deciding when to use a domain constraint and when a table constraint? (Since you use the terms domain and table,
I'll stick with them myself in what follows.)
In general, I don't think we yet have a good answer to this
question (just like we don't yet, in my opinion, have a good answer to the
majority of database design questions!).
But there are a couple of points that can usefully be made—though I
should admit right up front that they're pretty obvious ones. Anyway, here they are, for what they're
worth.
First, if what's effectively the same constraint applies to
several distinct columns, then we're probably talking about a domain
constraint. As a simple example,
suppose columns A, B, C, ..., Z all contain integers, but they're all subject
to the constraint that the integers in question mustn't be negative. OK:
Define a domain (type), thus:
TYPE NON_NEG_INTEGER
POSSREP ( I INTEGER
WHERE NOT ( I < 0 ) ) ;
And then define each of the columns A, B, C, ..., Z to be of
this domain (type).
Second, you say:
"[If] a particular numeric column is one to which mathematical
operations would commonly be applied, the column should be the numeric
type." The problem with this
suggestion is that it isn't very type-safe.
If you define column WEIGHT and column QUANTITY both to be of type
(domain) NUMERIC, then you make nonsensical expressions like WEIGHT + QUANTITY
legal. I think it's better to give them
different domains, WT and QTY say. And
if you want to allow, say, "*" (though not "+") to apply to
a value of type WT and a value of type QTY, state that fact explicitly:
OPERATOR "*" ( W WT, Q QTY ) RETURNS WEIGHT ;
RETURN WEIGHT ( THE_NUM
( W ) * ( THE_NUM ( Q ) ) ;
END OPERATOR ;
(I'm assuming here that types WT and QTY both have a possrep
with a single component, called NUM, of type NUMERIC. Note, incidentally, how the implementation of "*" for
weights and quantities here is effectively being delegated to the
implementation of "*" for numbers.)
I'm sure there's much more that could be said about these
matters, but I'm waiting for someone to carry out a decent investigation into
the topic; so far as I know, the field is pretty much wide open.
Posted 3/11/05