From: RI
To: Editor
Date: 29 Sep 2004
I found your article More on Cure For Madness,
and indeed the original A Cure For Madness,
an interesting read that has got me thinking about the points you raised. However,
I find myself in disagreement with you, which is why I am writing.
In your article you provide a justification for a sequenced
WHERE clause ( the : operator ) that I'm afraid I don't find very compelling,
at least with regard to the examples you gave. Your justification for it seems
to be:
1) It solves the problem of arbitrary re-ordering of WHERE
clauses by the implementation, some of which cause run-time errors and some of
which do not.
2) It's a shorthand for redefining attributes of a relation
to their sub-types while restricting rows to only those sub-types.
You give an example:
s WHERE THE_R ( E ) > LENGTH ( 2.0 ) AND IS_CIRCLE ( E )
which you correctly point out is a syntax error, but what
about transforming it to:
s WHERE THE_R ( TREAT_DOWN_AS_CIRCLE ( E ) ) > LENGTH ( 2.0 ) AND IS_CIRCLE ( E )
This, as I understand it from your AN INTRODUCTION TO DATABASE
SYSTEMS, 7th Ed., is legitimate syntax, and one that doesn't change the
result type (any of the attributes) of the expression, unlike your proposed use
of ':'.
It looks like it still has a problem with a possible runtime
error if the first part of the AND is executed first, but is this really a
legitimate concern? It seems to me it should be possible to convert any
predicate, such as the one in this where clause, into a relation such that the
predicate becomes a test for membership in the relation. Haven't you stated
something like this somewhere? A test for membership cannot possibly generate a
run-time error, so why should it be acceptable for an implementation to choose an
execution path for a predicate that does give one?
Whilst I wouldn't dream of forcing implementations to always
do such a transformation, I do believe the logical model could define
evaluation of predicates in terms of relations. By doing so would this not
result in evaluation strategies in which runtime errors are unlikely if not
impossible?
Take something even simpler than your example:
s{A,B} where A > 0 and B / A = 2
Leaving the meaning to arbitrary interpretation by the
implementation gives scope for a divide-by-zero error. Consider that the expression
can be transformed to:
s{A,B} where A > 0 and B = 2 * A
This is a legitimate transformation that has no scope for an
error so it is unreasonable for one to occur in any actual implementation of
the original.
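
The point about evaluation order can be illustrated outside Tutorial D. The following is only a minimal Python sketch, not anything from the article: the sample rows and function names are hypothetical, and swapping the conjuncts by hand merely stands in for a reordering that an implementation might choose.

```python
# Hypothetical rows of s{A,B}; the values are illustrative only.
rows = [(0, 5), (1, 2), (3, 6), (2, 5)]

def original_left_to_right(a, b):
    # `and` short-circuits, so b / a is evaluated only when a > 0 holds.
    return a > 0 and b / a == 2

def original_reordered(a, b):
    # The same conjunction with its conjuncts swapped.
    return b / a == 2 and a > 0

def rewritten(a, b):
    # The transformed predicate: no division, so no ordering can fail.
    return a > 0 and b == 2 * a

def restrict(pred):
    # Keep the rows for which the predicate evaluates to true.
    return [row for row in rows if pred(*row)]

safe = restrict(original_left_to_right)
also_safe = restrict(rewritten)

try:
    restrict(original_reordered)
    reordered_failed = False
except ZeroDivisionError:
    reordered_failed = True
```

In this sketch the left-to-right and rewritten predicates select the same rows, while the reordered form raises on the row where A is 0.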
Your example using ELLIPSE and CIRCLE could be re-arranged as:
s WHERE THE_A ( E ) > LENGTH ( 2.0 ) AND IS_CIRCLE ( E )
without altering its meaning but eliminating the runtime error.
I'm not blind to the difficulty of what I propose. I don't
know how to do such transformations for all possible predicates as user defined
operators complicate things enormously, but it seems to me to be a worthy goal.
Getting back to your second justification for the : operator
(converting/restricting attribute types): this also seems unnecessary to me
because Tutorial D already allows:
(extend ((s where IS_CIRCLE(E)) rename E as X)
add TREAT_DOWN_AS_CIRCLE(X) as E) { ALL BUT X }
OK, it's longwinded, but that's a fault of the existing
syntax. It's hardly a good excuse for adding more complexity to the language.
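
For readers who want to trace the longhand step by step, here is a rough Python analogue. It is a sketch under stated assumptions rather than Tutorial D semantics: the classes Ellipse and Circle, the helpers is_circle and treat_down_as_circle, and the sample relation are all hypothetical stand-ins, and Python's run-time type check only approximates TREAT_DOWN.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Ellipse:
    a: float  # major semiaxis
    b: float  # minor semiaxis

@dataclass(frozen=True)
class Circle(Ellipse):
    @property
    def r(self) -> float:
        # Radius is defined only for circles, mirroring THE_R.
        return self.a

def is_circle(e) -> bool:
    return isinstance(e, Circle)

def treat_down_as_circle(e) -> Circle:
    # Fails at run time if the value is not a circle.
    if not isinstance(e, Circle):
        raise TypeError("value is not of type CIRCLE")
    return e

# Hypothetical relation s with attributes ID and E.
s = [
    {"ID": 1, "E": Circle(3.0, 3.0)},
    {"ID": 2, "E": Ellipse(4.0, 2.0)},
    {"ID": 3, "E": Circle(1.0, 1.0)},
]

def restrict_and_treat_down(rel, attr):
    out = []
    for row in rel:
        if is_circle(row[attr]):  # s WHERE IS_CIRCLE(E)
            # RENAME E AS X
            renamed = {("X" if k == attr else k): v for k, v in row.items()}
            # ADD TREAT_DOWN_AS_CIRCLE(X) AS E
            extended = {**renamed, attr: treat_down_as_circle(renamed["X"])}
            # { ALL BUT X }
            out.append({k: v for k, v in extended.items() if k != "X"})
    return out

circles = restrict_and_treat_down(s, "E")
# With E known to hold circles, the radius can be used without risk.
big = [row["ID"] for row in circles if row["E"].r > 2.0]
```

Because the restriction runs before the treat-down in every row, the radius comparison at the end never sees a non-circle.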
I hope this has made some sort of sense.
C. J. Date Responds:
It's good to see someone paying such careful attention! I just have a few points I'd like to make in
response:
1. Our syntax R:IS_T(A) is defined to be shorthand for an expression
somewhat similar to the one you quote:
( R WHERE IS_T ( A ) ) TREAT_DOWN_AS_T ( A )
2. You say that [the fact that the longhand version is
longwinded] is "hardly a good excuse for adding more complexity to the
language." But shorthands don't add complexity--they reduce it! See the
remarks regarding syntactic substitution in the annotation to reference [35] in
FOUNDATION FOR FUTURE DATABASE SYSTEMS: THE THIRD MANIFESTO.
3. Your suggestion to the effect that we could avoid the
problem under discussion in the ellipses-and-circles example by using THE_A
instead of THE_R is valid, of course, and I have some sympathy with it. In general, however, I think such a scheme
would be "unnatural" and awkward and hard to use. Suppose we replace types ELLIPSE and CIRCLE
by types PENTAGON and REGULAR_PENTAGON, respectively (with the obvious
semantics); let attribute P of relvar R be of type PENTAGON; and consider the
following example:
R:IS_REGULAR_PENTAGON(P) WHERE THE_CTR(P) = POINT(0.0,0.0)
I would hate to have to follow your suggestion in this
particular example.
Posted 2/4/05