MORE ON "A CURE FOR MADNESS"
with Hugh Darwen and C. J. Date

 

 

 

From: ES

To: Editor

Date: 15 Feb 2005

 

A thought on the recent discussions "A Cure for Madness" and "More on A Cure for Madness". Is there in fact a need for defining an extra (shorthand) operator?

 

The proposed ':' operator has at least some vague resemblance to "optimization hints from the user to the system", because it tells the system what expressions (e.g. IS_CIRCLE) must be evaluated before which others (e.g. THE_R).

 

The thing is, there should not be any need at all for the user to specify this to the system, as the system can, and should, already be aware of this, namely by looking at the type hierarchy as it has been defined to the system by that same (community of) user(s).

 

If we take the position that it is legal for the system to assume that the user "knows what he is doing", that means among other things that the system can legally assume that the user is aware of the precise nature of the type system.  In the example, the system can legally assume that the user *knows* that only circles have a radius, and ellipses do not.

 

If a user then queries (a relation having) attributes of type ellipse, and that user specifies a restriction on "radius", it is therefore legal for the system to derive the assumption that this user is only interested in circles specifically.

 

Meaning : the system could apply the IS_CIRCLE restriction automatically, without the user having to specify it.  No need at all to bother the user with this kind of stuff, and no need to "pollute" the language with additional syntactic sugar such as this ":" operator.  I must say I am with RI here.  An extra operator means extra complexity in the sense that mastering the language involves mastering more operators.  And while it is possible for any user to restrict himself to just a basic set of operators when writing, this is obviously not so when reading what was written (i.e. programs) by someone else.

 

(Remark : this is not to say that operators such as IS_CIRCLE should not be made available to the user.  They are useful and they should be provided. This is only to say that I do not see the reason why the user should be forced to mention this particular kind of operator in the kind of situation (restriction) under discussion.  The fact that ellipses do not have radii, is already known in the catalog.  The user should not be forced to duplicate this information in his queries.)

 

(Second remark : of course I am aware that if a user does not "know what he is doing" in the sense mentioned earlier, then that user might be confronted with unexpected and hard-to-explain results. But is that the DBMS's fault ?)

 

For example:

 

TYPE PLANEFIGURE ...

TYPE ELLIPSE IS PLANEFIGURE POSSREP (longaxis LENGTH shortaxis LENGTH center POINT)

TYPE CIRCLE IS ELLIPSE POSSREP (radius LENGTH center POINT)

TYPE UNITY_CIRCLE IS CIRCLE POSSREP (center POINT)

TYPE RECTANGLE IS PLANEFIGURE POSSREP (base LENGTH height LENGTH center POINT)

TYPE SQUARE IS RECTANGLE POSSREP (side LENGTH center POINT)

RELATION X (round : ELLIPSE   cornered : RECTANGLE)

RELATION Y (figure : PLANEFIGURE)

 

 

X WHERE THE_RADIUS(round) > 0.75

 

is translated automatically to:

 

X WHERE (IS_CIRCLE(round) AND THE_RADIUS(round) > 0.75)

 

and returns only tuples with a circle value (i.e. just circles, or the more specialised unity_circles) in 'round' for which the radius exceeds 0.75. If I understood chapter 20 (of the introduction book) correctly, then this should not be a problem, since the THE_RADIUS operator really does exist for UNITY_CIRCLES as well, it is only its usage in *assignment* operations that is ruled out.

 

X WHERE THE_RADIUS(round) > 2 OR THE_SIDE(cornered) > 2

 

is translated automatically to:

 

X WHERE (IS_CIRCLE(round) AND THE_RADIUS(round) > 2) OR (IS_SQUARE(cornered) AND THE_SIDE(cornered) > 2)

 

and returns only tuples with either:

 

·   a circle value in 'round' for which the radius exceeds 2 (of course there would be no unity circles here, since their radius does not satisfy the condition).

·   a square value in 'cornered' for which the side exceeds 2.

 

Y WHERE THE_BASE(figure) > 2

 

is translated automatically to:

 

Y WHERE (IS_RECTANGLE(figure) AND THE_BASE(figure) > 2)

 

and returns only tuples where the figure attribute is either a rectangle or a square whose base (or, for a square, side) is larger than 2.

 

All that needs to be done is for the language interpreter (whether that is a compiler or a true interpreter is irrelevant) to go and find the "supermost" subtype (of the declared type of the attribute) for which the operator used in the expression exists, and then, conceptually speaking, replace the operator in the expression with a boolean expression of the form (IS_T AND operator_invocation_here), where T is the applicable "supermost" subtype found.
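The procedure just described can be sketched in a few lines. This is only an illustration in Python, not Tutorial D, and it rests on assumptions that go beyond the letter: the catalog is modelled as two dictionaries transcribed from the type definitions above, a THE_ operator is taken to exist for a type if the corresponding component appears in the possible representation of that type or of any of its supertypes, and the names has_operator, is_subtype, supermost_subtypes and rewrite are hypothetical. The sketch handles only the case where the "supermost" subtype is unique and refuses anything else.

```python
# Hypothetical catalog, transcribed from the type definitions in the letter:
# immediate supertype and possible-representation components of each type.
SUPERTYPE = {
    "PLANEFIGURE": None,
    "ELLIPSE": "PLANEFIGURE",
    "CIRCLE": "ELLIPSE",
    "UNITY_CIRCLE": "CIRCLE",
    "RECTANGLE": "PLANEFIGURE",
    "SQUARE": "RECTANGLE",
}
POSSREP = {
    "PLANEFIGURE": [],
    "ELLIPSE": ["longaxis", "shortaxis", "center"],
    "CIRCLE": ["radius", "center"],
    "UNITY_CIRCLE": ["center"],
    "RECTANGLE": ["base", "height", "center"],
    "SQUARE": ["side", "center"],
}


def has_operator(type_name, op):
    """True if op exists for type_name, either directly or by inheritance (assumed)."""
    t = type_name
    while t is not None:
        if op in {"THE_" + c.upper() for c in POSSREP[t]}:
            return True
        t = SUPERTYPE[t]
    return False


def is_subtype(type_name, ancestor):
    """True if type_name is ancestor itself or one of its subtypes."""
    t = type_name
    while t is not None:
        if t == ancestor:
            return True
        t = SUPERTYPE[t]
    return False


def supermost_subtypes(declared, op):
    """Subtypes of declared that have op while their immediate supertype does not."""
    if has_operator(declared, op):
        return [declared]  # op already applies to the declared type
    return [t for t in SUPERTYPE
            if is_subtype(t, declared)
            and has_operator(t, op)
            and not has_operator(SUPERTYPE[t], op)]


def rewrite(declared, attr, op, comparison):
    """Return the condition text, guarded by IS_T when a unique T is found."""
    call = f"{op}({attr}) {comparison}"
    candidates = supermost_subtypes(declared, op)
    if candidates == [declared]:
        return call  # no guard needed
    if len(candidates) != 1:
        raise ValueError(f"no unique supermost subtype for {op}: {candidates}")
    return f"(IS_{candidates[0]}({attr}) AND {call})"


print(rewrite("ELLIPSE", "round", "THE_RADIUS", "> 0.75"))
print(rewrite("PLANEFIGURE", "figure", "THE_BASE", "> 2"))
```

Under these assumptions the two calls reproduce the first and third translations given earlier, while a request such as rewrite("PLANEFIGURE", "figure", "THE_CENTER", "= POINT(0, 0)") is rejected because both ELLIPSE and RECTANGLE qualify.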

 

For such an algorithm to be applicable, one of the following is required:

 

·   that such a "supermost subtype" be unique within the type (and within any of its own supertypes).  For example, SQUARES cannot have a radius, because within the type PLANE_FIGURE, a radius is already defined for type CIRCLE.

 

·   absent such uniqueness, that the "expression extension procedure" is prepared to find ALL the supermost subtypes for which the operator is valid, and extend the expression to a form (IS_T1 OR IS_T2 OR IS_Tn) and ... (where T1, T2, Tn would be the set of all types found).
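The second alternative, in which the expression is extended with a disjunction of IS_T tests, might look roughly like the following. Again this is a Python sketch under assumptions not stated in the letter: the catalog dictionaries are a transcription of the type definitions above, operators are assumed to be inherited by subtypes, and the helper names (operators, ancestors, supermost_subtypes, extend_condition) are made up for the illustration. The sketch only builds the guarded text; it says nothing about which definition of the operator would then be invoked for each type.

```python
# Hypothetical catalog, transcribed from the type definitions in the letter.
SUPERTYPE = {
    "PLANEFIGURE": None,
    "ELLIPSE": "PLANEFIGURE",
    "CIRCLE": "ELLIPSE",
    "UNITY_CIRCLE": "CIRCLE",
    "RECTANGLE": "PLANEFIGURE",
    "SQUARE": "RECTANGLE",
}
POSSREP = {
    "PLANEFIGURE": [],
    "ELLIPSE": ["longaxis", "shortaxis", "center"],
    "CIRCLE": ["radius", "center"],
    "UNITY_CIRCLE": ["center"],
    "RECTANGLE": ["base", "height", "center"],
    "SQUARE": ["side", "center"],
}


def ancestors(type_name):
    """The type itself followed by all of its supertypes (empty for None)."""
    chain = []
    while type_name is not None:
        chain.append(type_name)
        type_name = SUPERTYPE[type_name]
    return chain


def operators(type_name):
    """All THE_ operators assumed available for a type, own or inherited."""
    return {"THE_" + c.upper() for t in ancestors(type_name) for c in POSSREP[t]}


def supermost_subtypes(declared, op):
    """ALL proper subtypes of declared that have op while their supertype lacks it."""
    return [t for t in SUPERTYPE
            if declared in ancestors(SUPERTYPE[t])
            and op in operators(t)
            and op not in operators(SUPERTYPE[t])]


def extend_condition(declared, attr, op, comparison):
    """Guard the condition with (IS_T1 OR IS_T2 OR ...) over every type found."""
    call = f"{op}({attr}) {comparison}"
    if op in operators(declared):
        return call  # the declared type already has the operator
    found = supermost_subtypes(declared, op)
    if not found:
        raise ValueError(f"{op} is not defined for any subtype of {declared}")
    guard = " OR ".join(f"IS_{t}({attr})" for t in found)
    return f"(({guard}) AND {call})"


print(extend_condition("PLANEFIGURE", "figure", "THE_CENTER", "= POINT(0, 0)"))
```

With this catalog, THE_CENTER on an attribute of declared type PLANEFIGURE is guarded by both IS_ELLIPSE and IS_RECTANGLE, whereas THE_RADIUS still yields a single IS_CIRCLE test.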

 

Which of the two is desirable, I cannot tell, but I observe that, if squares have radii (so to speak), then the semantics should at least still be the same, because otherwise there would be two distinct types of PLANE_FIGURE both having an operator of the same name (THE_RADIUS), the precise meaning of which depends on the particular type of PLANE_FIGURE. This is ruled out (once again, if I understood chapter 20 correctly, of course).

 

(Third remark : this need not affect the declared type of the results, so TREAT_DOWN operators would still be needed.  The only thing "eliminated" would be the need for the existence of the : shorthand.)

 

 

C. J. Date Responds: I don't have time to respond in detail, except to say that I think the paragraph:

 

Which of the two is desirable, I cannot tell, but I observe that, if squares have radii (so to speak), then the semantics should at least still be the same, because otherwise there would be two distinct types of PLANE_FIGURE both having an operator of the same name (THE_RADIUS), the precise meaning of which depends on the particular type of PLANE_FIGURE. This is ruled out (once again, if I understood chapter 20 correctly, of course).

 

is incorrect.

 

Let PF be of declared type PLANE_FIGURE. Does THE_CENTER(PF) mean (IS_ELLIPSE(PF) AND ...) or (IS_RECTANGLE(PF) AND ...)?

 

 

Hugh Darwen Responds: Regarding the possibility of certain conditions being implied by the use of certain operators in certain comparisons (for example, THE_radius(C) > 1.0 being short for IS_CIRCLE(C) AND THE_radius(C) > 1.0), I think the approach is extremely incautious and I would not recommend such language design.

 

Some points that ES does not discuss and therefore might not have considered carefully:

 

1. Using ES's own example, consider relvar Y. How is "Y WHERE THE_center(figure) = POINT ( 0, 0 )" to be evaluated?  I'm assuming that THE_center is not defined for plane figures in general, even if that was not ES's intention. My general point concerns operators that are "overloaded" (i.e., have different semantics for different types of operand), even if ES did not really intend THE_center to be overloaded.

 

2. Would "EXTEND Y ADD THE_radius(figure) AS radius" be legal in ES's approach?  If so, there would presumably have to be an implied restriction and perhaps an implied TREAT (as with our ":") too.  In that case the operation is no longer a true extension even though the operator is still spelled EXTEND.  If, on the other hand, the example is illegal, then we have loss of orthogonality and it has to be explained why THE_radius accepts an argument of declared type PLANEFIGURE in some contexts and not others. Of course, restriction isn't the only place where conditional expressions can be used.  Consider, for example, "EXTEND Y ADD (CASE WHEN THE_radius(figure) > 1.0 THEN 'Yes'; ELSE 'No'; END CASE) AS radius_gt_1".

 

3. Taking ES's argument to the extreme, an expression such as R WHERE X/Y = 10 would be shorthand for (R WHERE Y <> 0) WHERE X/Y = 10, regardless of whether NONZERO is a declared subtype of (e.g.) INTEGER.  Are we to do away with run-time exceptions altogether?

 

4. In any case, ES's longhands are not considered to be sound, in the language design community at large (to the extent that I am familiar with that community).  Note how I carefully wrote my example in point 3, using a nested WHERE rather than AND.  The expressions "R WHERE Y <> 0 AND X/Y = 10" and "R WHERE X/Y = 10 AND Y <> 0" should both throw a zero-divide exception. In general, the system should be free to evaluate the operands of commutative operators such as AND in any order.  Languages that specify, for example, left-to-right evaluation and make the evaluation of Y <> 0 AND X/Y = 10 dependent on the system stopping when Y <> 0 evaluates to false are strongly deprecated.  Date and I certainly wouldn't want Tutorial D to tread in such dangerous waters.  Nor do we want Tutorial D to invite criticism of possibly avant-garde ideas that we consider to be irrelevant to our cause.

 

5. ES does not tell us what the declared type of the relevant attribute is in, for example, "Y WHERE THE_BASE(figure) > 2".  Is it PLANEFIGURE or RECTANGLE?  And what about "Y WHERE THE_BASE(figure) > 2 OR THE_radius(figure) > 2"?  If the answer is just PLANEFIGURE in each case, then I would have to point out the advantage of our ":" operator that is not being obtained in ES's approach.

 

6. <ES wrote> The proposed ':' operator has at least some vague resemblance to "optimization hints from the user to the system", because it tells the system what expressions (e.g. IS_CIRCLE) must be evaluated before which others (e.g. THE_R). </ES wrote>  I emphatically reject the "optimizer hint" characterisation!  My remarks in points 4 and 5 are possibly relevant here. ":" has the effect of specialising the declared type of one of the attributes of its operand.
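The difference between the nested WHERE of point 3 and the AND forms of point 4 can be made concrete with a small sketch. It is written in Python rather than Tutorial D, the relation contents are made up, and where, raises_zero_divide and strict_and are hypothetical helpers. Python's own "and" happens to be the left-to-right, short-circuiting kind of operator criticised in point 4, so strict_and is used to imitate an AND whose operands are both evaluated.

```python
# Made-up tuples standing in for relation R with attributes X and Y.
rows = [{"X": 20, "Y": 2}, {"X": 5, "Y": 0}, {"X": 7, "Y": 1}]


def where(relation, predicate):
    """Restriction: keep the tuples for which the predicate holds."""
    return [t for t in relation if predicate(t)]


def raises_zero_divide(predicate):
    """Report whether restricting rows by predicate hits a division by zero."""
    try:
        where(rows, predicate)
        return False
    except ZeroDivisionError:
        return True


def strict_and(a, b):
    """AND whose two operands have both been evaluated before it is applied."""
    return a and b


# (R WHERE Y <> 0) WHERE X/Y = 10: the inner restriction removes the tuple
# with Y = 0 before the division is ever attempted.
nested = where(where(rows, lambda t: t["Y"] != 0),
               lambda t: t["X"] / t["Y"] == 10)

# Short-circuit AND: safe only because evaluation stops when Y <> 0 is false.
guard_first = raises_zero_divide(
    lambda t: t["Y"] != 0 and t["X"] / t["Y"] == 10)

# Same operands in the other order: the division is reached for Y = 0.
divide_first = raises_zero_divide(
    lambda t: t["X"] / t["Y"] == 10 and t["Y"] != 0)

# Both operands evaluated regardless of order: the division is always reached.
both_evaluated = raises_zero_divide(
    lambda t: strict_and(t["Y"] != 0, t["X"] / t["Y"] == 10))

print(nested, guard_first, divide_first, both_evaluated)
```

Only the nested form and the guard-first short-circuit form avoid the zero-divide here, and the latter does so purely because of the evaluation order that the language happens to guarantee.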

 

I'm sure I could go on if I had the time, but I hope—sincerely!—that what I've written will suffice to dissuade anybody from pursuing the approach advocated by ES.

 

 

Posted 4/8/05