MORE ON TYPE INHERITANCE
with Chris Date

 

 

 

I have just read--with great interest--your articles on substitutability and, intrigued by them, the original "Is a Circle an Ellipse?" And I have this funny notion that your attack is much broader than you portray it.

 

First of all, let me praise you for the wonderful, enlightening distinction between values and variables. Indeed, object-speak gets confusing there sometimes, when one tries to deal with the fundamentals. But it seems that you fail to push the distinction far enough. Let me also say that my comments are based on your article, and I should probably read THE THIRD MANIFESTO before saying anything about your inheritance model; however, I think that even if I got things wrong, it may help you explain your position better. Now, to the point:

 

Circles, of course, are Ellipses. And yet, Circles are not Ellipses. How can I claim this contradiction with a straight face? Easy: each of the two horns of the dilemma is true in a different context. Circle values are Ellipse values. Circle variables are not Ellipse variables--because an Ellipse variable can hold a value that is strictly an ellipse (i.e., not a circle), and a Circle variable cannot. In the article, you explicitly say as much.
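
The value-versus-variable distinction in the paragraph above can be made concrete with a small sketch. It is only an illustration in Python, not anything taken from the article: the class names, the Var wrapper, and its assign method are all hypothetical. Values are modeled as immutable objects, and a variable is modeled as a typed holder that checks each assignment against its declared type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ellipse:
    """An immutable ellipse VALUE with semiaxes a and b (assumed names)."""
    a: float
    b: float


@dataclass(frozen=True)
class Circle(Ellipse):
    """A circle value is an ellipse value whose semiaxes are equal."""

    def __post_init__(self):
        if self.a != self.b:
            raise ValueError("a circle value must have a == b")


class Var:
    """A hypothetical VARIABLE: a holder with a declared type."""

    def __init__(self, declared_type, value):
        self.declared_type = declared_type
        self.assign(value)

    def assign(self, value):
        # Assignment is legal only if the value is of the declared type.
        if not isinstance(value, self.declared_type):
            raise TypeError(
                f"cannot assign {type(value).__name__} value to a "
                f"variable of declared type {self.declared_type.__name__}"
            )
        self.value = value


# An Ellipse variable accepts a Circle value: circle values are ellipse values.
e = Var(Ellipse, Ellipse(5.0, 3.0))
e.assign(Circle(2.0, 2.0))

# A Circle variable rejects a value that is strictly an ellipse.
c = Var(Circle, Circle(1.0, 1.0))
try:
    c.assign(Ellipse(5.0, 3.0))
    rejected = False
except TypeError:
    rejected = True
```

In this sketch the subtype relationship holds between the values, while the two variables differ in what they are allowed to hold.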

 

The thing that probably upsets so many people when reading your article is, I think, that you propose a model of inheritance that is incompatible with their prior expectations: one in which there are update operations that are "not unconditionally inherited". This means that what you call (in another place, I think) "inclusion polymorphism" doesn't work for update operations in your inheritance model. The consequence is that you get the benefits of inheritance--guaranteed behavior of subtype entities--only inasmuch as these entities are values, and not variables; in other words, only if your program is purely functional.

 

A main objective of the inheritance model used in C++ is exactly the point you so easily discard: To make sure that if a is a variable of type A, b is a variable of type B, and A is a subtype of B, any operation that applies to b also applies to a. Without this guarantee, you cannot write polymorphic updaters: You are forced to choose between the ability to pass variables by reference, and the ability to rely on inheritance.
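
The guarantee described above might be sketched as follows. Python stands in for C++ here, and the classes MutableEllipse and MutableCircle and the function stretch are hypothetical names invented for the illustration. Objects are mutable and are passed by reference, so a single updater serves every subtype.

```python
class MutableEllipse:
    """A mutable ellipse object with semiaxes a and b (assumed names)."""

    def __init__(self, a, b):
        self.a = a
        self.b = b

    def set_a(self, a):
        # Update operation, inherited unconditionally by every subclass.
        self.a = a


class MutableCircle(MutableEllipse):
    """A subclass constructed with equal semiaxes."""

    def __init__(self, r):
        super().__init__(r, r)


def stretch(shape, factor):
    """A polymorphic updater: accepts any MutableEllipse, subtypes included."""
    shape.set_a(shape.a * factor)


e = MutableEllipse(5.0, 3.0)
c = MutableCircle(2.0)
stretch(e, 2.0)
stretch(c, 2.0)   # accepted, because set_a applies to every subtype
print(e.a, e.b)   # 10.0 3.0
print(c.a, c.b)   # 4.0 2.0
```

In this sketch the call on c succeeds, and c remains an instance of MutableCircle afterward even though its semiaxes now differ; nothing in the class enforces a == b.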

 

Now, when I combine this with what you've written in other places on object ids and pointers, it would seem that passing variables by reference is also not your cup of tea--since it is essentially equivalent to pointers. But without it, the only update operations can be assignments; the only side effects can be assignments to global variables. The attack, then, seems not only on standard OOP, but also on almost all procedural programming.

 

Most people like writing programs with side effects, and the OO crowd even more so. And most people like their type systems hierarchic, rather than flat. Your inheritance model--at least as far as it is sketched in the article--does not offer a way to realize these two preferences simultaneously. Stroustrup's model does. The price he pays for this is in mathematical rigor--you may have it (as I've said, I still need to read the MANIFESTO), Stroustrup doesn't. But the feature your model misses is not negligible.

 

 

Chris Date Responds: It was a pleasure to receive your message.  It's very heartening to find that some people do read things carefully and do bring an analytical focus to bear on what they read.  I'm particularly pleased to hear that you find the value vs. variable distinction "wonderful" and "enlightening"!  Hugh Darwen and I find that distinction to be amazingly useful in our continuing--and never easy--struggle (a) to think clearly and (b) to express the results of our thought processes clearly (when we think those results might be worth the effort, which isn't always). 

 

I want to make four points in response to your observations.  The first is just a matter of clarification; the next two are elaborations on issues where we might be in some disagreement (I'd like to persuade you, if I can, that any such disagreement might be more apparent than real); the last has to do with something that I think is very important but wasn't touched on in my original article. 

 

1. "[An] Ellipse variable can hold a value that is strictly an ellipse ... and a Circle variable cannot.  In the article, you explicitly say as much."

 

Quite right.  I just want to add that I believe this position to be utterly noncontroversial.  Even in C++, for example, I don't think anybody wants to be able to assign a value that is "strictly an ellipse" to a variable that is declared to be of type Circle.  Thus, I don't think anybody wants the operation "assign an ellipse value" to be inherited unconditionally by variables of declared type Circle.  (Just as--to take a more mundane example--I don't think anybody wants to be able to assign a value that is "strictly an integer" (and hence possibly an even integer) to a variable that is declared to be of type OddInteger.)  Since assignment is the only update operator that's logically necessary, then, what can "unconditional inheritance of update operators" possibly mean?

 

2. "[You] get the benefits of inheritance--guaranteed behavior of subtype entities--only ... if your program is purely functional ... [You] cannot write polymorphic updaters." 

 

If I understand your remarks here correctly, then I don't agree with you.  A major objective of inheritance as I understand it is code reuse.  (I also think code reuse is what you're referring to when you talk about what you call "a main objective of the C++ model."  I don't fully agree with the way you characterize that objective, but it's not worth pursuing that point here.)  So consider a program P that's written to operate on ellipses, and suppose for the sake of the discussion that type ELLIPSE is a leaf type (type CIRCLE hasn't been defined yet).  Presumably P will contain some variable, E say, of declared type ELLIPSE; for generality, what's more, let's assume that P performs various updates on E.  Obviously, everything works fine so long as ELLIPSE remains a leaf type.  But what happens when CIRCLE is defined as a proper subtype of ELLIPSE?  Can P be "reused" without change on ellipses that happen to be circles?

 

In our model, the answer to this question is yes.  The key to understanding this answer is tucked away in a short bulleted paragraph near the end of my original article.  (The paragraph in question was admittedly almost an aside, and I'm not surprised if you didn't pick up on it or realize its significance.)  This is that paragraph: 

 

“Let E be a variable of declared type ELLIPSE.  Then updating E in such a way that a = b after the update means the most specific type of the current value of E is now CIRCLE.  Likewise, updating E in such a way that a > b after the update means the most specific type of the current value of E is now "just ELLIPSE."”

 

We refer to the effects mentioned in the second and third sentences here as specialization by constraint and generalization by constraint, respectively (and abbreviate them as S by C and G by C).  Thus, if we invoke program P and pass it a circle instead of "just an ellipse," everything works just fine!--thanks to S by C and G by C.  To be more specific, all operations that worked on variable E before still work, including update operations, even if E contains a value of most specific type CIRCLE at run time.  What's more, we achieve this desirable result (a) without producing noncircular circles or similar “nonsenses”, and (b) without sacrificing the crucial ability to define what we call type constraints.  (Just as an aside, I note that the SQL standard "SQL:1999" does produce noncircular circles and does sacrifice the ability to define type constraints.) 
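
A rough sketch of how S by C and G by C might behave is given below. It is written in Python purely for illustration and is not the mechanism defined in the MANIFESTO book; the selector function ellipse, the Var holder, and the program p are hypothetical names. The sketch assumes that every update is shorthand for assigning a freshly selected value, and that the selector always returns a value of the most specific type its arguments allow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ellipse:
    """Immutable ellipse value; the type constraint is a >= b."""
    a: float
    b: float

    def __post_init__(self):
        if self.a < self.b:
            raise ValueError("type constraint violated: need a >= b")


@dataclass(frozen=True)
class Circle(Ellipse):
    """Immutable circle value; the type constraint is a == b."""

    def __post_init__(self):
        if self.a != self.b:
            raise ValueError("type constraint violated: need a == b")


def ellipse(a, b):
    """Selector: return a value of the most specific type that fits."""
    return Circle(a, b) if a == b else Ellipse(a, b)


class Var:
    """A variable with a declared type; every update is an assignment."""

    def __init__(self, declared_type, value):
        self.declared_type = declared_type
        self.assign(value)

    def assign(self, value):
        if not isinstance(value, self.declared_type):
            raise TypeError("value is not of the declared type")
        self.value = value

    def most_specific_type(self):
        return type(self.value).__name__


def p(E, new_a):
    """Program P: written for ellipses only; replaces semiaxis a of E."""
    E.assign(ellipse(new_a, E.value.b))


# E has declared type ELLIPSE but currently holds a circle.
E = Var(Ellipse, ellipse(2.0, 2.0))
before = E.most_specific_type()    # "Circle"
p(E, 5.0)                          # a > b after the update
after_g = E.most_specific_type()   # "Ellipse" (generalization by constraint)
p(E, 2.0)                          # a = b after the update
after_s = E.most_specific_type()   # "Circle" (specialization by constraint)
```

Under these assumptions P runs unchanged whether E holds a circle or "just an ellipse," and no value of type Circle with a != b is ever created.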

 

Our book FOUNDATION FOR FUTURE DATABASE SYSTEMS: THE THIRD MANIFESTO identifies several additional benefits of S by C and G by C.  Furthermore, we believe--pace much of the OO literature and conventional wisdom--that S by C and G by C are capable of efficient implementation; some thoughts on this issue can also be found in that same book. 

 

3. "[It] would seem that passing variables by reference is not your cup of tea--since it is essentially equivalent to pointers." 

 

We have no problem with passing variables by reference.  (Actually, I would prefer to say arguments, not variables, though I agree that those arguments must be variables specifically if they're subject to update.)  We see "pass by reference" merely as an implementation mechanism that can be used to achieve certain semantics that are prescribed by our model.  (As a matter of fact, the THIRD MANIFESTO book mentioned above actually states that "pass by reference" is the appropriate implementation to be used in such cases, though we now realize that alternative and possibly superior implementations are possible as well.) 

 

That said, I must now add that we certainly do object to "pass by reference" if it is a feature of the model; indeed, that's exactly one of the disagreements we have with the OO world.  We don't want application programmers (or end-users) to have to deal with pointers at all--not even if those pointers are only implicit (in some ways, in fact, it's worse if they're implicit).  In a word, we don't want "REF types" in our model. 

 

4. I'd like to add that we also believe that (a) "REF types" (or pointers, or object IDs, or whatever you want to call them) and (b) a good model of inheritance are logically incompatible.  To elaborate briefly: 

 

·   First, we claim that an inheritance model is "good" only if it does support S by C and G by C.

 

·   Second, the existence of REF types means that S by C and G by C can't be made to work.

 

This position too is explained in detail in the MANIFESTO book. 

 

I hope the foregoing helps to clarify our position.  I certainly hope too that you will read the MANIFESTO book as you say you will, and I welcome any thoughts you might have to offer on the ideas contained therein.

 

 

Posted 08/18/02

 

 

 
