I have just read--with great interest--your articles on
substitutability and, intrigued by them, the original "Is a Circle an
Ellipse?" And I have this funny notion that your attack is much
broader than you portray it.
First of all, let me praise you for the wonderful,
enlightening distinction between values and variables. Indeed, object-speak
gets confusing there sometimes, when one tries to deal with the fundamentals.
But it seems that you fail to push the distinction far enough. Let me also say
that my comments are based on your article, and I should probably read THE
THIRD MANIFESTO before saying anything about your inheritance model;
however, I think that even if I got things wrong, it may help you explain your
position better. Now, to the point:
Circles, of course, are Ellipses. And yet, Circles are not
Ellipses. How can I claim this contradiction with a straight face? Easy: each
of the two horns of the dilemma is true in a different context. Circle values
are Ellipse values. Circle variables are not Ellipse variables--because an Ellipse
variable can hold a value that is strictly an ellipse (i.e. not a circle), and
a Circle variable cannot. In the article, you explicitly say as much.
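
The value/variable distinction drawn above can be made concrete with a small sketch. It is written in Python purely for illustration; the names `Ellipse`, `Circle`, and `Variable` are hypothetical and do not come from the article, and the "declared type" of a variable is modeled explicitly because Python variables carry no declared type of their own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ellipse:
    """An immutable ellipse value with semiaxes a >= b > 0."""
    a: float
    b: float

    def __post_init__(self):
        if not (self.a >= self.b > 0):
            raise ValueError("an ellipse needs a >= b > 0")


@dataclass(frozen=True)
class Circle(Ellipse):
    """A circle value: an ellipse value whose semiaxes are equal."""

    def __post_init__(self):
        super().__post_init__()
        if self.a != self.b:
            raise ValueError("a circle needs a == b")


class Variable:
    """A variable with a declared type (hypothetical helper for this sketch).

    It may hold any value of the declared type or of one of its subtypes.
    """

    def __init__(self, declared_type, value):
        self.declared_type = declared_type
        self.assign(value)

    def assign(self, value):
        if not isinstance(value, self.declared_type):
            raise TypeError(
                f"cannot assign {type(value).__name__} value to a "
                f"variable of declared type {self.declared_type.__name__}"
            )
        self.value = value


# A circle value is an ellipse value, so an Ellipse variable can hold it.
e = Variable(Ellipse, Circle(2.0, 2.0))
e.assign(Ellipse(5.0, 3.0))      # it can also hold a "strict" ellipse

# A Circle variable cannot hold a value that is strictly an ellipse.
c = Variable(Circle, Circle(1.0, 1.0))
try:
    c.assign(Ellipse(5.0, 3.0))
    rejected = False
except TypeError:
    rejected = True
```

In this sketch every `Circle` value passes wherever an `Ellipse` value is expected, while the set of values a `Circle` variable accepts is strictly smaller than the set an `Ellipse` variable accepts.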
The thing that probably upsets so many people when reading
your article is, I think, that you propose a model of inheritance that is
incompatible with their prior expectations; where there are update operations
that are "not unconditionally inherited". This means that what you
call (in another place, I think) "inclusion polymorphism" doesn't
work for update operations in your inheritance model. The consequence is that
you get the benefits of inheritance--guaranteed behavior of subtype
entities--only inasmuch as these entities are values, and not variables; in
other words, only if your program is purely functional.
A main objective of the inheritance model used in C++ is
exactly the point you so easily discard: To make sure that if $a$ is a variable
of type $A$, $b$ is a variable of type $B$, and $A$ is a subtype of $B$, any
operation that applies to $b$ also applies to $a$. Without this guarantee, you
cannot write polymorphic updaters: You are forced to choose between the ability
to pass variables by reference, and the ability to rely on inheritance.
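
A rough sketch of the guarantee described above, assuming mutable objects passed by reference. Python is used here as a stand-in for C++, and the class names and the `stretch` updater are hypothetical. The sketch shows that an updater written against the supertype is accepted for a subtype object, and it records the state that object ends up in; whether that state is acceptable is the subject of the rest of this exchange.

```python
class MutableEllipse:
    """A mutable ellipse object (hypothetical, for illustration only)."""

    def __init__(self, a: float, b: float):
        self.a = a
        self.b = b


class MutableCircle(MutableEllipse):
    """A mutable circle object, modeled as a subclass of MutableEllipse."""

    def __init__(self, r: float):
        super().__init__(r, r)


def stretch(shape: MutableEllipse, factor: float) -> None:
    """Polymorphic updater: scales the 'a' semiaxis of the object in place."""
    shape.a *= factor


e = MutableEllipse(5.0, 3.0)
c = MutableCircle(2.0)

stretch(e, 2.0)   # works on an ellipse object
stretch(c, 2.0)   # also accepted: the updater is inherited by the subtype

# The object is still an instance of MutableCircle after the update,
# but its semiaxes are no longer equal.
axes_equal = (c.a == c.b)
```

Nothing in this sketch enforces a constraint on `MutableCircle`; that is an assumption of the example, not a claim about how any particular C++ class hierarchy must be written.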
Now, when I combine this with what you've written in other
places on object ids and pointers, it would seem that passing variables by
reference is also not your cup of tea--since it is essentially equivalent to
pointers. But without it, the only update operations can be assignments; the
only side effects can be assignments to global variables. The attack, then,
seems to be not only on standard OOP, but also on almost all procedural programming.
Most people like writing programs with side effects, and the
OO crowd even more so. And most people like their type systems hierarchic,
rather than flat. Your inheritance model--at least as far as it is sketched in
the article--does not offer a way to realize these two preferences
simultaneously. Stroustrup's model does. The price he pays for this is in
mathematical rigor--you may have it (as I've said, I still need to read the MANIFESTO);
Stroustrup doesn't. But the feature your model misses is not negligible.
Chris Date Responds:
It was a pleasure to receive your message. It's very heartening to find that
some people do read things carefully and do bring an analytical focus to bear
on what they read. I'm particularly pleased to hear that you find the value vs.
variable distinction "wonderful" and "enlightening"! Hugh Darwen and I find
that distinction to be amazingly useful in our continuing--and never
easy--struggle (a) to think clearly and (b) to express the results of our
thought processes clearly (when we think those results might be worth the
effort, which isn't always).
I want to make four points in response to your observations. The first is just
a matter of clarification; the next two are elaborations on issues where we
might be in some disagreement (I'd like to persuade you, if I can, that any
such disagreement might be more apparent than real); the last has to do with
something that I think is very important but wasn't touched on in my original
article.
1. "[An] Ellipse variable can hold a value that is strictly an ellipse ... and
a Circle variable cannot. In the article, you explicitly say as much."
Quite right. I just want to add that I believe this position to be utterly
noncontroversial. Even in C++, for example, I don't think anybody wants to be
able to assign a value that is "strictly an ellipse" to a variable that is
declared to be of type Circle. Thus, I don't think anybody wants the operation
"assign an ellipse value" to be inherited unconditionally by variables of
declared type Circle. (Just as--to take a more mundane example--I don't think
anybody wants to be able to assign a value that is "strictly an integer" (and
hence possibly an even integer) to a variable that is declared to be of type
OddInteger.) Since assignment is the only update operator that's logically
necessary, then, what can "unconditional inheritance of update operators"
possibly mean?
2. "[You] get the benefits of inheritance--guaranteed behavior of subtype
entities--only ... if your program is purely functional ... [You] cannot write
polymorphic updaters."
If I understand your remarks here correctly, then I don't agree with you. A
major objective of inheritance as I understand it is code reuse. (I also think
code reuse is what you're referring to when you talk about what you call "a
main objective of the C++ model." I don't fully agree with the way you
characterize that objective, but it's not worth pursuing that point here.) So
consider a program P that's written to operate on ellipses, and suppose for the
sake of the discussion that type ELLIPSE is a leaf type (type CIRCLE hasn't
been defined yet). Presumably P will contain some variable, E say, of declared
type ELLIPSE; for generality, what's more, let's assume that P performs various
updates on E. Obviously, everything works fine so long as ELLIPSE remains a
leaf type. But what happens when CIRCLE is defined as a proper subtype of
ELLIPSE?
Can P be "reused" without change on ellipses that are in fact circles?
In our model, the answer to this question is yes. The key to understanding
this answer is tucked away in a short bulleted paragraph near the end of my
original article. (The paragraph in question was admittedly almost an aside,
and I'm not surprised if you didn't pick up on it or realize its significance.)
This is that paragraph:
“Let E be a variable of declared type ELLIPSE. Then updating E in such a way
that a = b after the update means the most specific type of the current value
of E is now CIRCLE. Likewise, updating E in such a way that a > b after the
update means the most specific type of the current value of E is now "just
ELLIPSE."”
We refer to the effects mentioned in the second and third sentences here as
specialization by constraint and generalization by constraint, respectively
(and abbreviate them as S by C and G by C). Thus, if we invoke program P and
pass it a circle instead of "just an ellipse," everything works just
fine!--thanks to S by C and G by C. To be more specific, all operations that
worked on variable E before still work, including update operations, even if E
contains a value of most specific type CIRCLE at run time.
What's more, we achieve this desirable result (a) without producing
noncircular circles or similar “nonsenses”, and (b) without sacrificing the
crucial ability to define what we call type constraints. (Just as an aside, I
note that the SQL standard "SQL:1999" does produce noncircular circles and does
sacrifice the ability to define type constraints.)
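
The behavior described above can be approximated in a short sketch. This is an illustration in Python under stated assumptions, not the definition given in the MANIFESTO book: the selector function `ellipse`, the class `EllipseVar`, and the stand-in `program_p` are all hypothetical names, and a real implementation would presumably place this logic in the type system rather than in a factory function. The sketch treats every update as an assignment of a newly selected value, so the most specific type of the variable's current value follows from the type constraint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ellipse:
    """An immutable ellipse value with semiaxes a and b."""
    a: float
    b: float


@dataclass(frozen=True)
class Circle(Ellipse):
    """An ellipse value whose semiaxes are equal."""


def ellipse(a: float, b: float) -> Ellipse:
    """Selector: returns a value of the most specific type the axes allow."""
    if not (a >= b > 0):
        raise ValueError("type constraint violated: need a >= b > 0")
    return Circle(a, b) if a == b else Ellipse(a, b)


class EllipseVar:
    """A variable of declared type ELLIPSE (hypothetical helper)."""

    def __init__(self, value: Ellipse):
        self.value = value

    def set_a(self, a: float) -> None:
        # The update is an assignment of a freshly selected value, so the
        # type constraint is re-applied on every update.
        self.value = ellipse(a, self.value.b)

    def most_specific_type(self) -> str:
        return type(self.value).__name__


def program_p(var: EllipseVar, factor: float) -> None:
    """Stand-in for program P: written against ELLIPSE only."""
    var.set_a(var.value.a * factor)


E = EllipseVar(ellipse(2.0, 2.0))       # P is handed a circle
observed = [E.most_specific_type()]

program_p(E, 2.0)                       # a > b afterwards: G by C
observed.append(E.most_specific_type())

E.set_a(2.0)                            # a = b afterwards: S by C
observed.append(E.most_specific_type())
```

Here `program_p` runs unchanged whether the variable currently holds a circle or "just an ellipse", and at no point does a value of type `Circle` exist with unequal semiaxes.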
Our book FOUNDATION FOR
FUTURE DATABASE SYSTEMS: THE THIRD MANIFESTO identifies several additional
benefits of S by C and G by C.
Furthermore, we believe--pace much of the OO literature and
conventional wisdom--that S by C and G by C are capable of efficient
implementation; some thoughts on this issue can also be found in that same
book.
3. "[It] would seem that passing variables by reference is not your cup of
tea--since it is essentially equivalent to pointers."
We have no problem with passing variables by reference. (Actually, I would
prefer to say arguments, not variables, though I agree that those arguments
must be variables specifically if they're subject to update.) We see "pass by
reference" merely as an implementation mechanism that can be used to achieve
certain semantics that are prescribed by our model. (As a matter of fact, the
THIRD MANIFESTO book mentioned above actually states that "pass by reference"
is the appropriate implementation to be used in such cases, though we now
realize that alternative and possibly superior implementations are possible as
well.)
That said, I must now add that we certainly do object to "pass by reference"
if it is a feature of the model; indeed, that's exactly one of the
disagreements we have with the OO world. We don't want application programmers
(or end-users) to have to deal with pointers at all--not even if those pointers
are only implicit (in some ways, in fact, it's worse if they're implicit). In a
word, we don't want "REF types" in our model.
4. I'd like to add that we also believe that (a) "REF types" (or pointers, or
object IDs, or whatever you want to call them) and (b) a good model of
inheritance are logically incompatible. To elaborate briefly:
· First, we claim that an inheritance model is "good" only if it does support
S by C and G by C.
· Second, the existence of REF types means that S by C and G by C can't be
made to work.
This position too is explained in detail in the MANIFESTO book.
I hope the foregoing helps to clarify our position. I certainly hope too that
you will read the MANIFESTO book as you say you will, and I welcome any
thoughts you might have to offer on the ideas contained therein.
Posted 08/18/02