(Continued from Part 5)
Extension and Constraint Subtypes
Toward the end of their paper, Liskov and Wing have this to
say:
“The requirement we impose on subtypes is very strong and raises
a concern that it might rule out many useful subtype relations. To address this concern we looked
at a
number of examples ... The examples led us to classify subtype relationships
into two broad categories. In the first
category, the subtype extends the supertype by providing additional methods and
possibly additional "state."
In the second, the subtype is more constrained than the supertype.“
Actually, the requirement that Liskov and Wing impose on
subtypes isn't as strong as ours! We
require every value of the subtype to be a value of the supertype (in other
words, we require proper subtypes to be proper subsets), as explained in Part 3of this series and noted
in Part 5. Nevertheless, we
believe we can handle,
cleanly, all of the various problems that--it has been claimed, at one time or
another, by one writer or another--a general model of "subtyping and
inheritance" ought to be able to handle (though it's true that we
don't always handle the problem in question by means of subtyping and
inheritance as such).
Be that as it may, let's take a look at the two categories of
subtype relationships. The first is
called "extension subtypes."
This is what Liskov and Wing have to say:
“A subtype extends its supertype if its objects have extra
methods in addition to those of the supertype.
[They] might also have more "state" ... One common example
concerns persons, employees, and students ... A person object has methods that
report its properties such as its name, age, [etc.]. Student and employee are subtypes of person;
in each case they
have additional properties, e.g., a student id number, an employee number and
salary ... In this example, the subtype objects have more state than those of
the supertype as well as more methods.”
As you know by now, in our own model, by contrast, (a) we require
subtypes to have extra methods, but (b) we reject the notion of
"state" entirely, so subtypes cannot possibly have more of
it. So what about the example of, e.g.,
persons and employees? (Let's ignore
students, for simplicity.) Well, one
way to approach the problem is to define an EMPLOYEE as a tuple of two
components, one of which is of type PERSON and the other is of type OTHER_STUFF
(employee number, for example). Then
EMPLOYEE isn't a subtype of PERSON, but any program that operates on
PERSONs will work for the PERSON component of an EMPLOYEE, and any program that
operates on EMPLOYEEs can reuse the programs that work for PERSONs. In this way, code reuse can
still be
obtained, without any need to muddy up a clear, clean, and logically sensible
model of subtyping and inheritance.
I turn now to the second category of subtypes,
"constrained subtypes." To
quote Liskov and Wing again:
“The second type of subtype relation occurs when the subtype is
more constrained than the supertype. In
this case, the supertype specification is written in a way that allows
variation in behavior among its subtypes.
Subtypes constrain the supertype by reducing the variability ... The
subtype may extend those supertype objects that it simulates by providing
additional methods and/or state. “
The last sentence here is unfortunate, it seems to me. Again, we would
require extra methods
and prohibit "state" entirely--but if (for the sake of this
discussion only) we accept the possibility of extra "state," it would
appear that the two subtype categories are thus not independent of one
another. Wouldn't such independence be
desirable? (In any case, it seems to me that if "the subtype is more
constrained than the supertype," it can't possibly have "additional
state." By way of illustration,
see Robert Martin's discussion of the rectangles-and-squares example, quoted in
Part 2
of this series). Also, I find the use of the term "simulates" a
little puzzling; are the authors suggesting that if a certain stack
"simulates" a certain bag, both the stack and the bag are
physically materialized in storage?
Anyway, so far as our own model is concerned, the two subtype
categories collapse into one. Indeed,
we think subtypes should always be "more constrained" than
their supertypes. In fact, we argue in THE THIRD MANIFESTO
that logically constraining supertypes is the only logically correct way
to define subtypes! And Liskov and Wing
seem to like the approach too:
“[We] prefer the constraint approach ... The constraint approach
is appealing because it is simple and direct ... [Including] constraints in
specifications makes for a more robust methodology ... Being able to state
everything declaratively seems like a particularly important advantage of the
constraint approach. “
Note: Actually,
the foregoing remarks apply, not to constrained subtypes as such, but rather to
what Liskov and Wing call the constraint approach to defining subtypes. However, the two
concepts are very much
interrelated, and I don't think I'm misrepresenting--at least, I hope
I'm not misrepresenting--the authors' position here. If I'm wrong, I hope they'll accept my
apologies.
Given the foregoing, it's interesting to note that James
Rumbaugh, in his article A Matter of Intent: How to Define Subclasses (Journal
of Object-Oriented Programming, September 1996)--which I quoted in Part 2of
this series, as you might recall--says almost exactly the opposite: “[Most] object-oriented
languages do not
want objects to change class ... [This] suggests [a] design principle for
classification systems: A subclass
should not be defined by constraining a superclass.”
Hmmm...
Substitutability Defined
I'd like to end this six-part series with a fairly precise
definition of the concept of substitutability--actually two such definitions,
because (as noted in Part 1we
find it necessary to distinguish between value substitutability and variable
substitutability. That distinction in
turn rests on the distinction between read-only operators and update operators--or
observers and mutators, to use the object-oriented terms, but now I definitely
do want to talk in terms of operators, not methods. Also, I'll use the term polymorphism to
mean inclusion
polymorphism specifically (as explained in Part 1 of
this series, inclusion polymorphism is the kind of polymorphism we get with
subtyping and inheritance).
Value Substitutability: Basically, if S is a subtype of T, then every
read-only operator
that applies to values of type T also applies to values of type S. More precisely:
Let Op be a read-only operator, let P be a
parameter to Op, and let T be the declared type for P. Then the declared type
of the argument
expression (and hence a fortiori the most specific type of the argument
value) corresponding to parameter P in an invocation of Op can be
any subtype S of T.
In other words, the read-only operator Op applies to values of
type T and therefore, necessarily, to values of type S--The
Principle of (Read-only) Operator Inheritance. It follows that such operators are
polymorphic, since they
apply to values of several different types--The Principle of (Read-only)
Operator Polymorphism. And it
further follows that wherever a value of type T is permitted, a value of
any type S is also permitted--The Principle of (Value)
Substitutability.
Variable Substitutability: Basically, if S is a subtype of T, then an
update operator that
applies to variables of declared type T applies to variables of declared type S
only in those cases where it makes sense.
More precisely:
Let Op be an update operator, let P be a
parameter to Op that is subject to update, and let T be the
declared type for P. Then it
might or might not be the case that the declared type (and hence a fortiori
the current most specific type) of the argument variable corresponding to
parameter P in some invocation of Op can be a proper subtype of
type T. It follows that for each
such update operator Op and for each parameter P to Op
that is subject to update, it is necessary to state explicitly for which
subtypes of the declared type T of parameter P operator Op
shall be inherited--The Principle of (Update) Operator Inheritance. (And if update operator
Op is not
inherited in this way by type S, it is not inherited by any subtype of
type S either.) Update operators
are thus only conditionally polymorphic--The Principle of (Update) Operator
Polymorphism. If Op is an
update operator and P is a parameter to Op that is subject to update
and S is a subtype of the declared type T of P for which Op
is inherited, then by definition it is possible to invoke Op with an
argument variable corresponding to parameter P that is of declared type S--The
Principle of (Variable) Substitutability.
I'll illustrate these ideas by repeating some remarks I made
in my earlier article Is a Circle an
Ellipse?. Let types T and S be ELLIPSE and CIRCLE, respectively (of
course, I do assume here that CIRCLE is a proper subtype of ELLIPSE). Then:
Ø All
read-only operations that apply to ellipse values--"get the area,"
for example--apply to circle values too (because circle values are ellipse
values); that is, read-only operations associated with type ELLIPSE are
inherited unconditionally by type CIRCLE.
However, there are some read-only operations associated with type
CIRCLE--"get the radius," for example--that don't apply to type
ELLIPSE. In other words, the set of
read-only operations that apply to circle values is a proper superset of the
set of read-only operations that apply to ellipse values.
Ø Some
update operations that apply to ellipse variables--"change the
center," for example--apply to circle variables too (By the terms
"ellipse variable" and "circle variable," I mean a variable
of declared type ELLIPSE and one of declared type CIRCLE, respectively. Note too that, in our
model, a variable of
(for example) declared type ELLIPSE contains ELLIPSE values, not "object
IDs" that somehow point to ellipses.
Our model has no notion of object IDs, or indeed objects per se.) Other
such operations--"change the a semiaxis," for example--don't apply to
circle variables too (because circle variables aren't ellipse variables; that
is, a variable of declared type CIRCLE can't hold a value that's "just an
ellipse," and changing the a semiaxis of a circle yields a result that's
"just an ellipse," in general).
In other words, update operations associated with type ELLIPSE are not
inherited unconditionally by type CIRCLE.
Moreover, there are some update operations associated with type
CIRCLE--"change the radius," for example--that don't apply to type
ELLIPSE. Thus, the set of update
operations that apply to circle variables is neither a subset nor a superset of
the set of such operations that apply to ellipse variables.
Ø Let
E be a variable of declared type ELLIPSE.
Then updating E in such a way that a = b after the update means the most
specific type of the current value of E is now CIRCLE. Likewise, updating E in such a way that a
> b after the update means the most specific type of the current value of E
is now "just ELLIPSE."
For further explanation and discussion of such matters, with
many examples, please refer to THE THIRD MANIFESTO.
Posted
09/06/02
[ABOUT]
[QUOTES]
[LINKS]