WHAT DOES SUBSTITUTABILIT YREALLY MEAN? PART 6
by C. J. Date

 

 

 

(Continued from Part 5)

 

 

Extension and Constraint Subtypes

 

Toward the end of their paper, Liskov and Wing have this to say: 

 

“The requirement we impose on subtypes is very strong and raises a concern that it might rule out many useful subtype relations.  To address this concern we looked at a number of examples ... The examples led us to classify subtype relationships into two broad categories.  In the first category, the subtype extends the supertype by providing additional methods and possibly additional "state."  In the second, the subtype is more constrained than the supertype.“

 

Actually, the requirement that Liskov and Wing impose on subtypes isn't as strong as ours!  We require every value of the subtype to be a value of the supertype (in other words, we require proper subtypes to be proper subsets), as explained in Part 3of this series and noted in Part 5.  Nevertheless, we believe we can handle, cleanly, all of the various problems that--it has been claimed, at one time or another, by one writer or another--a general model of "subtyping and inheritance" ought to be able to handle (though it's true that we don't always handle the problem in question by means of subtyping and inheritance as such). 

 

Be that as it may, let's take a look at the two categories of subtype relationships.  The first is called "extension subtypes."  This is what Liskov and Wing have to say: 

 

“A subtype extends its supertype if its objects have extra methods in addition to those of the supertype.  [They] might also have more "state" ... One common example concerns persons, employees, and students ... A person object has methods that report its properties such as its name, age, [etc.].  Student and employee are subtypes of person; in each case they have additional properties, e.g., a student id number, an employee number and salary ... In this example, the subtype objects have more state than those of the supertype as well as more methods.” 

 

As you know by now, in our own model, by contrast, (a) we require subtypes to have extra methods, but (b) we reject the notion of "state" entirely, so subtypes cannot possibly have more of it.  So what about the example of, e.g., persons and employees?  (Let's ignore students, for simplicity.)  Well, one way to approach the problem is to define an EMPLOYEE as a tuple of two components, one of which is of type PERSON and the other is of type OTHER_STUFF (employee number, for example).  Then EMPLOYEE isn't a subtype of PERSON, but any program that operates on PERSONs will work for the PERSON component of an EMPLOYEE, and any program that operates on EMPLOYEEs can reuse the programs that work for PERSONs.  In this way, code reuse can still be obtained, without any need to muddy up a clear, clean, and logically sensible model of subtyping and inheritance. 

 

I turn now to the second category of subtypes, "constrained subtypes."  To quote Liskov and Wing again: 

 

“The second type of subtype relation occurs when the subtype is more constrained than the supertype.  In this case, the supertype specification is written in a way that allows variation in behavior among its subtypes.  Subtypes constrain the supertype by reducing the variability ... The subtype may extend those supertype objects that it simulates by providing additional methods and/or state. “

 

The last sentence here is unfortunate, it seems to me.  Again, we would require extra methods and prohibit "state" entirely--but if (for the sake of this discussion only) we accept the possibility of extra "state," it would appear that the two subtype categories are thus not independent of one another.  Wouldn't such independence be desirable? (In any case, it seems to me that if "the subtype is more constrained than the supertype," it can't possibly have "additional state."  By way of illustration, see Robert Martin's discussion of the rectangles-and-squares example, quoted in Part 2 of this series). Also, I find the use of the term "simulates" a little puzzling; are the authors suggesting that if a certain stack "simulates" a certain bag, both the stack and the bag are physically materialized in storage?

 

Anyway, so far as our own model is concerned, the two subtype categories collapse into one.  Indeed, we think subtypes should always be "more constrained" than their supertypes.  In fact, we argue in THE THIRD MANIFESTO that logically constraining supertypes is the only logically correct way to define subtypes!  And Liskov and Wing seem to like the approach too: 

 

“[We] prefer the constraint approach ... The constraint approach is appealing because it is simple and direct ... [Including] constraints in specifications makes for a more robust methodology ... Being able to state everything declaratively seems like a particularly important advantage of the constraint approach. “

 

Note:  Actually, the foregoing remarks apply, not to constrained subtypes as such, but rather to what Liskov and Wing call the constraint approach to defining subtypes.  However, the two concepts are very much interrelated, and I don't think I'm misrepresenting--at least, I hope I'm not misrepresenting--the authors' position here.  If I'm wrong, I hope they'll accept my apologies. 

 

Given the foregoing, it's interesting to note that James Rumbaugh, in his article A Matter of Intent: How to Define Subclasses (Journal of Object-Oriented Programming, September 1996)--which I quoted in Part 2of this series, as you might recall--says almost exactly the opposite:  “[Most] object-oriented languages do not want objects to change class ... [This] suggests [a] design principle for classification systems:  A subclass should not be defined by constraining a superclass.” 

 

Hmmm...

 

 

Substitutability Defined

 

I'd like to end this six-part series with a fairly precise definition of the concept of substitutability--actually two such definitions, because (as noted in Part 1we find it necessary to distinguish between value substitutability and variable substitutability.  That distinction in turn rests on the distinction between read-only operators and update operators--or observers and mutators, to use the object-oriented terms, but now I definitely do want to talk in terms of operators, not methods.  Also, I'll use the term polymorphism to mean inclusion polymorphism specifically (as explained in Part 1 of this series, inclusion polymorphism is the kind of polymorphism we get with subtyping and inheritance). 

 

Value Substitutability:  Basically, if S is a subtype of T, then every read-only operator that applies to values of type T also applies to values of type S.  More precisely: 

 

Let Op be a read-only operator, let P be a parameter to Op, and let T be the declared type for P.  Then the declared type of the argument expression (and hence a fortiori the most specific type of the argument value) corresponding to parameter P in an invocation of Op can be any subtype S of T.  In other words, the read-only operator Op applies to values of type T and therefore, necessarily, to values of type S--The Principle of (Read-only) Operator Inheritance.  It follows that such operators are polymorphic, since they apply to values of several different types--The Principle of (Read-only) Operator Polymorphism.  And it further follows that wherever a value of type T is permitted, a value of any type S is also permitted--The Principle of (Value) Substitutability

 

Variable Substitutability:  Basically, if S is a subtype of T, then an update operator that applies to variables of declared type T applies to variables of declared type S only in those cases where it makes sense.  More precisely: 

 

Let Op be an update operator, let P be a parameter to Op that is subject to update, and let T be the declared type for P.  Then it might or might not be the case that the declared type (and hence a fortiori the current most specific type) of the argument variable corresponding to parameter P in some invocation of Op can be a proper subtype of type T.  It follows that for each such update operator Op and for each parameter P to Op that is subject to update, it is necessary to state explicitly for which subtypes of the declared type T of parameter P operator Op shall be inherited--The Principle of (Update) Operator Inheritance.  (And if update operator Op is not inherited in this way by type S, it is not inherited by any subtype of type S either.)  Update operators are thus only conditionally polymorphic--The Principle of (Update) Operator Polymorphism.  If Op is an update operator and P is a parameter to Op that is subject to update and S is a subtype of the declared type T of P for which Op is inherited, then by definition it is possible to invoke Op with an argument variable corresponding to parameter P that is of declared type S--The Principle of (Variable) Substitutability

 

I'll illustrate these ideas by repeating some remarks I made in my earlier article Is a Circle an Ellipse?. Let types T and S be ELLIPSE and CIRCLE, respectively (of course, I do assume here that CIRCLE is a proper subtype of ELLIPSE).  Then: 

 

Ø       All read-only operations that apply to ellipse values--"get the area," for example--apply to circle values too (because circle values are ellipse values); that is, read-only operations associated with type ELLIPSE are inherited unconditionally by type CIRCLE.  However, there are some read-only operations associated with type CIRCLE--"get the radius," for example--that don't apply to type ELLIPSE.  In other words, the set of read-only operations that apply to circle values is a proper superset of the set of read-only operations that apply to ellipse values.

 

Ø       Some update operations that apply to ellipse variables--"change the center," for example--apply to circle variables too (By the terms "ellipse variable" and "circle variable," I mean a variable of declared type ELLIPSE and one of declared type CIRCLE, respectively.  Note too that, in our model, a variable of (for example) declared type ELLIPSE contains ELLIPSE values, not "object IDs" that somehow point to ellipses.  Our model has no notion of object IDs, or indeed objects per se.) Other such operations--"change the a semiaxis," for example--don't apply to circle variables too (because circle variables aren't ellipse variables; that is, a variable of declared type CIRCLE can't hold a value that's "just an ellipse," and changing the a semiaxis of a circle yields a result that's "just an ellipse," in general).  In other words, update operations associated with type ELLIPSE are not inherited unconditionally by type CIRCLE.  Moreover, there are some update operations associated with type CIRCLE--"change the radius," for example--that don't apply to type ELLIPSE.  Thus, the set of update operations that apply to circle variables is neither a subset nor a superset of the set of such operations that apply to ellipse variables.

 

Ø       Let E be a variable of declared type ELLIPSE.  Then updating E in such a way that a = b after the update means the most specific type of the current value of E is now CIRCLE.  Likewise, updating E in such a way that a > b after the update means the most specific type of the current value of E is now "just ELLIPSE." 

 

For further explanation and discussion of such matters, with many examples, please refer to THE THIRD MANIFESTO.

 

 

Posted 09/06/02

 

 

 

[ABOUT] [QUOTES] [LINKS]