From: JPD
To: Editor
Regarding Date's article titled Encapsulation Is a Red Herring:
I think I might disagree with some of the points in the
article - or maybe I don't.
To me, there are occasions where a complicated concept should
be exposed to a user only as a scalar (even if, as you say, the representation
is a list of arrays of stacks of trees or something similar 'under the covers')
and there are cases where that same concept should be accessible to the user not
as a scalar, i.e. the user should have full access to the lists, the arrays,
the stacks, the trees, and anything else in the 'structure'. I would say this
second case is precisely the case when the lists, stacks, trees, etc. are
intended to be dealt with on a logical basis, as opposed to merely being the
physical representation of some data with independent logical meaning -
although it certainly may be the case that the physical representation is
identical to the logical concept being expressed. I would say that in this
second case, where full access is a legitimate possibility on logical grounds,
there is a lot of value in allowing the user to choose whether or not to treat
any of the lists, arrays, stacks, trees, or the entire pile of data, as a
scalar, on a subjective basis - that is, treat the entire business as a scalar
for one operation, and treat it as an open collection of sub-scalars for the
next.
I could not decide if your position was against the above,
but your statement that dates and times are scalars (in an objective sense)
made it sound like you might be against the above. I contend that it is a
subjective design decision whether:
1. a date should be a scalar and nothing but a scalar at all
times and that the year, month, and day fall strictly under the 'internal
representation' umbrella, not to be confused with the logical concept of a
date, or
2. a date should logically be a 3-tuple of year, month, and day
attributes which the user can at any time, at his discretion, choose to treat
together as a scalar value.
I contend that there is no objective answer to the foregoing,
that the relational model makes no claim one way or the other, and that a truly
relational language must be expressive enough to allow a database designer to
implement either option 1 or 2 above as need dictates, and in the case of
option 2, must be expressive enough to allow the user to treat composite data
as scalar data as he pleases, wherever allowed by the designer.
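The two options can be sketched in Python. This is only an illustration under stated assumptions: the names `ScalarDate`, `DateTuple`, and `as_scalar` are hypothetical, the day-count representation is one arbitrary choice among many, and Python's `NamedTuple` stands in loosely for a tuple with named attributes.

```python
import datetime
from functools import total_ordering
from typing import NamedTuple


@total_ordering
class ScalarDate:
    """Option 1: a date is a single value; year, month, and day are hidden."""

    def __init__(self, year: int, month: int, day: int) -> None:
        # Hypothetical internal representation: a Gregorian day count.
        self._ordinal = datetime.date(year, month, day).toordinal()

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, ScalarDate):
            return NotImplemented
        return self._ordinal == other._ordinal

    def __lt__(self, other: object) -> bool:
        if not isinstance(other, ScalarDate):
            return NotImplemented
        return self._ordinal < other._ordinal

    def __hash__(self) -> int:
        return hash(self._ordinal)


class DateTuple(NamedTuple):
    """Option 2: a date is logically a 3-tuple with user-visible attributes."""
    year: int
    month: int
    day: int

    def as_scalar(self) -> ScalarDate:
        # The user chooses, per operation, to treat the whole as one scalar.
        return ScalarDate(self.year, self.month, self.day)


opaque = ScalarDate(2004, 9, 3)     # components not directly accessible
open_date = DateTuple(2004, 9, 3)   # components accessible as attributes
same_value = open_date.as_scalar() == opaque
```

Under option 2 the same data is read component by component in one operation (`open_date.month`) and compared as a whole in the next (`open_date.as_scalar()`), which is the subjective choice described above.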
You make the point that 'encapsulation conflicts somewhat
with the need to be able to perform ad hoc queries', but you also conclude that
'our term scalar means exactly the same thing as encapsulated'. Therefore
scalar conflicts with ad hoc queries. True enough for scalar types defined
according to option 1, and your recourse in such a case has been to require
creation of operators to expose relevant aspects of the representation; less
true when the scalar nature of some quantity of data is left to the user to
decide as in option 2. I might comment that if some component of a
representation is important enough that it needs to be queried on, then maybe
that component can't be so cleanly distinguished from the logical type.
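A rough Python sketch of the same ad hoc query under each option may make the contrast concrete. Everything here is an assumption for illustration: `month_of` is a hypothetical designer-supplied operator, `datetime.date` is merely pretending to be an opaque scalar, and `DateTuple` is a hypothetical type with visible components.

```python
import datetime
from typing import NamedTuple


class DateTuple(NamedTuple):
    """Option 2: components are part of the logical type."""
    year: int
    month: int
    day: int


def month_of(d: datetime.date) -> int:
    """Option 1: an operator written to expose one aspect of a scalar."""
    return d.month


scalar_dates = [datetime.date(2004, 9, 3), datetime.date(2004, 3, 9)]
tuple_dates = [DateTuple(2004, 9, 3), DateTuple(2004, 3, 9)]

# Ad hoc query "dates in September" under each option.
september_scalars = [d for d in scalar_dates if month_of(d) == 9]
september_tuples = [d for d in tuple_dates if d.month == 9]
```

With option 1 the query is possible only because someone anticipated the need and wrote `month_of`; with option 2 no such operator has to exist in advance.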
C. J. Date Responds: I'm very sympathetic to most of the points in this
message. Rather than indulge in a blow-by-blow response, however, let me just
say that I believe answers to most of those points can be found in my book
AN INTRODUCTION TO DATABASE SYSTEMS, 8th Ed. As for the term scalar: I've come
to the conclusion that this term has no absolute meaning. See my PRACTICAL
DATABASE FOUNDATIONS paper #1, What First Normal Form Really Means.
PS: I do think it's a mistake to treat a date as a tuple, though, because ">"
doesn't apply to tuples; in other words, you couldn't handle chronological
ordering.
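The ordering concern can be illustrated in Python, on the assumption that a relational tuple is modeled as a mapping from attribute names to values, with no left-to-right order among attributes. Python's own positional `tuple` does compare lexicographically, so a `dict` is used here as the closer stand-in; the helper `supports_greater_than` is hypothetical.

```python
import datetime

# Assumption: a relational tuple modeled as an unordered attribute mapping.
earlier = {"year": 2004, "month": 3, "day": 9}
later = {"year": 2004, "month": 9, "day": 3}


def supports_greater_than(a, b) -> bool:
    """Return True if the expression `a > b` is defined for these operands."""
    try:
        a > b
    except TypeError:
        return False
    return True


# ">" is undefined for the mapping form of a date ...
tuple_ok = supports_greater_than(later, earlier)

# ... but is defined once each date is treated as a single scalar value.
scalar_earlier = datetime.date(**earlier)
scalar_later = datetime.date(**later)
scalar_ok = supports_greater_than(scalar_later, scalar_earlier)

# Chronological ordering therefore works on the scalar form.
chronological = sorted([scalar_later, scalar_earlier])
```

Nothing prevents a designer from defining a comparison over the three attributes by hand; the sketch only shows that the ordering does not come for free with the tuple form.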
Posted 09/03/04