From: AS
To: Editor
Chris Date has a challenge in "Double Trouble,
Double Trouble Part 2":
"[...] tell me exactly--exactly, please!--how you
propose to count those 100 'duplicate' pennies. I do think that anyone who
advocates the position that duplicates are a good idea needs to provide a good,
convincing answer to this question. It's nearly ten years since I first issued
this challenge, and nobody has yet come up with a cogent response to it".
1. Weigh a penny.
2. Weigh the bag.
3. Weigh the bag of pennies.
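As a rough illustration of the three weighings above, the arithmetic might look like the following Python sketch. The function name `count_by_weight` and all the weights are hypothetical, chosen only for illustration. The sketch assumes every penny weighs the same and that the reference sample really is a single penny.

```python
def count_by_weight(unit_weight, bag_weight, full_bag_weight):
    """Estimate a penny count from the three weighings.

    unit_weight     -- step 1: weight of the sample assumed to be ONE penny
    bag_weight      -- step 2: weight of the empty bag
    full_bag_weight -- step 3: weight of the bag with the pennies in it
    """
    if unit_weight <= 0:
        raise ValueError("unit weight must be positive")
    # Net weight of the pennies alone, without the bag.
    net_weight = full_bag_weight - bag_weight
    # Dividing by the unit weight yields the count; round() absorbs
    # small measurement noise.
    return round(net_weight / unit_weight)


# Hypothetical weights in grams: 2.5 per penny, 10 for the bag, 260 in total.
print(count_by_weight(2.5, 10.0, 260.0))  # prints 100
```

Note that the result is only as good as the claim that `unit_weight` belongs to exactly one penny.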
I haven't seen this challenge as originally issued, but
obviously it must have actually referred to "duplicates" of
nonphysical things.
Chris Date Responds: First, it was good of AS
to give me an out in his final sentence--that my challenge must have referred
to "duplicates of nonphysical things", or in other words, to things
for which the weighing trick does not work. In fact, however, I don't believe I
need to appeal to this particular out in order to defend my position. The fact
is, I don't find the weighing algorithm convincing at all. Consider the first
step, "weigh a penny." What does AS mean by "a penny"?
Suppose I give him two pennies but assert I am giving him just one. How does he
know I am wrong? The answer has to be: by counting! Thus, I submit that he has
to be able to count pennies in order to be able to execute the first step of his
algorithm.
Editor Comment: In Chris's article referred to above, I
provided a formulation of the counting problem that conveys it better:
"Try to count a pile of pennies by throwing each back into the pile after
you count it; this is the equivalent of what duplicate proponents suggest,
without realizing it."
In Chapter 4 of PRACTICAL ISSUES IN
DATABASE MANAGEMENT I provide a more detailed version of Chris's
response. What is the distinguishing attribute of otherwise identical
entities, such as, say, cake mix boxes? In the real world, we distinguish
between such entities visually, by their distinct locations in physical
space. The lack of such distinction means there is only one entity!
Entities are countable only if they are distinguishable!
Since in the real world all entities are so
distinguishable, duplicates in the database represent "indistinguishable
multiple entities" and are, therefore, an inaccurate representation of
reality.
In a correct representation, propositions about individual
boxes would, therefore, have to include a box identifier, say, a box number,
the representative in the database of the visual "this vs. that"
distinction in the real world. Such identifiers are represented in the database
by surrogate keys.
But note carefully that AS's method implies no interest in
the individual pennies, only in their count. And as I argue in the
chapter mentioned above, if individual boxes are of no interest, there should not be
rows representing them in the database. One database row for the entity type
box, with the count made explicit in a column, is the proper representation.
Thus, whether there is interest in individual entities, or only in their count,
there is no justification for duplicates in either case.
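The two cases can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not a design taken from the book: the table names `box` and `box_stock`, the columns `box_no`, `product` and `quantity`, and the sample data are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Case 1: individual boxes are of interest, so each row carries an
# identifier (a surrogate key) standing in for "this box vs. that box".
conn.execute(
    "CREATE TABLE box (box_no INTEGER PRIMARY KEY, product TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO box (box_no, product) VALUES (?, ?)",
    [(1, "cake mix"), (2, "cake mix"), (3, "cake mix")],
)

# Case 2: only the count is of interest, so there is one row per
# entity type with the count stated explicitly in a column.
conn.execute(
    "CREATE TABLE box_stock ("
    " product TEXT PRIMARY KEY,"
    " quantity INTEGER NOT NULL CHECK (quantity >= 0))"
)
conn.execute("INSERT INTO box_stock VALUES ('cake mix', 3)")

individual_count = conn.execute(
    "SELECT COUNT(*) FROM box WHERE product = 'cake mix'"
).fetchone()[0]
stock_count = conn.execute(
    "SELECT quantity FROM box_stock WHERE product = 'cake mix'"
).fetchone()[0]

print(individual_count, stock_count)  # prints 3 3
```

Both tables have a key, and neither needs a duplicate row to answer the question "how many boxes?".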
Note also that AS's reference to "nonphysical
things" (which cannot be weighed) is particularly pertinent to database rows.
To quote from Chris's Part 1 article:
"The second point is this. Suppose a given table T does
permit duplicates. Then we can't tell the difference between
'genuine' duplicates in T and duplicates that arise from errors in
data entry operations on T! For example, what happens if the person responsible
for data entry unintentionally -- that is, by mistake -- enters the very same
row into T twice? (Thanks to Fabian Pascal again for drawing my attention to
this problem.)"
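The data entry problem described in the quotation can be reproduced in a few lines with Python's built-in `sqlite3` module. The sketch below is only an illustration under assumed names: the tables `t_no_key` and `t_keyed`, their columns, and the sample row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A table that permits duplicates (no key declared) and one that does not.
conn.execute("CREATE TABLE t_no_key (part TEXT, city TEXT)")
conn.execute("CREATE TABLE t_keyed (part TEXT PRIMARY KEY, city TEXT)")

row = ("P1", "London")

# The same row is entered twice by mistake. Without a key, both inserts
# succeed and nothing in the table marks the second one as an error.
for _ in range(2):
    conn.execute("INSERT INTO t_no_key VALUES (?, ?)", row)
no_key_rows = conn.execute("SELECT COUNT(*) FROM t_no_key").fetchone()[0]

# With a key, the accidental second insert is rejected by the DBMS.
conn.execute("INSERT INTO t_keyed VALUES (?, ?)", row)
try:
    conn.execute("INSERT INTO t_keyed VALUES (?, ?)", row)
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
keyed_rows = conn.execute("SELECT COUNT(*) FROM t_keyed").fetchone()[0]

print(no_key_rows, keyed_rows, rejected)  # prints 2 1 True
```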
Posted 05/24/02