From: DT
To: Editor
I had a question about the missing-values suggestion in PRACTICAL
ISSUES IN DATABASE MANAGEMENT, page 234. You write:
"Table operations would have to be modified to yield
results with as many tables as there are types of propositions with only known
values."
How would this be represented in a language like Tutorial
D, where relvars are required to be strongly typed?
One possible idea is to make use of type inheritance. Suppose
I had a domain of tuple values {x,a,b,c} (all integers, say) where x is not
allowed to be missing but a, b, and c are allowed to be missing. Suppose we
extended the domains of a, b, and c with an "imaginary" special value
that we will never represent, which I will show for diagram purposes only as
'?'. Then the domain can be split into parts:
XABC  {x,a,b,c}        possrep: {X: int, A: int, B: int, C: int}
XAB   {x,a,b,'?'}      possrep: {X: int, A: int, B: int}
XAC   {x,a,'?',c}      possrep: {X: int, A: int, C: int}
XBC   {x,'?',b,c}      possrep: {X: int, B: int, C: int}
XA    {x,a,'?','?'}    possrep: {X: int, A: int}
XB    {x,'?',b,'?'}    possrep: {X: int, B: int}
XC    {x,'?','?',c}    possrep: {X: int, C: int}
X     {x,'?','?','?'}  possrep: {X: int}
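The split above is one subtype per subset of the attributes that may be missing, so n such attributes give 2^n subtypes. The following Python sketch is an illustration only, not Tutorial D and not taken from the book; the names REQUIRED, OPTIONAL, and subtype_names are hypothetical. It generates the same eight names from the attribute lists.

```python
from itertools import combinations

REQUIRED = ("X",)            # attributes that may never be missing
OPTIONAL = ("A", "B", "C")   # attributes that are allowed to be missing


def subtype_names(required=REQUIRED, optional=OPTIONAL):
    """Return one subtype name per combination of known optional attributes."""
    names = []
    # Walk from "all optional attributes known" down to "none known".
    for size in range(len(optional), -1, -1):
        for known in combinations(optional, size):
            names.append("".join(required + known))
    return names


print(subtype_names())
# ['XABC', 'XAB', 'XAC', 'XBC', 'XA', 'XB', 'XC', 'X']
```

With ten attributes that may be missing, the same function returns 1024 names, which is the growth the rest of this letter is concerned with.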
Using Mr. Date's specialization by constraint idea, we
can inherit all the subtuple types from the main tuple type. An update could then
make a tuple change type. A relation whose tuples each hold a relation of some
XABC subtype could be used to return the results of a query. Each nested relation
would contain tuples of exactly one subtype.
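To make the shape of such a result concrete, here is a minimal Python sketch, assuming rows are plain dictionaries and that a missing attribute is modelled by leaving its key out (no '?' placeholder is ever stored). The function name partition_by_known and the sample data are hypothetical. It groups rows into one relation per set of known attributes, which is roughly what a "relation of relations" result would hold.

```python
from collections import defaultdict


def partition_by_known(rows, required=("X",)):
    """Group rows (dicts) into one relation per set of known attributes."""
    relations = defaultdict(set)
    for row in rows:
        missing_required = [a for a in required if a not in row]
        if missing_required:
            raise ValueError(f"required attribute(s) missing: {missing_required}")
        heading = tuple(sorted(row))  # e.g. ('A', 'C', 'X')
        # Each relation is a set of tuples ordered by its own heading.
        relations[heading].add(tuple(row[a] for a in heading))
    return dict(relations)


rows = [
    {"X": 1, "A": 10, "B": 20, "C": 30},  # all values known
    {"X": 2, "A": 11, "C": 31},           # B missing
    {"X": 3, "A": 12, "C": 32},           # B missing
    {"X": 4},                             # A, B, and C missing
]
result = partition_by_known(rows)
for heading, body in sorted(result.items()):
    print(heading, sorted(body))
```

Four input rows already yield three separate relations here; the user sees the number of result tables grow with the number of distinct patterns of missing values.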
However, the exponential explosion of possible subtypes would
be very difficult to handle, practically speaking. As you admit in your book, a
real DBMS might have to handle thousands of small subtables. This cannot be
passed off as an "implementation detail" since table operations
"yield results" at the user presentation level. No matter how
efficient the underlying system might be, this seems unacceptable. Perhaps we
have to fall back on default values after all.
Fabian Pascal Responds:
Tutorial D does not explicitly incorporate the concept of missing data as
metadata expounded in my PRACTICAL ISSUES IN DATABASE MANAGEMENT, which
originates with David McGoveran. THE THIRD MANIFESTO refers in an appendix to
Chris Date's default value scheme, included in his RELATIONAL DATABASE
WRITINGS 1991-1994. He proposed it only as a better solution than NULLs, but it
does not address the fundamental metadata nature of missing data that I raise
in my book. He subscribes to the metadata approach as the theoretically
correct solution, and we both agree that the lack of an elegant, simple solution
is inherent in the nature of missing data, because it is outside the scope
of the two-valued logic of the real world. Incidentally, the term "NULL
value" is a contradiction in terms: NULLs are not values [and thus
violate Codd’s Information Principle].
Posted 04/05/02