From: Curt Sampson
To: Editor
Date: 26 Feb 2005
1. In NULLS
Nullified, you write:
...there is a
commercial product with a language based on the principles advanced in The
Third Manifesto: Dataphor by Alphora. It is interesting to note that the
company refrained from implementing NULLs, in order to avoid the problems
haunting SQL, but recently had to give in to market pressure and added them....
This ties in with some recent thoughts I was having on
support for user-defined types. I consider one of the big problems with current
DBMSs to be their poor support for UDTs; I think that that is one of the
reasons that OO folks, used to rich facilities for type
definition, tend to use RDBMSs poorly.
2. Reading THE
THIRD MANIFESTO, as well, the very first RM prescription states that,
"D shall provide facilities for users to define their own scalar
types," and the third one states that, "D shall provide
facilities for users to define their own scalar
operators."
Does Dataphor not provide this?
3. If so, I'm not sure what "adding NULLs" means,
since any system that lets you define your own types would let you define a type
with a domain of, say, true, false or "NULL", or all integers plus
"NULL", and define appropriate operators for them.
4. This may be a slight disagreement between us; I'm not in
favour of abolishing the capability of having NULLs the way we have them in
SQL; to do so would be to put an arbitrary limit on users' abilities to
define types. However, I'm quite strongly in agreement with
you that the system defined types in SQL include NULL (in SQL, I consider
columns "foo int NOT NULL" and "foo int NULL" to be
distinct types—they are certainly different domains!) are poorly designed
and one would be ill-advised to use them. And I would certainly advocate not
including such types in the basic system. User defined types may give users
enough rope to hang themselves with, but when you buy an RDBMS, I see no reason
to include a noose in the package.
5. To continue, I see nothing wrong with being able to define
a type consisting of, say, all integers plus "unknown" (UNK), and use
that for, say, an attribute giving a count of the number of residents of a
dwelling. The great advantage here is that you have a type
where this value says what it means explicitly (it can't be misinterpreted as
"nobody," or "not applicable"), and you are forced to
correctly define operators based on the real semantics of the situation. For
example, when you first try to apply count() to that attribute in a relation,
the system will return an error message stating that count() is not
defined for that type, and you can sit down and decide just
what count() should do. Perhaps it should return UNK if any tuple has UNK in
that attribute. Perhaps you really want a minimum_count() or count_known()
function instead. Perhaps you even want "count(residents assuming UNK =
average(residents))"
Is this any less sensible a way of dealing with this than
Darwen's proposal to move that attribute into a couple of separate relations?
It certainly makes the constraints easier to define.
Should a DBMS offer this sort of facility, make it easily
enough available to users, and properly document and promote it, I would think
that the demand for support for all of SQL's data types that include NULL would
be much, much less.
From: Fabian Pascal
To: Curt Sampson
·
Agreed.
·
As far as I know, yes. Check with them for details.
·
Disagree. No connection.
·
Strong disagreement. See The Final NULL in the
Coffin.
·
Disagree. See the paper.
·
I do not subscribe to Hugh's approach. See On Darwen’s Handling of
Missing Information Without Using NULLs
·
Disagree.
Posted 4/15/05