COMMENTS ON “RELATIONAL DATABASE WITHOUT RELATIONS”
by Chris Date

 

 

 

A reader has brought to our attention an article with the above title by one Eugene Bereznuk, a regular columnist for the Builder.com web site.

 

Chris Date Comments: Bereznuk commits several familiar errors (including in particular failing to distinguish adequately between model and implementation), but I will focus here on his major suggestion, which I will characterize--partly using Bereznuk's own words--as follows: 

 

1.      He wants to "just create objects (like Person, Address, Product, etc.)"--and put them in the database, presumably.  Also, I assume Bereznuk really wants to create individual persons, addresses, products, etc.; "Person, Address, Product, etc." sound like relvars to me.  (An OO person might say they sound like "classes," but that's a separate argument, and I don't want to get into it here.)  Note: If you're not familiar with the term relvar (short for relation variable), I venture to suggest you ought to be.

 

2.      It's not clear to me whether those "persons, addresses, products, etc." are supposed to be represented by tuples in relvars or whether they're just floating freely in space, as it were.  Either way, however, I understand that the system is supposed to assign them "globally unique identifiers" (GUIDs) when they're "created."  By the way:  If they are represented by tuples in relvars, I hope the GUIDs have a user-visible attribute of their own; for otherwise we have a violation of The Information Principle on our hands.

 

3.      In conventional relational databases, if I want to establish the relationship "person has address," I create a relvar R1 with attributes PERSON and ADDRESS; if I want to establish the relationship "person buys product," I create a relvar R2 with attributes PERSON and PRODUCT; and so on.  Thus, R1 corresponds to the predicate--i.e., R1 means--"person has address"; R2 corresponds to the predicate--i.e., R2 means--"person buys product"; and so on.  Every relvar has a predicate.

 

4.      What Bereznuk wants to do is create a kind of general-purpose relvar R with attributes GUIDA and GUIDB.  Then he can relate anything to anything, dynamically, by inserting a tuple containing the pertinent GUIDs into R.

 

5.      But what's the predicate for R?  I submit it can only be something like "the thing identified by GUIDA is related to the thing identified by GUIDB."  What the nature of that relationship is, however, is and must be completely unspecified!--i.e., it's anybody's guess.  For example, if in some tuple of R GUIDA identifies a person and GUIDB a product, I have no idea whether the tuple means the person bought the product, or liked the product, or didn't like the product, or made the product, or any of a literally infinite number of other possible interpretations.  So now a large part of the meaning of the database is buried in Bereznuk's mind:  It's hidden from other users, and it's certainly hidden from the system.  And we're supposed to be talking about a shared database!  (At least, I assume so.  If all Bereznuk is talking about is a private database that's for his use alone, then he can do what he likes with it, of course.)

 

6.      And what if I want to say in R both "the person bought the product" and "the person broke the product"?

 

7.      Now, we might rescue Bereznuk's idea to some extent by adding an attribute to R to identify the relationship (in effect, the meaning).  Then we could have, e.g., both "ida, idb, bought" and "ida, idb, broke" in R at the same time.

 

8.      But even if we do "rescue" the idea along the lines just suggested, there are still at least two significant problems:

 

·   Judicious use of the type mechanism in relational databases allows us to avoid silly errors (indeed, this is precisely why logicians introduced "sorted logics"--"sorted" here really meaning "typed").  In Bereznuk's scheme, there's nothing to stop us entering nonsenses into the database such as "ida, idb, bought" when ida = Enron and idb = GWB.  (On second thoughts ... Let me change that example to ida = Camembert and idb = Mount Everest.) 

 

·   What about relationships that involve more than two things?  Some examples:  Supplier Sx supplies part Py to project Jz; City Y is on the way from city X to city Z; Points A, B, C, D, E uniquely define a certain pentangle; and so on.  The idea that all relationships are binary, or can adequately be represented in pure binary form, was surely and soundly debunked many years ago (see, e.g., Codd's first papers on the relational model). 
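
Points 3 through 8 can be made concrete with a small sketch. The Python below is illustrative only: the Relvar class, the Person, Product, and Place types, the create helper, and the REL attribute name are all assumptions introduced here for the purpose, not anything taken from Bereznuk's article or from any real DBMS. It contrasts a relvar whose attributes are typed with a general-purpose GUID relvar that has been "rescued" by adding a relationship attribute.

```python
from dataclasses import dataclass
import uuid


# Hypothetical types standing in for "persons, products, etc."
@dataclass(frozen=True)
class Person:
    name: str


@dataclass(frozen=True)
class Product:
    name: str


@dataclass(frozen=True)
class Place:
    name: str


class Relvar:
    """Toy relation variable: a predicate, a typed heading, and a set of tuples."""

    def __init__(self, predicate, heading):
        self.predicate = predicate
        self.heading = heading  # attribute name -> required type
        self.body = set()

    def insert(self, **attrs):
        # Every tuple must have exactly the attributes named in the heading...
        if attrs.keys() != self.heading.keys():
            raise TypeError(f"attributes must be exactly {sorted(self.heading)}")
        # ...and every value must be of the declared type.
        for name, required in self.heading.items():
            if not isinstance(attrs[name], required):
                raise TypeError(f"{name} must be of type {required.__name__}")
        self.body.add(tuple(attrs[name] for name in self.heading))


registry = {}  # GUID -> object, mimicking system-assigned identifiers


def create(obj):
    guid = uuid.uuid4()
    registry[guid] = obj
    return guid


# Point 3: one relvar per predicate, with typed attributes.
buys = Relvar("person buys product", {"PERSON": Person, "PRODUCT": Product})
buys.insert(PERSON=Person("Charley"), PRODUCT=Product("Camembert"))

# Points 4 and 7: one general-purpose relvar over GUIDs plus a REL attribute.
generic = Relvar(
    "the thing identified by GUIDA is related, per REL, to the thing identified by GUIDB",
    {"GUIDA": uuid.UUID, "GUIDB": uuid.UUID, "REL": str},
)

camembert = Product("Camembert")
everest = Place("Mount Everest")

# First bullet under point 8: the typed relvar rejects the nonsense tuple...
try:
    buys.insert(PERSON=camembert, PRODUCT=everest)
    typed_rejected = False
except TypeError:
    typed_rejected = True

# ...but the generic relvar accepts it, because a GUID is a GUID.
generic.insert(GUIDA=create(camembert), GUIDB=create(everest), REL="bought")

# Second bullet under point 8: a ternary relationship needs its own heading.
supplies = Relvar(
    "supplier supplies part to project",
    {"SUPPLIER": str, "PART": str, "PROJECT": str},
)
supplies.insert(SUPPLIER="Sx", PART="Py", PROJECT="Jz")

# The two-GUID heading cannot hold that fact as a single tuple.
try:
    generic.insert(GUIDA=create("Sx"), GUIDB=create("Py"),
                   GUIDC=create("Jz"), REL="supplies")
    ternary_fits_generic = True
except TypeError:
    ternary_fits_generic = False
```

In this sketch the typed relvar refuses "Camembert bought Mount Everest" at insert time, while the generic relvar stores it without complaint; and the supplier-part-project fact fits a three-attribute heading directly but cannot be entered as one tuple of the GUIDA/GUIDB relvar.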

 

I really think that if you're going to offer criticism of the status quo in any field, you owe it to yourself and your audience to inform yourself thoroughly of the history of that field first.  In particular, you should know what was tried in the past and found wanting, and you should know why. 

 

PS: With respect to point 2 above:  If the "objects" are supposed to be just "floating freely in space," what do they mean?  (This is a criticism of OO in general, of course, and not a new one.)  E.g., what does the integer 3, in isolation, mean?  I submit that meaning is conferred by context, and only by context.  Thus, I can sensibly say, "There are three weeks to go before my vacation" or "Charley has three daughters," but I cannot sensibly say, in a vacuum, just "three."

 

 

Posted 08/18/02

 

 

 
