ON XML, OBJECTS, HIERARCHIES, AND THE RELATIONAL MODEL
with C. J. Date

 

 

 

From: JD

To: Editor

Date: 6 Sep 2004

 

I have long found the wasteful nature of XML's markup curious. The waste is of course most obvious for repetitive structures (less so for XML documents). What do you think of ASN.1? When I last checked, a few weeks ago, there was an easy XML->ASN.1 interface available, which means you could translate a hierarchical data structure from XML to an efficient binary format. (That said, XML is wasteful even in documents.)
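To put a rough number on the overhead for repetitive structures, here is a minimal Python sketch. It does not use ASN.1: fixed-width packing with the standard `struct` module stands in for a compact binary encoding, and the record layout and element names (`record`, `id`, `x`, `y`) are invented for illustration.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical repetitive data: 1000 records of three small integers.
records = [(i, i * 2, i * 3) for i in range(1000)]

# XML encoding: every record repeats the same element names.
root = ET.Element("records")
for rec_id, x, y in records:
    rec = ET.SubElement(root, "record")
    ET.SubElement(rec, "id").text = str(rec_id)
    ET.SubElement(rec, "x").text = str(x)
    ET.SubElement(rec, "y").text = str(y)
xml_bytes = ET.tostring(root)

# Binary encoding: three unsigned 32-bit big-endian integers per record.
# This is a stand-in for a compact binary format, not real ASN.1 BER/PER.
binary_bytes = b"".join(struct.pack(">III", *rec) for rec in records)

# Round trip: the binary form decodes back to the same records.
decoded = list(struct.iter_unpack(">III", binary_bytes))

print("XML bytes:   ", len(xml_bytes))
print("binary bytes:", len(binary_bytes))
```

The binary form needs exactly 12 bytes per record, while the XML form spends most of its bytes on repeated tag names. A real ASN.1 encoding would differ in detail, but the comparison shows where the space goes.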

 

XML is of course merely a simplified version of SGML, and as far as I know, SGML's far more customizable nature means DTDs can be written for every type of data file, for example hosts, resolv.conf, etc. on Unix.

 

As an industry, we are used to SQL databases, and, as you point out, the term "relational" should not really be applied to them. But this kind of drift happens in language: for example, "culprit" comes from "culpare", which means "to blame" ("culprit" literally means "he that is blamed", not "he that did it").

 

I can't help thinking that the language Prolog is a more "relational" database. (I think in most versions you can store the current ruleset on disk.) A hierarchical relationship could be expressed as "parentOf(A,B)". Giving an 'object' a type of MELON would be objectType(A,MELON). Of course, the "database" in this case has no understanding of what the relationship *means*. To the "database" the relationship is just another rule.

 

Inevitably you'd have to write a shim class to do hierarchy walking (not hard of course).
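As a rough illustration of the idea, the sketch below stores facts as plain tuples and adds a small shim that reads the parentOf facts as a tree. It is written in Python rather than Prolog, and the fact data, the `query` helper, and the `HierarchyWalker` class are all hypothetical names chosen for this example.

```python
from collections import defaultdict

# Facts as plain tuples, in the spirit of parentOf(A,B) and objectType(A,MELON).
# The store attaches no meaning to any relation name.
facts = {
    ("parentOf", "fruit", "melon1"),
    ("parentOf", "fruit", "melon2"),
    ("parentOf", "melon1", "seed1"),
    ("objectType", "melon1", "MELON"),
    ("objectType", "melon2", "MELON"),
}


def query(relation, first=None, second=None):
    """Return matching facts, sorted; None acts like an unbound variable."""
    return sorted(
        f for f in facts
        if f[0] == relation
        and (first is None or f[1] == first)
        and (second is None or f[2] == second)
    )


class HierarchyWalker:
    """Shim that interprets one relation (parentOf by default) as a tree."""

    def __init__(self, relation="parentOf"):
        self.children = defaultdict(list)
        for _, parent, child in query(relation):
            self.children[parent].append(child)

    def descendants(self, node):
        """Depth-first list of every node below `node`."""
        result = []
        for child in self.children.get(node, []):
            result.append(child)
            result.extend(self.descendants(child))
        return result


walker = HierarchyWalker()
print(walker.descendants("fruit"))           # ['melon1', 'seed1', 'melon2']
print(query("objectType", second="MELON"))   # both melon facts
```

Only the shim knows that parentOf describes a hierarchy; to the fact store it is just another relation, which is the point made above.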

 

If a Prolog were written specifically to persist its data on disk, I think that would be a HELL of a database!

 

(Note that you can of course layer a hierarchy on top of SQL by adding a field "parent" (even in a different table) and two further fields, "previous peer" and "next peer". "First child" would be a good one to add. To use the hierarchy, a simple shim class could be written.)
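One way this layering might look is sketched below, using Python's standard `sqlite3` module with an in-memory database. The table name `node`, the column names, and the `Hierarchy` shim class are assumptions made for illustration; the letter only names the fields themselves.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE node (
        id          INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        parent      INTEGER,   -- id of the parent row, NULL for the root
        prev_peer   INTEGER,   -- id of the previous sibling, or NULL
        next_peer   INTEGER,   -- id of the next sibling, or NULL
        first_child INTEGER    -- id of the first child, or NULL
    )
""")
conn.executemany(
    "INSERT INTO node VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "root", None, None, None, 2),
        (2, "a",    1,    None, 3,    4),
        (3, "b",    1,    2,    None, None),
        (4, "a1",   2,    None, None, None),
    ],
)


class Hierarchy:
    """Hypothetical shim that walks the link fields stored in `node`."""

    def __init__(self, conn):
        self.conn = conn

    def _field(self, node_id, column):
        # `column` is always one of the fixed names used in this class,
        # never user input, so interpolating it into the SQL is safe here.
        row = self.conn.execute(
            f"SELECT {column} FROM node WHERE id = ?", (node_id,)
        ).fetchone()
        return row[0] if row else None

    def children(self, node_id):
        """Ids of the children, in sibling order."""
        result = []
        child = self._field(node_id, "first_child")
        while child is not None:
            result.append(child)
            child = self._field(child, "next_peer")
        return result

    def walk(self, node_id):
        """Names in pre-order, starting at `node_id`."""
        names = [self._field(node_id, "name")]
        for child in self.children(node_id):
            names.extend(self.walk(child))
        return names


tree = Hierarchy(conn)
print(tree.walk(1))  # ['root', 'a', 'a1', 'b']
```

The database itself enforces nothing about the links in this sketch; keeping `prev_peer`, `next_peer`, and `first_child` consistent on insert and delete is left to the shim, which is the cost of layering the hierarchy on top.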

 

I've done a lot of work with object databases storing XML and SGML, and one noticeable problem seems to be performance (wasting space on element names won't help, of course). Perhaps the SQL database's rise is due to high performance coupled with only a weak implementation of the relational model.

 

 

Chris Date Responds: You might want to investigate Datalog. See AN INTRODUCTION TO DATABASE SYSTEMS, 8th Ed., for an introduction, and J. Ullman, PRINCIPLES OF DATABASE AND KNOWLEDGE-BASE SYSTEMS, Computer Science Press, 1988.
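For readers unfamiliar with Datalog, the flavor of a recursive rule can be sketched in a few lines of Python. The example evaluates the two textbook rules ancestor(X,Y) :- parent(X,Y) and ancestor(X,Z) :- parent(X,Y), ancestor(Y,Z) by naive bottom-up iteration to a fixpoint. The `parent` data and the function name are hypothetical, and real Datalog engines use more refined evaluation strategies than this one.

```python
def ancestors(parent):
    """Naive bottom-up evaluation of the ancestor rules over `parent` pairs."""
    # Rule 1: every parent pair is an ancestor pair.
    ancestor = set(parent)
    while True:
        # Rule 2: join parent(X, Y) with ancestor(Y, Z) to derive ancestor(X, Z).
        derived = {
            (x, z)
            for (x, y) in parent
            for (y2, z) in ancestor
            if y == y2
        }
        new = derived - ancestor
        if not new:
            # Fixpoint reached: another pass would add nothing.
            return ancestor
        ancestor |= new


# Hypothetical facts: a is the parent of b, b of c, c of d.
parent = {("a", "b"), ("b", "c"), ("c", "d")}
result = ancestors(parent)
print(sorted(result))
```

Unlike a hand-written hierarchy walker, the recursion here is stated declaratively in the rules, and the evaluation loop is generic: it simply repeats the join until no new pairs appear.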

 

 

Posted 11/19/04