MORE ON TREES
with Fabian Pascal

 

 

 

From: JH

To: Editor

Date: 29 Oct 2004

 

In response to this week's article, More On Trees And XML, I'd like to share the solution to my tree dilemma.  In short, I prove you are correct that "What" is more important than "How".

 

I wanted a hierarchy of categories for content on my website.  This would be represented in the path portion of the URL.  After finding recursion and nested sub-queries excessively silly, I began looking for alternatives. That's where nested sets cropped up.  After realizing so-called "experts" weren't really all that great, I continued working, but eventually gave up on the SQL route.

 

While toying with LDAP, I realized my previous attempts were completely wrong.  I was querying by path.  The path was a key.  The path didn't really mean much, other than a nice way to reference some particular place in the category tree.  Not once did I want to know every single item under a node or every single item above a node.  Occasionally the nodes immediately above or below the current node needed to be known.  That was it.

 

With those limits in place, an implementation was quickly worked up.  The schema was simple.  The primary key for nodes is the path.  Nodes may have a parent, which is in a separate relvar.  Content references a particular node.  As if the simplicity gained there wasn't enough, a more significant change occurred on the code side.  What once required nearly a hundred lines of code was replaced with two.  One to sanitize the path, the other to perform the query.

 

The hierarchy is necessary, but it is not what is most important.  The thing to remember is that navigating a hierarchy is not a good substitute for a key.  However, a key can be an excellent substitute for navigating a hierarchy.

 

 

Fabian Pascal Responds: We agree and are on record that tree-structured data can be represented and manipulated relationally much more productively than navigationally by hierarchic databases and DBMS

(see our PRACTICAL DATABASE FOUNDATIONS paper #9, Tree-Structured Data: A Relational Perspective). It's not the relational model's fault that SQL and its implementations do not support the pertinent functionality fully and correctly.

 

However, there may be instances where you do need to know all the descendents or parents of a node e.g. bill of material explosions, so you must be careful in deriving general design principles from simple structures. The former should precede their application to the latter.

 

 

Posted 11/19/04