From: JH
To: Editor
Date: 29 Oct 2004
In response to this week's article, More On Trees And XML,
I'd like to share the solution to my tree dilemma. In short, I prove you are correct that "What" is more
important than "How".
I wanted a hierarchy of categories for content on my
website. This would be represented in
the path portion of the URL. After
finding recursion and nested sub-queries excessively silly, I began looking for
alternatives. That's where nested sets cropped up. After realizing so-called "experts" weren't really all
that great, I continued working, but eventually gave up on the SQL route.
While toying with LDAP, I realized my previous attempts were
completely wrong. I was querying by
path. The path was a key. The path didn't really mean much, other than
a nice way to reference some particular place in the category tree. Not once did I want to know every single
item under a node or every single item above a node. Occasionally the nodes immediately above or below the current
node needed to be known. That was it.
With those limits in place, an implementation was quickly
worked up. The schema was simple. The primary key for nodes is the path. Nodes may have a parent, which is in a
separate relvar. Content references a
particular node. As if the simplicity
gained there wasn't enough, a more significant change occurred on the code
side. What once required nearly a
hundred lines of code was replaced with two.
One to sanitize the path, the other to perform the query.
The hierarchy is necessary, but it is not what is most
important. The thing to remember is
that navigating a hierarchy is not a good substitute for a key. However, a key can be an excellent
substitute for navigating a hierarchy.
Fabian Pascal Responds: We agree and are on record
that tree-structured data can be represented and manipulated relationally much
more productively than navigationally by hierarchic databases and DBMS
(see our PRACTICAL
DATABASE FOUNDATIONS paper #9,
Tree-Structured Data: A
Relational Perspective). It's not the relational model's fault that SQL
and its implementations do not support the pertinent functionality fully and
correctly.
However, there may be instances where you do need to know all
the descendents or parents of a node e.g. bill of material explosions, so you
must be careful in deriving general design principles from simple structures.
The former should precede their application to the latter.
Posted 11/19/04