From: GJ
To: Editor
Date: 31 Mar 2005
You may be interested in this article on O'Reilly's xml.com
site: "Going Native: Making the Case for XML Databases."
How did we manage medical records and instruction manuals before
XML?
Besides the annoying misuse of the term "use case"
throughout, the author seems to think the M in XML stands for
"modeling."
The article states: "A more theoretically correct way to
say this is that XML-enabled databases have their own data model — relational,
hierarchical, object-oriented — and map instances of the XML data model to
instances of their data model. Native XML databases use the XML data model
directly." I'm not clear on what theoretical foundation the author is
referring to. Is there a school of theory regarding text markup?
I think it's self-evident that any document that is marked up
(structured) with legal XML can be easily stored in a relational database, and
that any legal XML "data model" will map directly to a relational model.
Logically there just isn't any problem that
XML databases can solve that relational systems can't, and it stands to reason
that native XML databases simply can't be faster, more scalable, or more
reliable than relational systems. How does this idea get any traction?
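As a rough illustration of the mapping claimed above, here is a minimal sketch that shreds one small XML document into two relational tables. The document shape, the table and column names, and the `shred` helper are all hypothetical, chosen only for this example; it assumes Python's standard `sqlite3` and `xml.etree.ElementTree` modules, and it handles one hand-designed schema rather than arbitrary XML.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical medical-record document, invented for this sketch.
DOC = """<record id="17">
  <patient>Doe</patient>
  <visit date="2005-03-01"><note>Checkup</note></visit>
  <visit date="2005-03-15"><note>Follow-up</note></visit>
</record>"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE record (
    record_id INTEGER PRIMARY KEY,
    patient   TEXT NOT NULL
);
CREATE TABLE visit (
    record_id  INTEGER NOT NULL REFERENCES record(record_id),
    visit_date TEXT NOT NULL,
    note       TEXT NOT NULL,
    PRIMARY KEY (record_id, visit_date)
);
""")

def shred(conn, xml_text):
    """Map one <record> document onto rows in record and visit."""
    root = ET.fromstring(xml_text)
    record_id = int(root.get("id"))
    conn.execute(
        "INSERT INTO record VALUES (?, ?)",
        (record_id, root.findtext("patient")),
    )
    # Each repeated child element becomes one row in the child table.
    for visit in root.findall("visit"):
        conn.execute(
            "INSERT INTO visit VALUES (?, ?, ?)",
            (record_id, visit.get("date"), visit.findtext("note")),
        )

shred(conn, DOC)

# Once shredded, the data is queried with ordinary SQL.
rows = conn.execute(
    "SELECT r.patient, v.visit_date, v.note "
    "FROM record r JOIN visit v USING (record_id) "
    "ORDER BY v.visit_date"
).fetchall()
```

The nesting of `visit` under `record` turns into a foreign key. Documents where element order or mixed text matters would need extra columns (for example a position number), which this sketch leaves out.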
From: JG
To: Editor
CC: GJ
I finished reading the article over my morning coffee.
You know, I write a lot of Microsoft Word documents. Fabian,
would you help me find a "Microsoft Word Database"? Then I can store
and manage my data using the Word data-model.
OK. Sorry. I must be in a really cynical mood this morning.
As I read the article, I kept coming back to the thought that
the author was thinking in terms of storing and retrieving XML documents.
People have done document storage for years. That's why we have features (in,
say, Oracle) for things such as full-text-search. It's why we have companies
that make document storage-and-retrieval systems. If someone has a lot of XML
documents to manage, I can well understand that they might look for a product
to help them.
But I cringe at the thought of managing large amounts of data
in the form of documents. And I do not at all like assertions such as the
author's: "...the data involved does not easily fit the relational data
model."
The author also brings into the equation such things as:
"ease of management, enhanced query performance, concurrent access,
transactional safety, security." You get all these things from a good
relational database, from a good database, period. They are not specific to XML,
and have nothing to do with storing XML natively.
And what does it mean to store XML "natively"? I
can only imagine that it means to store the raw XML text. But XML databases
don't do that! Actually, storing XML "natively" doesn't seem to mean
anything. All it seems to imply is a certain amount of ease in
storing and manipulating data that you perceive as being in XML
form, but who knows (and who cares?) what the underlying storage mechanism
really is.
Chris Date has a fascinating discussion on atomicity in his
upcoming book. He has long argued for rich data types. I want to be careful
about putting words into his mouth, but I think he would accept the idea of
having an XML document column in a table. Why not? If we can store dates, or locations,
why not an XML document? But relational database vendors were perhaps not as
fast at supporting the easy storage and retrieval of XML documents as they
could have been, leading to the development of a market for products
specifically targeted at developers who know little more than XML.
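To make the "XML document column" idea concrete, here is a minimal sketch that treats a whole XML document as a single column value in an ordinary table. The `manual` table, the `xml_text` function, and the `insert_manual` helper are hypothetical names invented for this example. SQLite has no XML type, so the sketch assumes the document is kept in a TEXT column, checked for well-formedness on the way in, and read through one user-defined operator registered with `sqlite3`'s `create_function`.

```python
import sqlite3
import xml.etree.ElementTree as ET

def xml_text(doc, path):
    """Hypothetical operator: text of the first element matching path."""
    node = ET.fromstring(doc).find(path)
    return None if node is None else node.text

conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL under the name xml_text.
conn.create_function("xml_text", 2, xml_text)
conn.execute(
    "CREATE TABLE manual ("
    " manual_id INTEGER PRIMARY KEY,"
    " product   TEXT NOT NULL,"
    " body      TEXT NOT NULL)"  # the XML document, one value per row
)

def insert_manual(conn, manual_id, product, body):
    """Store a document, rejecting text that is not well-formed XML."""
    ET.fromstring(body)  # raises ET.ParseError on malformed input
    conn.execute(
        "INSERT INTO manual VALUES (?, ?, ?)", (manual_id, product, body)
    )

insert_manual(conn, 1, "pump",
              "<manual><title>Pump setup</title><step>Prime it</step></manual>")
insert_manual(conn, 2, "valve",
              "<manual><title>Valve care</title></manual>")

# The document column is queried alongside ordinary columns.
titles = conn.execute(
    "SELECT product, xml_text(body, 'title') FROM manual ORDER BY manual_id"
).fetchall()
```

The table stays relational; the document is just a value of a richer type, and everything useful about it comes from the operators and checks defined for that type.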
In the end, I keep thinking that what XML developers want is
simply a way to easily store and retrieve their documents. They don't want to
think outside the box of XML, either. That latter point is probably what leads
to the demand for "XML Databases". But there's a road to hell and
damnation here somewhere, I think, if you're not careful, that I can't quite
put into words this morning. When one stops thinking in terms of storing
atomic XML documents and begins to talk of "storing data in XML form", I think
one has crossed a line and found that road!
From: Fabian Pascal
To: JG
How can one not be cynical, when the level of
ignorance is so absolute?
Then they should talk about document bases, not
databases. Different ballgame.
If they never learn what a data model is, and they are not
required to know, and the readers are in the same state, why in the world
should anyone expect anything else?
Ask him to define a database, or any one of those terms that
he's throwing around. People learn jargon, primarily from vendors or the
press, and then regurgitate it without a clue as to what it means. This
convinces the rest of the ignorami that they know something.
Doing anything useful with data requires structure, integrity,
and manipulation. The problem is that XML initially had only structure, and even
that is a bad one, thrown away decades ago by Codd. Why do you think they had
to come up with Schema and XQuery? Because they could not do anything with a
bunch of tags, which can be anything anybody wants.
Sure. But for complex data types you would still have to come
up with operators and constraints that are agreed on. And that's tough. Read
the first chapter in PRACTICAL ISSUES IN DATABASE MANAGEMENT.
That's what ignorance does. Read my Lenin, Trotsky, and Freedom from Tyranny of
Reason and Knowledge.
Posted 5/27/05