MORE ON THE REAL BLOOMING IDIOT
by Fabian Pascal

 

 

 

I have commented several times on the pronouncements of one Rudy, aka r937 (see Note on the Real Blooming Idiot and Note on Consistency), as well as on the common tendency to resort to personal invective for lack of the knowledge and reasoning ability needed to rebut arguments. I have recently come across yet another excellent example of this. In an exchange in a Sitepoint forum, Matt Rogish stated:

 

Without a definition of database it is hard to say whether XML can be called a database or not.

 

Dictionary.com calls a database 'a collection of data arranged for ease and speed of search and retrieval' and 'an organized body of related information'.

 

So, an XML document *could* be a database in exactly the same way a flat-ASCII comma-delimited (CSV) file would be. It is simply a bunch of data in a location that you can look at.

 

However when most people say 'database' they really mean 'database management system'--a collection of functions/applications which allow you to securely retrieve and modify data *in* a database.

 

The idea that XML is a superior machine-to-machine data exchange format has been thoroughly debunked by Fabian Pascal at www.dbdebunk.com. He contends that once you agree on a standard (in XML parlance, an XSD), the physical data format (XML's hierarchical method) can be anything.

 

Mr. Pascal contends that as an exchange format XML is inherently inferior to many existing technologies. I wholeheartedly agree with him. XML is, by its tag and hierarchical nature, inefficient. I can store more data in less space with a CSV file than I can with XML.
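
The space claim can be checked with a minimal sketch. It assumes a few hypothetical personnel records (the field names are borrowed from the data exchange example, the values are invented) and serializes the same rows once as CSV and once as element-per-field XML, then compares the byte counts. It is one possible encoding of each kind, not a benchmark.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical records, invented purely for illustration.
FIELDS = ["name", "position", "seniority"]
ROWS = [
    ("Ann Lee", "Analyst", "3"),
    ("Bob Ray", "Manager", "7"),
]


def to_csv(rows):
    """Serialize the rows as comma-delimited text with one header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(FIELDS)
    writer.writerows(rows)
    return buf.getvalue()


def to_xml(rows):
    """Serialize the same rows as XML, one element per field."""
    root = ET.Element("employees")
    for row in rows:
        employee = ET.SubElement(root, "employee")
        for field, value in zip(FIELDS, row):
            ET.SubElement(employee, field).text = value
    return ET.tostring(root, encoding="unicode")


csv_text = to_csv(ROWS)
xml_text = to_xml(ROWS)
csv_bytes = len(csv_text.encode("utf-8"))
xml_bytes = len(xml_text.encode("utf-8"))
print(f"CSV: {csv_bytes} bytes, XML: {xml_bytes} bytes")
```

The XML version repeats every field name twice per record (opening and closing tag), while the CSV version names the fields once, in the header.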

 

Quote (The Data Exchange Tail - Part 2):

 

Data exchange requires agreement on (a) what data is to be exchanged, and (b) its physical format, which are orthogonal (independent) considerations. Suppose, for example, that a personnel management system feeds data to a payroll system. For this to work, the two departments must agree on what personnel data is to be fed (say, name, position, seniority, and so on) and the physical format in which it will be transmitted (say, ASCII [comma] delimited).

 

Note very carefully that when they agree on the data, the departments actually agree on a common meaning of that data. This must be the case, because the agreement derives from their own systems, which contain the two departments' logical models, within which the data must fit. Note also that once the common meaning is agreed upon, the payroll system does not need to be told "what the data is" each time data is sent to it by the personnel system. Indeed, that's the point of the upfront agreement in the first place. Thus, given an agreed meaning, data exchange requires only a physical format which, as I mentioned, is orthogonal to meaning. Any format will do, as long as it is agreed upon. Now, the industry lacks many things, but format is hardly one of them; there is a plethora of physical formats (see conclusion on this point) to choose from. So why invent yet a new one?
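
The separation between agreed meaning and physical format can be sketched as follows. This is a minimal illustration, not anyone's actual system: the field list comes from the personnel example above, while the types, the function names, and the choice of CSV and JSON arrays as the two formats are assumptions made for the sketch. The payloads carry no field names at all, because the meaning was fixed once, up front.

```python
import csv
import io
import json

# Agreed once, up front: which fields are exchanged, in which order,
# and how each is interpreted. The types are assumptions for this sketch.
AGREED_FIELDS = (("name", str), ("position", str), ("seniority", int))


def _values(record):
    """Return the record's values in the agreed field order."""
    return [record[name] for name, _ in AGREED_FIELDS]


def _record(values):
    """Rebuild a record from bare values using the agreed meaning."""
    return {name: cast(v) for (name, cast), v in zip(AGREED_FIELDS, values)}


def encode_csv(records):
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(_values(r) for r in records)
    return buf.getvalue()


def decode_csv(text):
    return [_record(row) for row in csv.reader(io.StringIO(text))]


def encode_json(records):
    return json.dumps([_values(r) for r in records])


def decode_json(text):
    return [_record(row) for row in json.loads(text)]


personnel = [
    {"name": "Ann Lee", "position": "Analyst", "seniority": 3},
    {"name": "Bob Ray", "position": "Manager", "seniority": 7},
]

# Either physical format delivers the same data to the payroll side.
via_csv = decode_csv(encode_csv(personnel))
via_json = decode_json(encode_json(personnel))
print(via_csv == via_json == personnel)
```

Swapping one codec for the other changes nothing on the receiving side except the decoding function, which is the sense in which format is orthogonal to meaning.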

 

The fad-driven computer industry, being told that XML was the cure to all the world’s problems, jumped on it and is coming out with everything from XSLT stylesheets to XML-powered kitchen appliances. The problem is that it is an inefficient solution to the problem.

 

Certainly there has been a lot of development in utility applications and libraries to manage and extend XML (XPATH, XSLT, etc.). This raises the value of XML because you, as the application developer, do not have to write libraries to change, search, convert to HTML, etc. XML documents.

 

Does this mean XML is superior? Of course not. Had developers put the same amount of effort in creating libraries to transform a CSV file into HTML as they did XML (and I contend it would have taken less time and effort) then you could just as easily, if not more so, use a flat-file as a data source for your web application. It's just that the XML libraries are superior to, say, CSV libraries (since I doubt many exist).
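
As a rough indication of the effort involved, here is a minimal sketch of a CSV-to-HTML transformation using only the Python standard library. The function name and the sample data are hypothetical, and the sketch assumes the first CSV row is a header; it is one way such a library routine could look, not an existing package.

```python
import csv
import html
import io


def csv_to_html_table(csv_text):
    """Render CSV text as an HTML table, treating the first row as the header."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return "<table></table>"

    def render(cells, tag):
        # Escape cell contents so characters like & and < stay valid HTML.
        inner = "".join(f"<{tag}>{html.escape(cell)}</{tag}>" for cell in cells)
        return f"<tr>{inner}</tr>"

    lines = ["<table>", render(rows[0], "th")]
    lines.extend(render(row, "td") for row in rows[1:])
    lines.append("</table>")
    return "\n".join(lines)


# Hypothetical sample data for illustration.
sample = "name,position\nAnn Lee,Analyst\nBob Ray,R&D Manager\n"
table = csv_to_html_table(sample)
print(table)
```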

 

So feel free to use XML and XSLT—not because XML is superior, just that it happens to be in vogue now. Just know that if the computer industry follows past behavior they’re more than likely going to throw it all away in a few years and start over with the 'next best thing since sliced bread'.

 

Note the reasoned, technical basis on which both Matt and I base our arguments. And what does Rudy understand from all this?

 

whoa, that fabian guy sure hates xml too, doesn't he? is there anything he does like, besides himself?

 

i'm sorry, but that's the attitude that percolates through his writing--all the emphasis, the thinly veiled sarcasm, the you're-all-nuts-and-i'm-not tone that filters through it all. i don't care if the guy's right or not, i cannot stomach him.

 

Regarding nuts: when the shoe fits...

 

 

Posted 9/3/05