Sunday, January 16, 2022


Note: To demonstrate the correctness and stability offered by a sound theoretical foundation (relative to the industry's fad-driven "cookbook" practices), I am re-publishing as "Oldies But Goodies" material from the old site (2000-06), so that you can judge for yourself how well my arguments hold up and whether the industry has progressed beyond the misconceptions those arguments were intended to dispel. I may revise, break into parts, and/or add comments and/or references, which I enclose in square brackets.

A 2001 review of my third book triggered an exchange on SlashDot. This six-part series comprises my debunking at the time of both the review and the exchange, in the order of the original publication (which is slightly out of chronological order).
Part 1: Clarifications on a Review of My Book Part 1
Part 2: Slashing a SlashDot Exchange Part 1
Part 3: Slashing a SlashDot Exchange Part 2
Part 4: Slashing a SlashDot Exchange Part 3
Part 5: Slashing a SlashDot Exchange Part 4
Part 6: Clarifications on a Review of My Book Part 2

Slashing a SlashDot Exchange Part 3

(first published in 2001)

The comments debunked below are by the W3C XML Query Working Group's Activity Lead and by an academic. [The exchange took place when XML DBMS was one of the hottest fads, and it was still being hyped as late as 2013. Consider the comments in this context: where are XML DBMSs today?]

“The article seems to say ‘I don’t like SQL and I don’t like XML and I think XML Query is about merging them although I don’t understand it very well, so the people working on XML Query must be stupid, and in any case it’s easier to attack people than understand a specification.’ Perhaps that’s unfair, but it’s clear to me that the writer is a little fuzzy on the design goals of XML and also on the focus of SQL development over the past 10 or 15 years. In both cases the story is about interoperability.”






Wow! I do not understand??? [Upside down and backwards, but I leave it to the reader to judge.] What is one to make of [somebody claiming to be a data management leader] whose conclusion from reading my article reduces to "Fabian Pascal doesn’t like SQL and doesn’t like XML", completely ignoring the evidence and reasoning? He seems unable to differentiate between reasoned substantive criticism and personal dislike and attacks.

Where in my article did I say anything about "merging XML and SQL" and what exactly does that mean? [XML is hierarchic and, thus, grounded in the directed graph data model (GDM), while SQL is -- at least in intention -- a relational data sublanguage grounded in the relational data model (RDM), a superior alternative to GDM for "non-network applications" -- what sense does it make to merge them?] Before we come up with any specifications aren’t we supposed to be clear about what exactly it is that we are specifying and for what purposes?

Perhaps that’s unfair? There I was, deploring the absence of logic and math as a sound scientific foundation for database management, and here he comes, throwing around vague buzzwords like interoperability and defending specifications that ended up having to replace their core structure -- the document -- [with something that is not even well defined], and I am the fuzzy one? Interoperability is one of those terms that, together with middleware (see below), integration and so on, are to the IT industry what motherhood and apple pie are to US culture: everybody's objectives precisely because they mean nothing in the abstract and sound good in marketing. An agreed file format for data exchange and a general database theory both serve interoperability in some sense, but that is [neither here nor there].

“If you look at the XML Query Home Page [] you’ll see approximately two dozen implementations of the XML Query draft, including a number of open source ones. If you look at the public mailing list for comments, you’ll see we received over 1100 detailed technical comments at the last public review. So there’s a lot of interest in this work. Why is that? One reason is that, like Web services and SOAP, XML Query is able to replace a lot of proprietary and hard-to-maintain middleware. Another reason is that for the first time we’ll have a standard way to search over multiple kinds of data source.”

Given the state of foundation knowledge in the industry -- of which this exchange is, sadly, representative -- great interest in the industry's latest fad is not an indicator of correctness or usefulness; I would even venture to argue that, more often than not, just the opposite is true. The IT industry and media operate like the fashion industry, hyping one ad-hoc fad after another, each lacking scientific soundness, and inducing uneducated users to adopt or "be left behind". This explains the rush to draft XML/XQuery specifications without a proper understanding of data fundamentals and with disregard for theory [with predictable consequences, as the fate of XML DBMS proves].

It’s never been clear what middleware is (now, there’s a fuzzy term), but I do know quite well what data exchange and database management are. XQuery is the sort of thing you would come up with if you did not understand the difference between them, between syntax and semantics, or between logical and physical structure. Only then would you talk about "searching over multiple kinds of data sources" -- another vague promise with the devil in the details of practical implementation.

“Don[ald Chamberlin, an author of SQL] is the primary editor of the XQuery language, but the technical decisions reflected in the specification are a result of collaboration, and are agreed on by a consensus process by a much larger number of participants. The goal is to make a language that people agree to implement and to use. With support announced by Microsoft, Oracle, IBM, BEA and others (see Web page mentioned above) and judging by the public interest, I think it’s fair to say that’s going to happen.

It’s pretty rare to see a large complex system that everyone is happy with. It’s actually pretty rare to see a small system that everyone is happy with. There are people who are unhappy with some features in the Unix cat program, but it’s better to have cat in every Unix system than to have millions of shell scripts break on systems where it’s missing! The trick, then, is often to include features that will lead to massively wider adoption, even if some people would rather be without them.

Then we have (as part of W3C Process []) a public call for implementations so that we can test to see how confident we are that all the major features can be implemented compatibly (i.e. interoperably) in multiple independent implementations. Features that were not implemented get removed before the specifications are final.”

[These comments describe fundamental flaws of the process, rather than address my criticism of their implications. This is exactly how SQL was designed, hence the title of my article.] As I have argued so many times, it is hard enough to design a proper language by committee, let alone by a committee of vendors who already have different implementations in the market. Standards committees are, indeed, political (and essentially commercial by virtue of their membership) entities, all the more reason to insist on reliance on a scientific basis, whenever available -- no consensus or public input can substitute for that. In my article I provided evidence that Chamberlin’s understanding of data and relational fundamentals is poor; just as with the ANSI/ISO SQL committees, that is likely true of the W3C XQuery working group too, which its output certainly confirms.

“Is XML Query a waste of time? Is XML evil? Is SQL evil? A lot of people think otherwise, and some of them are pretty smart, so if you are concerned, take the time to read the specs and decide for yourself.”

Evil? How infantile. Waste of time? Worse: it is technological regression. The last Chamberlin quote in my article makes clear his poor grasp of the RDM -- he even seems unaware that Codd invented it precisely to avoid the old deficiencies of GDM that XML/XQuery re-introduces.

“In a relational database, the rows of a table are not considered to have an ordering other than the orderings that can be derived from their values. XML documents, on the other hand, have an intrinsic order that can be important to their meaning and cannot be derived from data values. This has several implications for the design of a query language. It means that queries must at least provide an option in which the original order of elements is preserved in the query result. It means that facilities are needed to search for objects on the basis of their order, as in ‘Find the fifth red object’ or ‘Find objects that occur after this one and before that one.’ It also means that we need facilities to impose an order on sequences of objects, possibly at several levels of a hierarchy. The importance of order in XML contrasts sharply with the absence of intrinsic order in the relational data model.”

You can read specs till kingdom come, but without foundation knowledge you won't be able to assess them or draw meaningful conclusions. While I never said in my article, as is claimed, that the W3C members are stupid -- I only inferred their knowledge from their output and pronouncements -- based on this exchange [e.g., can you tell how many fallacies are squeezed in just the last paragraph?], it is very difficult to withstand the temptation to reconsider.
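
The ordering claim in the last quote can be made concrete with a small sketch. It is an illustration only, not part of the original article or exchange, and it rests on one assumption: that document order is recorded as an ordinary value in a hypothetical `position` attribute. The names `Obj`, `OBJECTS`, `nth_of_color` and `between` are likewise invented for the example. Under that assumption, both "find the fifth red object" and "find objects that occur after this one and before that one" reduce to comparisons of values over an unordered set of tuples.

```python
from typing import FrozenSet, NamedTuple, Optional, Set


class Obj(NamedTuple):
    position: int  # hypothetical attribute: document order recorded as a value (assumed unique)
    name: str
    color: str


# A relation is modeled here as an unordered set of tuples.
OBJECTS: FrozenSet[Obj] = frozenset({
    Obj(1, "a", "red"), Obj(2, "b", "blue"), Obj(3, "c", "red"),
    Obj(4, "d", "red"), Obj(5, "e", "green"), Obj(6, "f", "red"),
    Obj(7, "g", "blue"), Obj(8, "h", "red"), Obj(9, "i", "red"),
})


def nth_of_color(relation: FrozenSet[Obj], color: str, n: int) -> Optional[Obj]:
    """Return the n-th (1-based) object of the given color, or None if there is none.

    The n-th object is the one preceded by exactly n - 1 objects of the same
    color, a condition stated purely in terms of position values.
    """
    same_color = {t for t in relation if t.color == color}
    for t in same_color:
        earlier = sum(1 for u in same_color if u.position < t.position)
        if earlier == n - 1:
            return t
    return None


def between(relation: FrozenSet[Obj], after: int, before: int) -> Set[Obj]:
    """Return the objects whose position lies strictly between the two given positions."""
    return {t for t in relation if after < t.position < before}


fifth_red = nth_of_color(OBJECTS, "red", 5)
middle = between(OBJECTS, 4, 8)
```

Neither function relies on the order in which the set happens to be stored or iterated; whether this representation suits a particular application is a separate design question.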
