Sunday, January 16, 2022

OBG: No Understanding Without Foundation Knowledge Part 3 -- Debunking an Online Exchange 2

Note: To demonstrate the correctness and stability offered by a sound theoretical foundation (relative to the industry's fad-driven "cookbook" practices), I am re-publishing as "Oldies But Goodies" material from the old (2000-06) DBDebunk.com, so that you can judge for yourself how well my arguments hold up and whether the industry has progressed beyond the misconceptions those arguments were intended to dispel. I may revise, break into parts, and/or add comments and/or references, which I enclose in square brackets.

In Part 1 I debunked a review of my third book, which triggered an exchange @SlashDot.org critical of my article If You Liked SQL, You'll Love XQuery. Part 2 was the first part of a debunking of that exchange, to be completed in this part and the forthcoming Part 4.

Slashing a SlashDot Exchange Part 3

(first published in 2001 @DBAzine.com)

The following comments being debunked are by the W3C XML Query Working Group's Activity Lead and by an academic. [The exchange took place when XML DBMS was one of the hottest fads (as late as 2013). Consider the comments in this context: where are XML DBMSs today?]

“The article seems to say ‘I don’t like SQL and I don’t like XML and I think XML Query is about merging them although I don’t understand it very well, so the people working on XML Query must be stupid, and in any case it’s easier to attack people than understand a specification.’ Perhaps that’s unfair, but it’s clear to me that the writer is a little fuzzy on the design goals of XML and also on the focus of SQL development over the past 10 or 15 years. In both cases the story is about interoperability.”

Wow! I do not understand??? [Upside down and backwards, but I leave it to the reader to judge.] What is one to make of [somebody claiming to be a data management leader] whose conclusion from reading my article reduces to "Fabian Pascal doesn’t like SQL and doesn’t like XML", completely ignoring the evidence and reasoning? He seems unable to differentiate between reasoned substantive criticism and personal dislike and attacks.

Where in my article did I say anything about "merging XML and SQL", and what exactly does that mean? [XML is hierarchic and, thus, grounded in the directed graph data model (GDM), while SQL is -- at least in intention -- a relational data sublanguage grounded in the relational data model (RDM), a superior alternative to GDM for "non-network applications" -- what sense does it make to merge them?] Before we come up with any specifications, aren’t we supposed to be clear about what exactly it is that we are specifying and for what purposes?

Perhaps that’s unfair? There I was, deploring the lack of logic and math as a sound scientific foundation for database management, and here he comes, throwing around vague buzzwords like interoperability and defending specifications that ended up having to replace their core structure -- the document -- [with something that is not even well defined], and I am the fuzzy one? Interoperability is one of those terms that, together with middleware (see below), integration and so on, are to the IT industry what motherhood and apple pie are to US culture: everybody's objectives precisely because they mean nothing in the abstract and sound good in marketing. An agreed file format for data exchange and a general database theory both serve interoperability in some sense, but that is [neither here nor there].

“If you look at the XML Query Home Page [w3.org] you’ll see approximately two dozen implementations of the XML Query draft, including a number of open source ones. If you look at the public mailing list for comments, you’ll see we received over 1100 detailed technical comments at the last public review. So there’s a lot of interest in this work. Why is that? One reason is that, like Web services and SOAP, XML Query is able to replace a lot of proprietary and hard-to-maintain middleware. Another reason is that for the first time we’ll have a standard way to search over multiple kinds of data source.”

Given the state of foundation knowledge in the industry -- of which this exchange is, sadly, representative -- great interest in the industry's latest fad is not an indicator of correctness or usefulness; I would even venture to argue that, more often than not, just the opposite is true. The IT industry and media operate like the fashion industry, hyping one ad-hoc fad after another, each lacking scientific soundness, and inducing uneducated users to adopt it or "be left behind". This explains the rush to draft XML/XQuery specifications without a proper understanding of data fundamentals and with disregard for theory [with predictable consequences, as the fate of XML DBMS proves].

It’s never been clear what middleware is (now, there’s a fuzzy term), but I do know quite well what data exchange and database management are. XQuery is the sort of thing you would come up with if you did not understand the difference between them, between syntax and semantics, or between logical and physical structure. Absent that understanding, you would talk about "searching over multiple kinds of data sources" -- another vague promise with the devil in the details of practical implementation.

“Don[ald Chamberlin, an author of SQL] is the primary editor of the XQuery language, but the technical decisions reflected in the specification are a result of collaboration, and are agreed on by a consensus process by a much larger number of participants. The goal is to make a language that people agree to implement and to use. With support announced by Microsoft, Oracle, IBM, BEA and others (see Web page mentioned above) and judging by the public interest, I think it’s fair to say that’s going to happen.

It’s pretty rare to see a large complex system that everyone is happy with. It’s actually pretty rare to see a small system that everyone is happy with. There are people who are unhappy with some features in the Unix cat program, but it’s better to have cat in every Unix system than to have millions of shell scripts break on systems where it’s missing! The trick, then, is often to include features that will lead to massively wider adoption, even if some people would rather be without them.

Then we have (as part of W3C Process [w3.org]) a public call for implementations so that we can test to see how confident we are that all the major features can be implemented compatibly (i.e. interoperably) in multiple independent implementations. Features that were not implemented get removed before the specifications are final.”

[These comments describe fundamental flaws of the process, rather than address my criticism of their implications. This is exactly how SQL was designed, hence the title of my article.] As I have argued so many times, it is hard enough to design a proper language by committee, let alone by a committee of vendors who already have different implementations in the market. Standards committees are, indeed, political (and, by virtue of their membership, essentially commercial) entities -- all the more reason to insist on reliance on a scientific basis whenever one is available; no consensus or public input can substitute for that. In my article I provided evidence that Chamberlin’s understanding of data and relational fundamentals is poor; just as with the ANSI/ISO SQL committees, that is likely true of the W3C XQuery working group too, which its output surely confirms.

“Is XML Query a waste of time? Is XML evil? Is SQL evil? A lot of people think otherwise, and some of them are pretty smart, so if you are concerned, take the time to read the specs and decide for yourself.”

Evil? How infantile. Waste of time? Worse: it is technological regression. The last Chamberlin quote in my article makes clear his poor grasp of the RDM -- he even seems unaware that Codd invented it precisely to avoid the old deficiencies of the GDM that XML/XQuery re-introduces.

“In a relational database, the rows of a table are not considered to have an ordering other than the orderings that can be derived from their values. XML documents, on the other hand, have an intrinsic order that can be important to their meaning and cannot be derived from data values. This has several implications for the design of a query language. It means that queries must at least provide an option in which the original order of elements is preserved in the query result. It means that facilities are needed to search for objects on the basis of their order, as in ‘Find the fifth red object’ or ‘Find objects that occur after this one and before that one.’ It also means that we need facilities to impose an order on sequences of objects, possibly at several levels of a hierarchy. The importance of order in XML contrasts sharply with the absence of intrinsic order in the relational data model.”

You can read specs till kingdom come, but without foundation knowledge you won't be able to assess them or draw meaningful conclusions. I never said in my article, as is claimed, that the W3C members are stupid -- I only inferred their knowledge from their output and pronouncements -- but based on this exchange [e.g., can you tell how many fallacies are squeezed into just the last paragraph?], it is very difficult to resist the temptation to reconsider.




Saturday, January 8, 2022

OBG: No Understanding Without Foundation Knowledge Part 2 -- Debunking an Online Exchange 1

Note: To demonstrate the soundness and stability conferred by a sound theoretical foundation (relative to the industry's fad-driven "cookbook" practices), I am re-publishing as "Oldies But Goodies" material from the old (2000-06) DBDebunk.com, so that you can judge for yourself how well my arguments hold up and whether the industry has progressed beyond the misconceptions those arguments were intended to dispel. In re-publishing I may revise, break into or merge parts and/or add comments and/or references that I enclose in square brackets.

In Part 1 I debunked a review of my third book, which had triggered an exchange @SlashDot.org in reaction to an article of mine @DBAzine.com. This and the forthcoming Parts 3 and 4 are my debunkings of that exchange, in which a W3C XML committee member and an academic -- who ought to have known better -- participated.

Saturday, January 1, 2022

Schema and Performance: Never the Twain Shall Meet

One of the core objectives of this site (and my work) has been to demonstrate that there will not be progress in data management as long as the industry and trade media require and promote exclusively (mainly tool) experience in the absence of foundation knowledge. I have published and analyzed ample evidence that relational language and terminology are used without grasping what they actually mean -- a good way to gauge lack of foundation knowledge.

Recently I posted a four-part series titled "Nobody Understands the Relational Model" showing that even a practitioner steeped in the RDM does not really understand it. Consider now a practitioner's mistake at the beginning of his career -- "a bad database schema and what it did to system performance" -- which, he claims, belatedly taught him a lesson. Hhhhmmm, did it, really?

Friday, December 17, 2021

OBG: No Understanding Without Foundation Knowledge Part 1 -- Debunking a Book Review

Note: To demonstrate the correctness and stability offered by a sound theoretical foundation (relative to the industry's fad-driven "cookbook" practices), I am re-publishing as "Oldies But Goodies" material from the old (2000-06) DBDebunk.com, so that you can judge for yourself how well my arguments hold up and whether the industry has progressed beyond the misconceptions those arguments were intended to dispel. I may revise, break into parts, and/or add comments and/or references, which I enclose in square brackets.

The following was my debunking of a review of my third book (originally published on 01/14/2001)

“Many of us ... do not think that harmony is the great goal, or unity or peacefulness, [and] actually quite like hard questions for their own sake, and enjoy ... the life of the mind. To the question of how to live, the answer is "by disagreement."” --Christopher Hitchens

Let me say, first and foremost, that as the subtitle of the book -- A REFERENCE FOR THE THINKING PRACTITIONER -- indicates, it is targeted at the minority of practitioners who think clearly, independently and critically. It should not be a surprise, then, that those not belonging to that (alas, very small) target audience don't see its practical value. As I said so many times, if my work gained mass appeal, I would wonder what I was doing wrong. This is the sad reality, whether we like it or not. In fact, to be consistent I will go one step further: I don't assume that positive reviews are any better than negative ones -- they are frequently grounded in as faulty reasoning and/or ignorance as the critiques.

Let me also make clear that I do not place all of the blame on the individual database practitioners or users. Rather, the problems are rooted in a systemic, much more profound societal and business culture that fails to instill and encourage foundation knowledge and independent, critical thinking, and that not only does not reward them, but actually punishes them. This is true to a degree in all societies, of course, but in the US the problem is much more acute (there can hardly be a better demonstration of the horrendous implications of this than how the election was covered, perceived and accepted by most of the press and public). [I wrote this prior to the last two elections -- I leave it to the reader to judge the steepness of the subsequent regress.]

Saturday, December 11, 2021

Nobody Understands the Relational Model: Semantics, Closure and Database Correctness Part 4

with David McGoveran

(Title inspired by Richard Feynman)

In Parts 1 and 2 we provided some clarifications following a discussion on LinkedIn about our contention that, conventional wisdom notwithstanding, database relations -- distinct from mathematical relations -- are by definition not just in 1NF, but also in 5NF, as a consequence of which the RA, as currently defined for 1NF closure, produces what the industry calls "update anomalies" and, thus, is not a proper algebra. In Part 3 we used that information to debunk some leftover misunderstandings in the discussion.

We conclude in Part 4 with comments on a private exchange that followed the public one on LinkedIn regarding the difference between the McGoveran (DMG) and Date and Darwen's (TTM) interpretations of the RDM, which can be summarized as follows:

Sunday, December 5, 2021

TYFK: How Not to Explain the Relational Model

Note: Each "Test Your Foundation Knowledge" post presents one or more misconceptions about data fundamentals. To test your knowledge, first try to detect them, then proceed to read our debunking, reflecting the current understanding of the RDM, distinct from whatever has passed for it in the industry to date. If there isn't a match, you can review references which explain and correct the misconceptions. You can acquire further knowledge by checking out our POSTS, BOOKS, PAPERS, LINKS (or, better, organize one of our on-site SEMINARS, which can be customized to specific needs).

“The key idea is "Parent-Child" relationship. Entities ~ Relations ~ Tables (tilde stands for "more or less like"). Concept of a Table resonates with most of the people just as everybody intuitively grasps a concept of "rows and columns” but might struggle with "tuples and attributes". Explain relations and relationships, 1:1, 1:N, N:N etc. Explain rationale for this way of collecting and storing data, touch upon data normalization, and tell a few anecdotes about cost of storage back in 1970 and Y2K problem it have caused; add that we have inadvertently created Y10K problem while fixing it (not exactly true but not wrong either). Show an ERD diagram, trace the relationships, introduce SQL, maybe run a few simple SELECT queries to help your listeners visualize it, including equijoin and ORDER BY. Save other JOIN types, data types and other, more advanced topics, and for the next encounter.”
--Quora.com

An excellent example that validates my claim of lack of foundation knowledge in the industry: most "explainers" of the RDM have acquired relational jargon, but do not know or understand the model at all.
