Wednesday, September 26, 2012

To Laugh or Cry?

To Laugh or Cry? items are specifically selected as lost causes (usually, the problems lie with either the author of the item, or with those who respond to it, or both). For that reason I normally do not comment on them. This week's piece is somewhat unusual in that the target is not the author, but rather a vendor. The following item was brought to my attention by Matt Rogish.

Diego Basch, I’ll Give MongoDB Another Try. In Ten Years.

How many companies do you know that are so refreshingly honest as to tell their customers that they simplify code, minimize bugs, speed up time-to-market and achieve performance gains by ... losing data? And that, to boot, blame those customers for having chosen their product incorrectly? Some commenters even agree.

Taking the company's justification to its logical conclusion, not recording anything in the database (1) yields unbeatable performance, (2) requires no code, (3) reduces bugs to zero and (4) allows bringing 1.0 to market simultaneously with the product idea. Given the ease with which data practitioners/users are willing to undertake tasks that belong in the DBMS, and their belief that they can outdo it, I wonder why vendors don't do just that.

Matt comments:
I actually got in an argument with the CEO/co-founder/lead developer of 10gen, the makers of MongoDB, about how MongoDB makes all the wrong trade-offs for all the wrong reasons. I think the conclusion of his argument (this was probably two or more years ago now, so the details are a bit fuzzy) was that MongoDB was all about "massive performance" and quick prototyping and if you wanted integrity and all that jazz you should use a different product.  He's right, in that if you want all the DBMS features we've taken for granted for the last thirty+ years (ACID, etc.) we shouldn't use MongoDB.

The problem is new developers entering the field don't know their history and are doomed to repeat it.

For quick prototyping, I can look at a NoSQL product sideways and kind of see how, sure, it may be easier to throw something quick and dirty against the wall to see what sticks with Mongo vs, say, a given SQL DBMS.
Points arising:

First, the important question to ask is how prominent the CEO's argument is in the company's promotional/sales material, relative to the claim that the product is a DBMS that manages databases.

Second, those familiar with my work know that lack of knowledge of data fundamentals and of the history of the field has been my core peeve for years, and that I blame it on the educational system eschewing education for training. The consequence is that young developers entering the field reinvent square wheels that were discarded decades ago. In commenting on my post Object Orientation, Logic and Database Management (a comment to which I responded), Yiorgos Adamopoulos brought to my attention an old article. Here's the abstract:
We address the disparity between the intellectual preparation that is expected in traditional engineering as compared to that accepted in software engineering. Any beginning student of a traditional engineering discipline realizes that their first courses will be steeped in mathematics–calculus and physics in particular. These foundational tools underlie the practical aspects of their future career. At best a software engineering student will begin with a similar program; but such courses are the stuff of software applications, not of the business of software per se. We examine the history of traditional engineering and the corresponding transformation of educational expectations from a shop-culture to a school-culture. From the origins of symbolic algebra in the late 16th century, through calculus and mathematical physics, the basic sciences that support modern engineering were developed. Shadowing this progress, the educational establishment was struggling with how–or whether–to move the new theory into practice. We all know how that struggle turned out. We argue that a similar pattern must occur in software development, not because of some academic whim but because the complexity of software demands that we expect higher standards. The critical problem in modern software is predictability: we need to know what to expect when we run a program or import software from the Net. Such expectations are ill-served by current techniques. At best, programs are conjectures, free of justifications and supplied "as-is." In this day of the virus such a cavalier attitude is indefensible. We will outline some mathematical foundations for software and illustrate their application to the interplay between program and specification. Many of these ideas are the result of early 20th century philosophers; ideas that developed into a constructive logic, and from there to a mathematical foundation for programming languages.
The process that moved traditional engineering from an experience-based craft to a science-based discipline was a multi-century revolution. Thomas Kuhn’s “The Structure of Scientific Revolutions” explored a similar process in the advancement science. In the final section we review his arguments for science revolutions; adapt them for traditional engineering; and then show how our proposed revolution in the engineering of software falls within this framework.
--John R. Allen, Whither Software Engineering
One consequence is that IT practitioners not only don't know the foundations of their field, but also fail, and even refuse, to appreciate them when they are brought to their attention. As one of them expressed it (paraphrased): "who cares if the DBMS follows mathematics when it follows the rules".

Third, this is not just a matter of missing features due to product immaturity. It is a fundamental failure to understand what data management really is. The industry gets away with it because practitioners suffer from the same lack of knowledge. This is a systemic problem which cannot be addressed by individuals or by this or that company.
