Friday, July 10, 2020

Oldies But Goodies: Skyscrapers with Shack Foundations

Ed. Note: I am re-publishing some of the posts (slightly refined) from the old dbdebunk (2000-06) to demonstrate how well they hold up to this day, and how not much has improved in the industry -- quite the opposite. The following is the first editorial with which I started dbdebunk in 2000.

June 4, 2000

“Well, it's really a judgment call and I think a lot of experience comes into it. It's a little bit like building a shack. Say you want to build a skyscraper, and you started out building a shack and you just keep trying to add onto it. After a while you have this severe structural problem ... So there is a fallacy to the build-upon-a-simple structure approach. Sometimes you get up to three stories and you have to do some major structural changes, and I just accept that.”
--Wayne Ratliff, developer of dBase
“Client Servers were a tremendous mistake. And we are sorry that we sold it to you. Instead of applications running on the desktop and data sitting on the server, everything will be Internet based. The only things running on the desktop will be a browser and a word processor. What people want is simple, inexpensive hardware that functions as a window on to the Net. The PC was ludicrously complex with stacks of manuals, helplines and IT support needed to make it function. Client server was supposed to alleviate this problem, but it was a step in the wrong direction. We are paying through the nose to be ignorant.”
--Larry Ellison, CEO, Oracle Corp.


Fads and Cookbooks

The IT industry is much like the fashion industry: it is driven by fads. And more often than not it profits from the accelerated obsolescence of fads. DBMS vendors (Oracle's CEO in particular) were "wrong" more than once before. But it's the users, not the vendors, who paid through the nose, because the industry, with help from the trade media, hypes "new" fads -- which frequently are nothing but old fads relabeled -- with disregard for soundness and potential deficiencies. The Internet and the browser are as much a database management panacea as client/server, object orientation, "universal" and multidimensional DBMSs, data warehouses, and data mining were before them, which were preached with equal fervor.

The fact is that sound database technology and practices are prerequisites for correctness in data management, whether Internet-based or not. Sadly, however, the database field is in disarray. While this is, to a degree, true of computing in general, in the database field the problems are so acute that -- claims to the contrary notwithstanding -- knowledge and, therefore, technology are actually regressing!

Even a cursory inspection of problems encountered in database practice reveals that most are due to the persistent failure by both DBMS vendors and users -- including DBAs, application developers, and managers at both vendor and user organizations -- to educate themselves and rely on a sound foundation. Indeed, it is lack of proper education that makes fads and accelerating obsolescence possible in the first place! As Chris Date explains in the Foreword to my book:

“SQL [DBMS] deficiencies are, it seems to me, directly due to the widespread lack of understanding (not least on the part of vendors), of fundamental database principles. Certainly it is undeniable that they flout those principles in numerous ways. And the practical consequences are all too obvious: First, users must understand where the deficiencies lie; second, they have to understand just why they are deficiencies; third, they have to understand how to work around them; and fourth, they have to devote time and effort in persuading the vendors to remedy them. The trouble is, of course, users too tend to be unaware of those same fundamental principles and, hence, find themselves unable to carry out their side of the "contract" (a "contract" that should not have been allowed, or agreed to in the first place, of course). It's a vicious cycle. What is more, this sad state of affairs is not likely to change, given the apparent lack of interest on the part of the trade press -- itself ignorant of those same principles -- in trying to improve matters.”
Consider, for example, the following two cases, one raised by a novice:

“I need to store 40 pieces of unrelated information. Is it better to create [one] table w[ith one] record [and] 40 fields, or create [one] table w[ith] 40 records [and one] field?”

The other is raised by a consultant assessing a database constructed, purportedly, by experienced professionals:

“... finished testing a -- gasp, choke -- COBOL program for a software company whose main product is a well-known government contract accounting system ... Now th[e expletive deleted] database ... is replete with repeating groups, redundant fields, etc. On top of all that, because it is one of the central files to the entire system, there are literally hundreds of rules and relationships, all of which must be enforced by the dozens of subprograms that access it. I found so many violations of so many of these rules in this new subprogram, that I filled five single-spaced pages with comments and suggestions. And I probably missed [the more obscure problems]. Several [such problems], perhaps.”

These cases are not exceptions to how database work is approached these days, and with what results, but the rule. What should be obvious is that:
  • These are database (not application!) problems, and fundamental at that;
  • They are general, not specific to this or that DBMS or database;
  • Their consequences are hardly theoretical and quite severe;
  • No amount of expertise in any DBMS product, development tool, or hardware platform can, in itself, resolve them.

Yet it is practically impossible to get the attention of database practitioners for anything other than product-specific cookbooks. Examples:

“I polled our [user group] membership last night about future topics. For the foreseeable future, we prefer to focus on Microsoft SQL Server 2000 topics exclusively.”

“I don't disagree with your statement that the "lack of attention to database [foundation] issues can cause horrendous problems". However, that ... is not what the user group is about. Yes, database design and use is definitely a part of our world, but our focus is on Sybase's development tools, such as PowerBuilder, PowerJ, Enterprise Application Server, etc.”

Education vs. Training

The fad-driven, tool-focused, cookbook approach to data management is due in large part to the business culture in general, and the way in which IT practitioners are inducted into the field in particular. A vast majority are self-taught and start with some specific DBMS software (e.g. Oracle, Access, SQL Server) and tools (frequently imposed on them by their employer). Having not been exposed to general database concepts, principles and methods, they are either unaware thereof, assume that they are acquired implicitly by learning or working with the software or, most commonly, deem them "theory" and, therefore, without practical value. These misconceptions are exacerbated by a growing generation of Internet practitioners who know little beyond HTML, Java and XML (not even DBMS or tool software), and who, therefore, think that's all there is to know.

We should not expect anything different. The sole technical qualification for practically all positions is experience with some DBMS software and development tools (mainly programming) on specific platforms (hardware and operating systems). Nothing else. Examples:

“Title: Senior Database Architect
Qualifications: Minimum of 3 years with Oracle on Solaris. Working knowledge of Tuxedo. Use of database design tools such as ER/Win. Perl and scripting. Familiarity with Oracle 8, Oracle Parallel Server, Sun Clusters, C. At least 3 years of relevant experience.”

“Title: Database Analyst III
Experience: Five to nine years developing applications using a major industry-standard relational database system (e.g., Oracle, Sybase, Ingres). Necessary Skills: Oracle DBMS Server and Oracle Application (Web) Server on Windows NT Server; Designer 2000; Developer 2000; Oracle Reports; Oracle Graphics; and PL/SQL. Also a plus: experience with UNIX, VMS, SQR, HTML, JAMA, or JavaScript.”
Not only isn't foundational knowledge -- distinct from sheer experience with tools -- a job requirement, but more often than not it is actually a liability. Functions such as requirements analysis and database design are bundled together with database administration, application development and physical implementation and assigned to the (mythical) position of "programmer/analyst", without realizing that they require different skill and knowledge sets which are rarely found in one person, and which often interfere with one another (particularly with currently flawed DBMS products). If you wanted to build a house, would you hire a building contractor to design it?

In fact, under industry pressure there is little database education to be had. Product-specific training reigns supreme and even academic computer science programs are becoming increasingly vocational in character. Examples:

“We are very interested in additional Oracle instructors, if that is something you can teach.”

“Does (the course) cover accessing a database via CGI, i.e. VB, Java, Perl, C++ access to SQL Server or Access DB? We're a CS dept, so not so interested in the user-developer side of things.”
An analogy can serve to drive the perils of this state of affairs home. Suppose you must select a personal physician and have two candidates: one educated in, among other things, some anatomy, biology and chemistry, and one trained in a "cookbook" approach -- identifying symptoms from a list and matching treatments from another. Chances are you will opt with the majority for the former, and for a very good reason: in the absence of knowledge and understanding of health fundamentals, serious problems can be expected. This is generally clear in most applied science fields except, it seems, database management.

Is there any wonder that practitioners, seasoned ones included, can't offer a useful definition of a database? That neither DBMS designers, nor technically proficient users have heard of crucial concepts such as data independence? That many believe that not only should duplicates not be prohibited, but that they are actually essential?

The consequences are visible all over the business world and are horrendous. A vast majority of database products and practices are riddled with flaws and unnecessary complications. Examples:

“You might ask what is wrong? Well, it is a client/server application, using a Sybase database (SQL Anywhere). The database server has a single login user DBA--using the default password. Every application user connects to the database via this login level, and security is handled by the front end - despite the fact that any semi-aware user could use MS Access to destroy any data. There are also about 300 tables in the database, with no indexes! Agreed there are primary key indexes created automatically by the database, but still... The front end is Visual Basic, which for me is OK, but there are at least three different data access methodologies, from ODBC-API to the latest ADO. But what is killing me, is that I seem to have been hired as a "bug-fixer", to me different than an engineer. They are in a position where release schedules are forcing a continual maintenance mode, rather than an admittedly necessary rebuild of some components.”

“In the short term you have two options: a) disable referential integrity checking and make the change (not recommended unless you're willing to assume total responsibility for the data consistency checking yourself; and you have to ensure you have exclusive access to the database when you're doing this); b) use our [DBMS's] triggers and stored procedures to implement the referential integrity procedurally.”
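
For contrast with option (b) in the advice quoted above, here is a minimal sketch of what declarative referential integrity looks like when the DBMS itself enforces it. It uses Python's standard `sqlite3` module purely as a convenient stand-in, not the product discussed in the quote, and the `department` and `employee` tables are hypothetical names chosen for illustration.

```python
import sqlite3

# In-memory database; SQLite is only a stand-in for "some SQL DBMS".
conn = sqlite3.connect(":memory:")

# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical tables: the constraint is declared once, in the database,
# instead of being re-coded in every program that touches the data.
conn.execute(
    "CREATE TABLE department ("
    " dept_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " dept_id INTEGER NOT NULL REFERENCES department(dept_id))"
)

conn.execute("INSERT INTO department VALUES (1, 'Sales')")
conn.execute("INSERT INTO employee VALUES (10, 'Alice', 1)")


def try_insert_orphan(connection):
    """Try to add an employee whose department does not exist.

    Returns True if the DBMS accepted the row, False if it rejected it.
    """
    try:
        connection.execute("INSERT INTO employee VALUES (11, 'Bob', 99)")
        return True
    except sqlite3.IntegrityError:
        return False


orphan_accepted = try_insert_orphan(conn)
print("orphan row accepted:", orphan_accepted)
```

In this sketch the inconsistent row is rejected by the DBMS no matter which application attempts the insert, whereas a procedural approach makes each trigger or program responsible for repeating the check.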

A Vicious Cycle

Correcting this sad state of affairs is a nontrivial proposition, because it is a deep-seated cultural and systemic vicious cycle that is hard to break. It is much easier (and more profitable) to go with the flow than to push uphill against it. The vast majority of trade magazines, books, web sites, conferences, and education programs ignore fundamentals, rely exclusively on vendor sources in current vogue, and focus completely on product-specific cookbooking.

DBMS and tool vendors, database professionals, and users all desire accurate answers from databases. Yet the vast majority are unaware that, per Hugh Darwen:

  • “A database is a set of axioms;
  • The response to a query is a theorem;
  • The process of deriving the theorem from the axioms is a proof;
  • A proof is made by manipulating symbols according to agreed mathematical rules.”
The proof, of course, can only be as sound as the rules implemented by the DBMS are. The DBMS and the database form, thus, a deductive logic system: new facts are derived from facts recorded in the database asserted by users to be true. The derived facts are true (query results are correct) if and only if:
  • The user assertions are true;
  • The implemented derivation rules are logically sound.
Nor is there awareness that correctness means that it is the function of the DBMS to guarantee the logical validity of query results, and the function of data practitioners to ensure semantic consistency by designing databases in accordance with theoretical principles; otherwise, all bets are off. This is guaranteed iff both the DBMS and the database are relational.
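
The deductive view described above can be illustrated with a toy sketch in Python. Everything in it is an assumption made for illustration: the relations, their contents, and the function names are hypothetical, and plain Python sets stand in for relations. It is not a DBMS, only a way to see that a derived fact is exactly as trustworthy as the asserted facts and the derivation rule behind it.

```python
# Base relations: each tuple is a fact asserted by users to be true.
# Hypothetical data, chosen only for illustration.
WORKS_IN = {("Alice", "Sales"), ("Bob", "Research")}        # (employee, department)
LOCATED_IN = {("Sales", "Boston"), ("Research", "Austin")}  # (department, city)


def sound_rule(works_in, located_in):
    """Derive (employee, city) facts by joining on department, then projecting.

    Every derived tuple is a logical consequence of the asserted facts.
    """
    return {
        (emp, city)
        for (emp, dept) in works_in
        for (dept2, city) in located_in
        if dept == dept2
    }


def unsound_rule(works_in, located_in):
    """A deliberately unsound 'rule': pairs every employee with every city."""
    return {(emp, city) for (emp, _) in works_in for (_, city) in located_in}


derived = sound_rule(WORKS_IN, LOCATED_IN)
bogus = unsound_rule(WORKS_IN, LOCATED_IN)

print(sorted(derived))          # facts that follow from the axioms
print(sorted(bogus - derived))  # "facts" that do not follow from the axioms
```

With true assertions and the sound rule, only facts that follow from the recorded ones are derived. The unsound rule also yields tuples such as `("Alice", "Austin")`, which nobody asserted and which do not follow from anything recorded. Likewise, a false assertion in either base relation would propagate into the result of even the sound rule.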

Because they have been socialized into and rewarded for ignoring fundamental principles as "theory" without practical value, practitioners are largely unaware that the tools they employ and the practices they induce cannot guarantee correctness.

As I have amply demonstrated and documented in my writings, a lot of what is being said, written, and especially done in the data management field -- whatever is left of it -- is increasingly confused, misleading, or outright wrong.

January 21, 2001


“I just finished reading your article Skyscrapers With Shack Foundations and I just wanted to express my admiration. It's refreshing to read an article which addresses some of the strange attitudes I, for one, find in the computer world, especially the seemingly absolute lack of understanding at a theory level. I tend to feel like the kid in the old fable "The Emperor's New Clothes" but just bite my lip because I don't want to be labeled a 'crank'. I am one of 'them', a self-taught DB manager. I make no objection to your characterization of self-taught DB workers though, I agree completely.

I came into the DB world with an extensive background in logic and set theory, so I was one of the lucky ones. At the risk of sounding like I'm bragging, I very quickly understood the concepts behind RDBMSs in general to a degree which a lot of experienced professionals seem to lack. Suffice to say that the two horror stories you mention in the article (the new designer with '40 pieces of unrelated data' and the experienced designer trying to clean up a pre-existing mess) are very familiar to me.

I don't want to be tacky by going into personal stuff too much, but I think you will see the relevance to your article here ... I'm searching for work now, after quite a few years working with a FoxPro-based database system. As the system was implemented ... well, 'implemented' might be too kind a word ... 'fragmented' comes closer ... as the company's system existed, there were parts in FoxPro, Excel, Access, older 'complete business systems', etc. What I quickly figured out was that the basic structure was, in all cases, the same. Furthermore, by building up my part of it and relying on SQL as much as possible, it was quite possible to build tools which would be portable across the then-irreducibly-separate parts of the system.

I mean, all of them are just simply implementations of the SQL standard, which is basically just an implementation of set theory. They all work the same as far as actually relating data goes, they differ mostly in the GUI gadgets. (I know, from your point of view this is a crude simplification, but from where I was this was basically a revelation of the simplicity of the underlying logic.) [Ed. Note: A very crude simplification and more, indeed]. Anyway, from that point it was fairly simple for me to integrate different parts of the system to my own, whether I needed to use Visual Basic, FoxPro scripting, Access, or whatever.

So I find myself looking for work, and potential employers are looking at my resume, seeing 'Foxpro' etc., and saying, "well, we don't use FoxPro databases, we use Microsoft SQL Server databases. Sorry." And I find myself trying to explain to them, as nicely as possible, because I'm trying to get a job from them and you're supposed to be nice to people you're trying to get a job from, "they all work basically the same, you see? The interface and tools vary, true, but the underlying dynamic remains the same. Tables of data with relations based on unique fields, etc." This was often met with a blank stare; in one case the response was "well ... yeah ... I guess ... if you've got ODBC drivers or something ... "

Anyway. I don't know if this will leave you nodding your head in agreement, or shaking your head in horror at my hackerish approach, but I was glad to see my own opinions mirrored in your article.”

Fabian Pascal:
You may also want to check out my Against the Grain contrarian column.

Yeah, well, [the admiration is] deserved because [there is a horrendous] price to pay for calling a spade a spade -- not done much in this society -- and even your experience can't touch that [price]. We're in a very small minority, though.

You may be self-taught in db, but your background saved the day. Most practitioners lack precisely that. And, in fact, it's not their fault: this is an anti-intellectual society that not only does not reward independent, critical thinking, but actually punishes it! And it's pretty clear why: do you think there would have been acceptance, for example, of the result of the so-called [Bush-Gore] election, if people reasoned by themselves? Would they have bought [what] corporate marketeers push?

I have many more relevant cases than [is required as evidence and that] I can handle. I wouldn't be saying all the things I say without that evidence.

That's why I [stopped looking for db work]. And even if I had looked, I wouldn't have gotten it. They don't want [thinkers -- that's "not practical" --  but "doers", certainly not critical thinkers -- that's dangerous] ... I refuse to fool myself and to believe that there are many db jobs around that will be [done right].

Clearly we are in agreement. But it does not solve the problem, does it?
