Thursday, February 28, 2013

Site Update

1.
The Quote of the Week was posted on the QUOTES page.

2.
A To Laugh or Cry? item was posted on the LAUGH/CRY page. I usually prefer recent items, but once in a while I come across old ones that I just cannot resist. This one is from 2003 and things have gotten worse.

3.
The link to my latest All Analytics column was posted on the FP ONLINE page.

4.
Links to online exchanges I participated in were posted on the FP ONLINE page. Here's a comment from one:
From my experience, that traditional model has changed as data warehouses are being driven to near real time business intelligence and used as a common repository for disparate systems. The separation between front line systems and data warehouses was due to software and hardware demands could not handle a mixed work load, minimizing costs, plus application products requiring different data stores. The world has moved on. There are DBMS's that can handle mixed work loads with enormous scalability. Application products are becoming broader in business features. Pricing models have changed.
So the whole idea of a distinction between operational databases and data warehouses significant enough to require distinct database technologies, let alone deviations from the relational model, has not exactly held water, has it? Which was pretty predictable.

5.
Roy Hann has posted a comment on my article on SQL redundancy: Fabian Pascal on Ingres

6.
An old blog post links to a page on my old site that is no longer available, so I don't know what the subject was, but it is something that makes sense, for a change: How do we tell truths that might hurt

7.
An interesting read on renting software: You Will Subscribe To, Not Buy Software. Worth reading for some of the negatives of the Cloud which, as is usually the case, are disregarded when a fad is being pushed to extremes.



Sunday, February 24, 2013

Language Redundancy and DBMS Performance: A SQL Story

Recently I came across SQL: The way you write your query matters by Iggy Fernandez, which refers to an old article of mine in which I compared the performance of five PC DBMSs executing seven different syntactic SQL formulations of the same query. I got wildly different timings, ranging from 15 seconds to 2500 seconds!
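
To make the idea of "different syntactic formulations of the same query" concrete, here is a minimal sketch. It does not reproduce the queries, tables or products from the article: the supplier/shipment schema, the data and the four formulations are all hypothetical, and SQLite (via Python's standard library) is used only for convenience. The sketch checks that the formulations are logically equivalent on this data; it makes no claim about their relative timings.

```python
import sqlite3

# Hypothetical schema and data, NOT the tables used in the original article.
SCHEMA = """
CREATE TABLE supplier (
    sno   TEXT PRIMARY KEY,
    sname TEXT NOT NULL
);
CREATE TABLE shipment (
    sno TEXT NOT NULL REFERENCES supplier (sno),
    pno TEXT NOT NULL,
    qty INTEGER NOT NULL,
    PRIMARY KEY (sno, pno)
);
INSERT INTO supplier VALUES ('S1', 'Smith'), ('S2', 'Jones'), ('S3', 'Blake');
INSERT INTO shipment VALUES
    ('S1', 'P1', 300), ('S1', 'P2', 200), ('S2', 'P2', 400), ('S3', 'P1', 100);
"""

# Four ways to ask: "names of suppliers who ship part P2".
FORMULATIONS = {
    "join": """
        SELECT DISTINCT s.sname
        FROM supplier s JOIN shipment sp ON sp.sno = s.sno
        WHERE sp.pno = 'P2'
    """,
    "in_subquery": """
        SELECT sname FROM supplier
        WHERE sno IN (SELECT sno FROM shipment WHERE pno = 'P2')
    """,
    "exists": """
        SELECT sname FROM supplier s
        WHERE EXISTS (SELECT 1 FROM shipment sp
                      WHERE sp.sno = s.sno AND sp.pno = 'P2')
    """,
    "count": """
        SELECT sname FROM supplier s
        WHERE (SELECT COUNT(*) FROM shipment sp
               WHERE sp.sno = s.sno AND sp.pno = 'P2') > 0
    """,
}


def run_all():
    """Run every formulation against a fresh in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    results = {
        name: sorted(row[0] for row in conn.execute(sql))
        for name, sql in FORMULATIONS.items()
    }
    conn.close()
    return results


results = run_all()
for name, names in results.items():
    print(name, names)
```

All four formulations return the same supplier names here. Whether a given DBMS also executes them equally fast is a separate question, and it depends on how well its optimizer recognizes the equivalence.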

Wednesday, February 20, 2013

Site Update

1.
The Quote of the Week was posted on the QUOTES page.

2.
I guess referring this guy to temporal databases and 6NF will not help much, wouldn't you say?

3.
A To Laugh or Cry? item was posted on the LAUGH/CRY page. Check out my last comment in the thread.

4.
A link to an exchange I participated in was posted on the FP ONLINE page.


Sunday, February 17, 2013

Forward to the Past: Application-Managed Data Not a Distributed DBMS Make

Sima Ilic: This may be a slightly unusual request, but I'd be interested to hear your opinion on Google's evolution of distributed databases use/development: from Megastore to Bigtable to Spanner.

I know that there may be only a handful of companies that need (or have resources to use/develop) such things: Google, Amazon, Facebook? Unfortunately, people talk about it like it's the end of relational DBMS (which is plain nonsense) or the next thing that everybody should be looking at or using (the only word that comes to mind is false, but it's not strong enough for marketing/sales people).

Let me tell you what prompted the question.

Thursday, February 14, 2013

New Normalization Paper Available

A new version of paper #2, The Costly Illusion: Normalization, Integrity and Performance is available to order.

The new version is a complete rewrite, with new, better examples and many other improvements to the structure and content.

Details have been posted on the PAPERS page, which also provides pricing and ordering instructions.

Paper purchases, book orders via the site and donations help support this free site. If you find it useful, please contribute.

Thanks.


Site Update

1.
I will give the keynote address at the Northern California Oracle UG spring conference on Wednesday, May 22 at the CarrAmerica conference center in Pleasanton. I will also present one of my "To Laugh or Cry?" sessions. Full details forthcoming on the SCHEDULE page.

2.
The decision whether to link to a debunked article or not is a difficult one. On my old site I did not. On this site I've reversed the policy, but I have not been comfortable with it. Given that currently googling a title is a very simple and efficient way to find the item, I decided to list the title, but opt for selective linking based on the following criteria:
  • the overall substance must be over a certain threshold
  • not all the content is quoted in my debunking
The new policy was applied first to the last post.

3.
The Quote of the Week was posted on the QUOTES page.

4.
I came across Much Ado About Nothing, which does a good job of demonstrating why I consider Hugh Darwen's proposed solution less practical than mine in The Final NULL in the Coffin. In particular, my solution relies entirely on the DBMS, not on users.

5.
A link to a To Laugh or Cry? item was posted on the LAUGH/CRY? page.

6.
Links to several exchanges I participated in were posted on the FP ONLINE page.

7.
I am extremely wary of the very liberal use of the term Data Architect, of the inflation of positions so titled and of the professionals who present themselves as such. Do you know of any architect who designs the building, does the engineering blueprints and serves as building contractor?


Sunday, February 10, 2013

Those Who Don't Know the Past ...

It has long been my contention that a core problem of the database management field is poor foundation knowledge, in which I include familiarity with its history. Consider The Rise and Fall of the Third Normal Form. The title signals a rich debunking target. John D. Cook writes:
The ideas for relational databases were worked out in the 1970’s and the first commercial implementations appeared around 1980. By the 1990’s relational databases were the dominant way to store data. There were some non-relational databases in use, but these were not popular. Hierarchical databases seemed quaint, clinging to pre-relational approaches that had been deemed inferior by the march of progress. Object databases just seemed weird.

Thursday, February 7, 2013

Site Update

1.
Links to exchanges I participated in were posted on the FP ONLINE page. In one of them I deplored the major effort invested in mindlessly migrating from fad to fad, rather than in sound, productive work. One example:

How to move configurable xml data types and data to Oracle database

2.
A new To Laugh or Cry? item was posted on the LAUGH/CRY? page. It has some relevance to each of the other items mentioned in this update.

3.
The Quote of the Week was posted on the QUOTES page. It is a comment on Iggy Fernandez's blog post, a link to which I posted last week on the FP ONLINE page and which I recommend reading.

4.
The author of Hipsters hacking on PostgreSQL writes that, on the one hand, PostgreSQL was designed to be "a relational counterpart to Oracle and DB2", but, on the other, it is increasingly being used "not because it's the easiest database to learn and use. It's not ... [or] because it's cool. It's not ... but because it gets stuff done."

I don't know how relational and easy to learn and use it is, but the real issue is whether there are any non-relational products that are easier to learn and use and get the same things done and, if not, why not.

It does not seem to occur to anybody that this might have something to do with whatever relational fidelity its SQL implementation has. And I wonder if Stonebraker, who has lately been pronouncing relational technology obsolete and not up to current needs, and has developed several non-relational products not much heard of, is aware of the irony of his old product's success.

5.
On more than one occasion I have criticized the academic substitution of industry fads for scientific research. Instead of leading the industry with science, academics rush to jump on every industry buzzword, a problem which Dijkstra deplored much more intelligently than I can.

Want more evidence? The previous item is one example. Here's another:

Scholarly articles for formal representation of NoSQL

And yet another, better one (detect any irony in the Bio?):

Harnessing Flexible Data in the Cloud

This has a historical precedent: the hierarchic and CODASYL (network) DBMSs were first inferred from existing practices, and attempts were then made post hoc to give them a theoretical basis. This effort was subsequently abandoned when it proved too difficult and the result overwhelmingly complex and unusable. Few of today's IT professionals, academics and vendors are aware of this, which is why they are doomed to replicate the past.

6.
In my last update I posted in error on the FP ONLINE page a link to Martijn Evers' blog post instead of the LinkedIn thread that contained my comment on it.

So here is the link to Martijn's post Metadata as a perspective on data and my comment:
I am uncomfortable with the proliferation of concepts and terms at the informal conceptual level that confuse levels of representation, are vague and inconsistent and complexify unnecessarily.

Reality is complex enough without us piling up on it methodological, conceptual and tool complexity--everything should be as simple and parsimonious as possible (but not simpler!). The relational model achieved exactly that at the logical level. The only way to take advantage of it is to reciprocate at the conceptual level. Unnecessary conceptual complexification spills into the logical level and defeats the purpose.