Saturday, June 29, 2013

Site Update


1.
On the SCHEDULE:

A private database design seminar, October 9-10, Milan, Italy (sponsored by Microsoft and SolidQ)

A public presentation to the SQL Server User Group Italy (UGISS), October 8, Milan, Italy.

Details forthcoming. Contact Davide Mauri @SolidQ.


2.
The 'Quote of the Week' is an online question that is too long to post to the QUOTES page, so I posted the link to the exchange it initiated.


3.
A 'To Laugh or Cry' item was posted on the LAUGH/CRY page.

Remember my claim that we are regressing to this?


4.
A link to an exchange I participated was posted on the FP ONLINE page.


5.
Life and Work of Ted Codd (YouTube)

Is everything accurate?


6.
And now for something completely different.

How the Hum of a Coffee Shop Can Boost Creativity

The logical conclusion and real risk of digitizing everything in sight.


Tuesday, June 25, 2013

Data Model: Neither Business, Nor Logical, Nor Physical Model


Note: For a more in-depth discussion see Business Modeling for Database Design.

Chris Date once wrote a paper titled Models, Models Everywhere, Nor Any Time to Think, deploring the confused and distorted way in which fundamental concepts and terminology in general and relational ones in particular, are used in the industry. But no matter how many times a misconception is debunked, the abuse continues and will do so given educational failure and disregard for precision. Data model is a case in point (see Unmuddling Modeling, Parts 1,2) and What Is a Data Model?)

Thursday, June 20, 2013

Site Update


1.
The following were added to the SCHEDULE:
  • Private database design seminar, October 9-10, Milan, Italy (sponsored by Microsoft and organized by SolidQ)
  • Public presentation to the SQL Server User Group Italy (UGISS), October 8, Milan, Italy, organized by SolidQ.
2
The 'Quote of the Week' was posted on the QUOTES page.

3.
My latest All Analytics column was posted on the FP ONLINE page.

Two of the previously posted exchanges have new comments:
Different Types of DBMS
Comments on my Foreign Keys, Part 2 The Costs of Application-Enforced Integrity

4.
A 'To Laugh or Cry' item was posted on the LAUGH/CRY page.

5.
A link to an exchange I participated was posted on the FP ONLINE page.

6.
And now for something completely different
European data protection watchdogs are closing in on Google, with Spain charging the software giant with six legal infringements punishable by up to €1.5m (£1.3m) in fines, while France has given it three months to rewrite its privacy policy.
...
On the same day, France's Commission Nationale de l'Informatique et des Libert├ęs (CNIL) gave Google formal notice that it risks a fine of up to €150,000 and a second of €300,000 if it fails to rewrite its privacy policy within three months.
--Google and privacy: European data regulators round on search giant
I'm sure this will put stop to abuses cold.



Sunday, June 16, 2013

Foreign Keys Part 3: Mandatory or Optional?


In the last two posts I commented on an exchange about foreign keys. In this third and last installments I will address one last issue underlying that exchange: Are FK's mandatory or optional?
... [we] wish to make a point. There is something which is bad design/good design/mandatory/optional. Please stop insisting that Primary and Foreign keys are mandatory. They are good design habits but by no means mandatory.
We, relational proponents, believe in and advocate precision and clarity, which are particularly important in database management. What do the two terms mean in the context of FK's?

FK's are integrity constraints, specifically referential constraints. One would like to hope that when applied to constraints, the term 'optional' does not mean that integrity of data is a matter of mere preference. Rather, what is usually meant--and can be inferred from the exchange--is the locus of enforcement: not by the DBMS in the database (mandatory), but by users outside it (optional).

In the last installment in this series and many other writings I listed multiple reasons for DBMS-enforced integrity, but other than much arm-waving I have yet to see a solid justification for opting out of it. One that is mentioned most often is performance, but I mentioned in one of my recent site updates Stonebraker's lecture that validates my insistence that (physical) implementation and not the (logical) data model is the correct and effective target for maximizing performance.What is more, as long as constraints are enforced, there is no reason to expect better performance external to the DBMS--quite the contrary (why?).

Integrity constraints are an essential component of the data model (by which we mean neither a conceptual, nor a logical model of a specific enterprise). Constraints are the formalization of the informal business rules of any conceptual model. As such, they are not "good" design, they are design (see Business Modeling for Database Design). That's why one of Codd's 12 rules (and relational advantages) is integrity independence.

I cannot help but suspect that underlying the argument is, as usual, poor foundation knowledge. In my above mentioned paper I demonstrate what constraint formulation would entail in the absence of DBMS shorthands, such as FK's and why vendors should offer more than what SQL product do (see also Database Design, Relational Fidelity and SQL,Parts 1-3). It is very unlikely that anybody with a basic grasp of logic and the relational model would not be disabused of user enforced integrity as a good idea and would not resent its poor support by products.

Note: The difference between user-enforced and application-enforced integrity is that the former does not have to be the latter: constraints can be declared and enforced in a layer between the DBMS and applications, but it would still be external to the DBMS and make sense only if the DBMS fails to support shorthands, or the expression of certain constraints, as SQL products do).
However, life is much more complex than a Normalized DB structure. This includes tables serving as event logs; tables, serving as User maintained materialized query tables, tables, serving as supporting structures, reflecting state of complex transactional databases; persistent tables serving as Result Set or Session keepers.
One of my most frequent arguments is that it is precisely because reality is complex that (a) we should not pile up on it tool and practice complexities of their own (b) users cannot be expected to do a better job in formulating and enforcing constraints than the DBMS, particularly in the absence of a  background in logic. That is why I frown upon the trite argument "different DBMS's for different jobs". Codd's genius was an example of abiding by Einstein's famous principle: "everything should be as simple as possible, but not simpler!".

By the way, are we to conclude that tables of the kinds enumerated are not important enough to merit integrity?

It is ironic that who doubt the importance and advantages of full normalization, also question the necessity of DBMS-enforced integrity, because it is the former that minimizes the integrity burden on users (see The Costly Illusion: Normalization, Integrity and Performance).
And I personally believe that if they were truly mandatory, Sybase, Oracle, SQL Server, Ingres, DB2, etc. would require them. Oh, sorry, forgot the SQL standard itself. This is not the relational model we're talking about. These are commercially available RDBMSs which, not surprisingly, DO tend to listen to their customers. If they didn't, they wouldn't be in business!! Since Sybrand is unlikely to get FKs required by the SQL standard or the major RDBMS vendors, it seems that mandatory means that his answer to the question "Do I have to use foreign keys?" is "You would if you worked in my shop!". I'm inclined to agree with that.
Excellent example of the damage that poor foundation knowledge can do, complete with internally inconsistent arguments.

The implied market efficiency requires users and vendors with perfect information. It does not obtain by chance when that assumption does not hold. In the absence of independent criteria for assessing the functionality and quality of tools, vendors can listen to their customers, but if neither are sufficiently educated and informed, what they ask for and produced is not necessarily the most cost-effective.  If users assume that products are by definition the best, what exactly would take the vendors out of business if they fail?

There is hardly a better proof of point than the initial failure by the SQL standard to fully support the relational model in general and integrity in particular. Instead of insisting that integrity is not mandatory, users should deplore it.


Friday, June 14, 2013

Site Update


1.
The 'Quote of the Week' was posted on the QUOTES page.

2.
A 'To Laugh or Cry' item was posted on the LAUGH/CRY page.

3.
A link to an exchange I participated was posted on the FP ONLINE page.


4.
And now for something completely different.

The United States Is Still in an Extraordinarily Good Position 

Certainly true from his perspective.

Banks Reap Profits on Overdraft Fees as Customers Lose Money 

See what I mean?

The World Isn't Fair

He should know. And it works because he

Augmented Reality vs. Decimated Reality

helps ensuring we stay like this:

The laughable innocence of Facebook and Google (and us)

Incidentally, believing in corporate innocence is what public innocence is all about.





Sunday, June 9, 2013

Foreign Keys, Part 2: The Costs of Application-Enforced Integrity


In my previous Forward to the Past debunking of the exchange on foreign keys I promised that if I find any leftovers warranting comments, I will offer them. I did, there were, and here is a second installment.

I would like, first, to point out that even though the exchange is about FK's in particular--which, by the way, are referential integrity constraints--most of the discussion applies to integrity constraints in general, of which there are several kinds (see Business Modeling for Database Design).

The discussion hovered around whether FK's are mandatory or optional (which I may tackle in a future post). Here I will comment on some of replies triggered by the FK question, most of which demonstrated the value of foundation knowledge over those that were not grounded in it. They were readily distinguishable by the informed eye and if you cannot tell which is which, education is in order.

Thursday, June 6, 2013

Site Update


1.
The 'Quote of the Week' was posted on the QUOTES page.

2.
A 'To Laugh or Cry' item was posted on the LAUGH/CRY page.

3.
A link to an exchange I participated was posted on the FP ONLINE page.
Great idea, skeptical about success-it is against the societal grain. Societies are interested in conformism, not critical thinking.

4.
If this and many other such improvements are possible, what is the justification for still focusing on "denormalization for performance"?

In the same vein, why is Michael Stonebraker referring to "legacy relational DBMSs", while demonstrating that the performance limitations and solutions of current SQL systems have actually absolutely nothing to do with their being relational (which, in fact, they are not)? Indeed, everything is about implementation--how could it be otherwise? And he is one of the people who does know the fundamentals! Ah, yes, he is a vendor now.
  
To his credit, he rejects NoSQL for the right reasons and his solutions to the today's performance needs are sensible. But why does he want to preserve SQL, rather than come up with a TRDBMS? All those solutions, don't they validate our claim, for decades, that such a system can be excellent performer? Why, as an implementor, he did not design one?

Note: I happen to know what the solution is for the performance factors for which he does not have any, but unfortunately cannot say anything about it (it's deja vu TransRelational Model(TM) all over again).

5.
And now for something completely different.

Twitter's identity crisis

Zynga to lay off 18 percent of staff, shut offices, slash infrastructure

Facebook loses advertisers again

See a pattern? No? Does the following help?

Yahoo Shuts Down Mail Classic, Forces Switch To New Version That Scans Your Emails To Target Ads

How about this:

America, It's Time to Start Making Things Again

If you still don't, I've hinted about it in my last All Analytics post (see FP ONLINE PAGE). It was predictable.




Sunday, June 2, 2013

Forward to the Past: Foreign Keys -- Are They Really Necessary?



R: Do I Have to Use Foreign Keys? If I am already manipulating data properly, are foreign keys required? Do they have another purpose that I’m just not aware of? I appreciate the guidance!
If you are not sure what I mean when I so often deplore product self-training not preceded by education, this is a good example. Those who ask this question are not yet ready for database practice (and if they had a formal education, they should demand a tuition refund). Knowing a product's features is not enough for understanding why they are there, judge whether they should be there (and what others are missing), what are the pros/cons and how to exploit them intelligently (and work around what is missing).