Sunday, August 25, 2013

Site Update

1. Schedule update
September 23rd, 10:00am, San Francisco, CA
The CWA, Missing Data and the Last NULL in the Coffin
Presentation, Oaktable Conference, Oracle OpenWorld
October 8, Milan, Italy
Denormalization for Performance: A Costly Illusion
Public presentation, UGISS SQLSaturday
October 9-10, 2013, Milan, Italy
Business Modeling for Database Design
Private seminar sponsored by Microsoft and organized by SolidQ
Contact: Davide Mauri, SolidQ

2. Quote of the Week
How many software programs are mathematically provable. And yet everybody still writes software and for the most part it works. Relational theory and SQL was very important for establishing a standard across vendors to a point. And yet switching relational database vendors is still very expensive proposition because the standards don't address the features that users need and use everyday that are not part of the standard. At the heart of the system the relational model can still be enforced. But a product lives and dies not on whether it is mathematically provable but it's features set, efficiency and cost to develop in.

3. To Laugh or Cry?
Please help with my data model design
If this were student homework, it would be an excellent example of how database management should not be learned and a validation of the substitution of the "cookbook approach" for education. Ironically, it's in the forum's section "Relational theory". Had theory been taught, such questions would not have been asked.

4. Two online exchanges I participated in
Predictable--it was just a matter of time. My latest post at All Analytics is quite apropos: Real Data Science: General Theories of Data.
In this context, consider In Silicon Valley, age can be a curse.

5. And now for something completely different

Not entirely unrelated:
Facebook boosts connections, not happiness study
The Curse of Self-Service (h/t Davide Mauri)

Sunday, August 11, 2013

Site Update

A while ago my friend Stephen Henley published his opinion on Missing Data, which questioned the thoughts--not well formed and definitive at the time--of C. J. Date, Hugh Darwen and myself on the subject. Since then Date has proposed a default values scheme, which he has subsequently renounced; Darwen has published How To Handle Missing Information Without Using NULL; and I have proposed a relational solution in the recently revised paper #3, The Last NULL in the Coffin.

In this context, I dedicate this update (except the last item) to NULL. Whatever differences may exist among the above-mentioned relational proponents, we do agree that NULL is certainly not a solution to the problems of missing data.

Time permitting, I may post some belated comments on Henley's piece.

If SQL is based on relational algebra which is based on set theory where the concept of null set (empty set) is an axiom of the theory. In this theory empty set is not the same thing as nothing. A point that confuses many people.

Relational algebra is based on 3VL predicates, that is, the answer to any predicate can have three states true, false or unknown. Unknown is caused by the use of a operator on an the absence of a value (null). Within relational algebra null is not to be treated as a value but merely a marker of unknown (absence of a value).

None of this is rocket science and I suggest doesn't result in bad implications. I suggest the so called "bad implications" are only introduced as people use null as a patch for problems for example the division by zero. indeterminate state, open ended ranges, data states to name a few. That is, the issue is not the concept of null but its abuse as a patch for other issues.
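For readers who want to see the three-valued behavior the passage above refers to, here is a minimal sketch using Python's built-in sqlite3 module. The table `emp`, its columns, and the helper `names` are hypothetical, chosen only for illustration; other SQL products may differ in details.

```python
import sqlite3

# Hypothetical table: bob's bonus is missing and stored as NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, bonus INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [("ann", 100), ("bob", None), ("cy", 0)],
)

def names(where):
    """Return the names of the rows for which the WHERE condition is true."""
    rows = conn.execute(f"SELECT name FROM emp WHERE {where} ORDER BY name")
    return [r[0] for r in rows]

# A comparison with NULL evaluates to unknown, and WHERE keeps only true rows,
# so bob appears in neither a condition nor its negation.
high = names("bonus > 50")
not_high = names("NOT (bonus > 50)")

# NULL = NULL is itself unknown; sqlite3 returns it to Python as None.
null_eq = conn.execute("SELECT NULL = NULL").fetchone()[0]

# Aggregates skip NULLs: COUNT(bonus) counts 2 rows, COUNT(*) counts 3.
total = conn.execute(
    "SELECT SUM(bonus), COUNT(bonus), COUNT(*) FROM emp"
).fetchone()

print(high, not_high, null_eq, total)
```

In this sketch a row with a missing value satisfies neither a condition nor its negation, and the aggregate counts differ depending on whether the column or the row is counted.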


Why shouldn't we allow NULLs?

3. An online exchange I participated in

NULL Handling in Databases

4. And now for something completely different.

An astonishing act of statistical chutzpah
Why Great Teachers Are Fleeing the Profession
The ABCs of MOOCs

What does this say about the educational system?

Monday, August 5, 2013

Test Your Foundation Knowledge

The gap between theory and practice, said one sage, is greater in theory than it is in practice. It is not easy to find a better validation of this assertion than in database management, particularly because the field uses the term 'theory' in two different senses, the confusion of which inhibits the ability of data professionals to understand and appreciate the value of theory--specifically, relational theory--for database practice. The first sense is "sound theoretical foundation"; the second is "just a theory" and carries a somewhat pejorative connotation ("not practical").

This confusion is rooted in systemic and cultural factors that make education on data fundamentals--without losing either the rigor and precision of the theory that give it its usefulness, or the audience--one of the most difficult tasks.