Sunday, September 27, 2015

Weekly Update

UPDATE: I have posted, via David McGoveran, an update to last week's post on Codd's 12 rules.

Reactions to my presentation "The Real Data Science: Tables -- So What?" to the Silicon Valley SQL Server User Group.

With regard to Language Redundancy and DBMS Performance: A SQL Story:

1. Quote of the Week

... the challenges inherent in the SQL RDBMS [sic] approach ... the constrained schema (or schema-first) approach of SQL RDBMS engines imposes semantic infidelity rather than fidelity on all applications and services that depend on this RDBMS type, solely ... SQL RDBMS engines (as per what I've outlined above) do impose a "one size fits all" constraint on DBMS driven apps and services that manifests as the "data variety issue" outlined by the "Big Data" meme.

Tuesday, September 22, 2015

The Real Data Science: Tables -- So What?

My September post @AllAnalytics. 

We have seen that if each database table is designed to represent a set of (facts about) a single class of attribute-sharing entities and to preserve the mathematical properties of relations, databases are easier to understand, and query results are provably correct and easier to interpret. Let's see how and why with the help of an example.

Read it all. (Please comment there, not here)


Sunday, September 20, 2015

Interpreting Codd: The 12 Rules

Note: This is a re-write for consistency of both this post and the interpretation of the rules with the McGoveran formalization and interpretation [1] of Codd's true RDM.

I recently came across an "explanation" of Codd's 12 rules for RDBMSs, in a book appendix posted online, that is mostly either a regurgitation of the rules or incorrect -- typical for an industry lacking foundation knowledge [2].

Shortly after Codd published the RDM, vendors of the hierarchic and network DBMSs that preceded both it and SQL were adding the suffix /R to the names of their products and declaring them relational. Codd introduced these "quick rules of thumb" -- neither rigorous, nor systematic, nor complete, nor independent -- to identify some important specific criteria that a DBMS must meet if it is to be truly relational, and whose absence from a product could disqualify its relational claims.

Although they are no longer used, inquiries about them persist and, with the current proliferation of non-relational products (e.g., NoSQL, graph DBMSs), there is value in understanding them. The closest the industry has come to the RDM is SQL DBMSs, which, despite poor relational fidelity, proved far superior to the complex and inflexible DBMSs that preceded them. But the rules still expose relational infidelities of SQL DBMSs that have gone unaddressed for four decades, while new RDM violations have been introduced.

We offer here our clarifications on the rules. For each rule, we:

  • Explain its intended objective;
  • Offer clarifications, some of which reflect our current understanding of the RDM -- distinct from conventional wisdom -- based on its dual theoretical foundation and a careful analysis of Codd's work;

Sunday, September 13, 2015

Weekly Update

The Real Data Science: Tables--So What?

My Presentation to Silicon Valley SQL Server User Group

6:30 PM, Tuesday, September 15, 2015

1065 La Avenida, Building 1
Mountain View, CA

Free and open to the public (+ pizza)
For details and RSVP see Meetup

1. Quote of the Week
You see, in Cassandra 1.x, the data model is centered around what Cassandra calls “column families”. A column family contains rows, which are identified by a row key. The row key is what you need to fetch the data from the row. The row can then have one or more columns, each of which has a name, value, and timestamp. (A value is also called a “cell”). Cassandra’s data model flexibility comes from the following facts:
* column names are defined per-row
* rows can be “wide” — that is, have hundreds, thousands, or even millions of columns
* columns can be sorted, and ranges of ordered columns can be selected efficiently using “slices”.
Compare this to the RDM.
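To make the comparison concrete, here is a minimal Python sketch of the structure the quote describes, set next to a relation. It is an illustration only, not Cassandra's API: the names Row, ColumnFamily, put, slice and column_names, and the sample data, are all hypothetical.

```python
from bisect import bisect_left, bisect_right


class Row:
    """One row in a column family: its own columns, kept sorted by name."""

    def __init__(self):
        self._names = []   # column names in sorted order
        self._cells = {}   # column name -> (value, timestamp)

    def put(self, name, value, timestamp):
        # A new column name is inserted at its sorted position.
        if name not in self._cells:
            self._names.insert(bisect_left(self._names, name), name)
        self._cells[name] = (value, timestamp)

    def column_names(self):
        return list(self._names)

    def slice(self, start, end):
        """Return (name, value) pairs with start <= name <= end, in name order."""
        lo = bisect_left(self._names, start)
        hi = bisect_right(self._names, end)
        return [(n, self._cells[n][0]) for n in self._names[lo:hi]]


class ColumnFamily:
    """Rows addressed by row key; column names are not shared across rows."""

    def __init__(self):
        self._rows = {}

    def row(self, key):
        if key not in self._rows:
            self._rows[key] = Row()
        return self._rows[key]


users = ColumnFamily()
users.row("alice").put("email", "alice@example.com", timestamp=1)
users.row("alice").put("city", "Oslo", timestamp=1)
users.row("bob").put("phone", "555-0100", timestamp=2)

# A relation, by contrast: one heading shared by every tuple, and a body
# that is a set, so neither tuples nor attributes carry meaningful order.
heading = ("name", "city")
relation = {("alice", "Oslo"), ("bob", "Rome")}
```

In the sketch the "alice" and "bob" rows have no column names in common, and a slice depends on the sort order of the columns. The relation has a single heading that every tuple satisfies, and nothing in it depends on ordering.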

2. To Laugh or Cry?

3. Online Debunkings

4. Elsewhere

5. And now for something completely different
