Wednesday, December 30, 2015

Interpreting Codd: LOGIC FOR SERIOUS DATABASE FOLKS

David McGoveran has been working on a book on logic for database professionals. He will post articles online for review that will become chapters in the book. He has just posted the first four, along with a revised "Series Introduction" (link below).

I thought it worth sharing a short note he sent me on what prompted the book and what some of its objectives are.

Wednesday, December 23, 2015

Documents and Databases



My December blog post @All Analytics

Don't let the label fool you. It's the relational data model (RDM), not SQL, that NoSQL proponents really are rejecting. The main argument, advanced in a recent LinkedIn exchange, is that lots of information "cannot be represented in rows and columns". The implication is that the RDM is not general enough -- there are certain types of information that it cannot represent. The response from my RDM colleague, David McGoveran, is important enough to restate here.

(Comment there, not here, please. Thanks.)


                                                         Happy new year!






Sunday, December 20, 2015

Weekly Update

1. Quote of the Week
According to Wikipedia, Amazon's Redshift is a modified version of Postgres.
Maybe its speed redshifted data integrity into a bloody mess.

It has no primary keys, foreign keys or unique constraints. It just has optimizer hints in the DDL that *maybe* the data behaves that way. If they want to put hints in the DDL, OK, but don't call those hint PRIMARY KEY, FOREIGN KEY or UNIQUE.
--Unsupported PostgreSQL Features - Amazon Redshift

I don't think any version of Postgres lets you say Create Table Foo (Bar Int PRIMARY KEY) and then let you do
INSERT INTO Foo (BAR) Values (42)
INSERT INTO Foo (BAR) Values (42)
INSERT INTO Foo (BAR) Values (42)
That might be fine for a one time only static data warehouse, but an ongoing data update system is going to break the integrity rules, it is just a matter of time. That would make for some surprises when someone decides to migrate their data and lots of application code from a relational DBMS to Redshift.
--Jeff Winchell
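
The behavior the quote expects from a DBMS that enforces keys can be checked with a minimal sketch. It uses Python's built-in sqlite3 module and an in-memory SQLite database as a stand-in (an assumption made purely for illustration; it is neither Postgres nor Redshift), with the table and column names taken from the quote. The helper name `insert_three_times` is hypothetical.

```python
import sqlite3


def insert_three_times():
    """Insert the same key value three times.

    Returns (accepted, rejected, row_count).
    """
    conn = sqlite3.connect(":memory:")  # stand-in DBMS, not Postgres or Redshift
    conn.execute("CREATE TABLE Foo (Bar INT PRIMARY KEY)")
    accepted = rejected = 0
    for _ in range(3):
        try:
            conn.execute("INSERT INTO Foo (Bar) VALUES (42)")
            accepted += 1
        except sqlite3.IntegrityError:
            # An enforced key constraint rejects the duplicate value.
            rejected += 1
    row_count = conn.execute("SELECT COUNT(*) FROM Foo").fetchone()[0]
    conn.close()
    return accepted, rejected, row_count


if __name__ == "__main__":
    print(insert_three_times())
```

With the constraint enforced, only the first insert succeeds. A system that treats PRIMARY KEY as a hint would, per the quote, accept all three.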

Sunday, December 13, 2015

Moving in Circles: RDBMS-SQL Conflation & Logical-Physical Confusion

In my last post I demonstrated how disregard for the scientific foundation and history of a field, here database management, leads to Moving in Circles. The piece I debunked was by the CTO of VoltDB, one in the "VVV" series of products by Michael Stonebraker (MS). I've recently come across The Traditional RDBMS Wisdom is All Wrong, a presentation by the man himself, that reinforces my point.

Sunday, December 6, 2015

Weekly Update

 To all my Jewish readers:


I was making minor revisions to Surrogate Key Illusions when I ran across this: How To Find Duplicate Addresses For A PhysicianRowID

1. Quote of the Week

The entire Information Technology industry is still stuck in 1971 with Dr. Codd's 3rd Normal form ... when 3 more exist and now a 7th or (N) Normal Form ... Dr. Codd, created, devised, extrapolated 6 forms of Data NORMALIZATION, and after 45 years (1971), every single Database system or information management system to date has not exceeded the 3rd normal Form... except AtomicDB, (N) Normal Form improving on Dr. Codd's work by 4 levels(dimensions)at the very least ... There was and is a method to his theory, that with each level of Normalization brings a geometric increase in "Efficiency" and scalability, however, no one has even attempted the restrictions of the 4th normal form (no duplicates) let alone the 5th or 6th, and now with Dr. Everett’s (N)th Normal form, we can do anything the human mind can devise on a computer and in nearly real time. --Jean Michel LeTennier, CTO, Atomic Database Corp.

Friday, November 27, 2015

Science, Data Science and Database Science



November Database Fundamentals For Analysts blog post @All Analytics.


...
Einstein famously advised that everything should be as simple as possible, but not simpler. In a recent online exchange the relational data model (RDM) -- sound and comprehensible science -- was dismissed as “4th grade mathematics”, while a-theoretical complexity is promoted as science. IT industry practices fail Einstein both ways.

An increasing gap between computational complexity and intellectual simplicity is not conducive to true science and, as Hawking has warned, it is a self-feeding, dangerous proposition. 

Read it all.




 

Sunday, November 22, 2015

Weekly Update

Housekeeping: Added If a table with a SK has a NK does it violate 3NF? to LINKS page.

1. Quote of the Week

One other intriguing benefit of NoSQL that I started to unwitting benefit from recently is the ability to push data scheme concerns entirely to the application layer. In this scenario, the applications use a NoSQL database predominantly as a storage service, lightly structured by a few indexed key fields. The object structured data document within the payload becomes transparent to the database. The applications then assume the role of enforcing and understanding the data scheme.

This approach allows the application architect to encode the data structures and meaning directly in the code that creates and consumes the data. So data structure changes required for functional updates can be implemented, tested, and deployed in the application code base with no updates to the database layer at all. (Of course, a conversion of existing NoSQL data documents may be required in situations.)

In this NoSQL approach, the removed translation of object data scheme to a relational structure and, and then back to an object structure again is a very welcome relief as well. --use-the-index-luke.com

Sunday, November 15, 2015

Moving in Circles: SQL for NoSQL

We've been there, done that.  

In Coming Full Circle: Why SQL now powers the NoSQL Craze, Ryan Betts, CTO at VoltDB, argues that NoSQL products should adopt SQL for queries. I don't know about you, but to me it looks like a contradiction. Let me make it clear that my intention here is neither to defend SQL nor to criticize it--I sure have done enough of that over the years--but strictly to debunk the notion that its use with NoSQL systems is a good idea.

Sunday, November 8, 2015

Weekly Update

Housekeeping: Added
to LINKS page.
Added RAQUEL to SOFTWARE section on Home page.

1. Quote of the Week

After attending NoSQL conference I am really hoping that companies think through this 'big data' implementation! No one there was interested in Data model ... and said so ... forget the data model ... even 'standards' were looked at 'its 'too early' for this new technology ... and no one could tell us anything about 'meta-data'...!!!!! --LinkedIn.com

Sunday, October 25, 2015

Weekly Update

1. Quote of the Week
The ER notation consists a set of constructs, such as, a rectangle to represent an entity types, an ellipse to represent an attribute a diamond to represent a relationship type, and so on. The RDS is a set of linear relation schemas. A relation schema has a name and is a sequence of attributes of text separated by comas and arranged horizontal. I have also developed an ER-to relational transformation algorithm to transform an ER schema to its corresponding RDS. I wish to implement this project as a CASE tool. --LinkedIn.com
The problem is not the student, but the quality of education he gets at his university.

Sunday, October 11, 2015

Weekly Update (UPDATED)

Housekeeping: Added the following to LINKS page:


1. Quote of the Week
"The real definition of Big Data?? Simple: Whatever does not fit in Excel!"

Me: What is--precisely, pls!--the threshold from small to big data? And how do the structural, manipulative and integrity aspects change over the threshold?

HM: Big data := the smallest set of data for which the sensitivity of your transfer function is minimal, and such that the cardinality of this set is too large to implement said transfer function on a single physical machine. Transfer function := though not strictly a function, as in the Lambda calculus, it is more of an operator which ingests data of any size and produces a monetizeable product or service. Sensitivity := the degree to which a perturbation on the input into a transfer function affects the result produced by said function.

Me: Ugh! I'm sure that's exactly what all the overnight data scientists, including those who invented Big Data, had in mind. --LinkedIn.com

Sunday, October 4, 2015

Database Education: Oughts and OughtNot's

From an online exchange in response to A Tiny Intro to Database Systems:
C: As a non-CS grad coming fresh to databases, I found both the entity-relationship, and the object-oriented models confusing. Then I read Date [1] and Codd's [2] books and papers on the relational model, the one from the 1970s that is basically set and type theory applied to data, and found that to be a lot clearer and a more powerful abstraction to deal with your data model. As a non-"full time developer" it amazes me the number of "experienced" developers who are not aware of the relational model and who do not know what a foreign key is, or why referential integrity might be important.

For example, your Relational Model introduction has a discussion of various data types. But arguably, whether your integer is implemented as BIGINT or TINYINT is an implementation decision which should be separate from the model discussion (dixit Date). In other words, that attribute has a type of integer and how that integer is stored is a separate issue, and your RDBMS ought to abstract it away (as, I think, Postgres is pretty good with, and MySQL quite annoying). The beauty of the latest RDBMS developments, particularly in PostgreSQL world, is that the implementation has gotten so good that you don't need to really worry about it like you used to just a decade ago, at least in 95% of use cases.

I think one can teach SQL (and the relational model) to a non-developer in about 2 hours, because it is so declarative and intuitive. One day I'll go write that tutorial, as many clients need it sorely.

Sunday, September 27, 2015

Weekly Update

UPDATE: I have posted, via David McGoveran, an update to last week's post on Codd's 12 rules.

Reactions to my presentation "The Real Data Science: Tables--So What?" to the Silicon Valley SQL Server User Group.

With regard to Language Redundancy and DBMS Performance: A SQL Story:

1. Quote of the Week

... the challenges inherent in the SQL RDBMS [sic] approach ... the constrained schema (or schema-first) approach of SQL RDBMS engines imposes semantic infidelity rather than fidelity on all applications and services that depend on this RDBMS type, solely ... SQL RDBMS engines (as per what I've outlined above) do impose a "one size fits all" constraint on DBMS driven apps and services that manifests as the "data variety issue" outlined by the "Big Data" meme.
--LinkedIn.com

Sunday, September 20, 2015

Interpreting Codd: 2. The 12 Rules (UPDATED)

I have recently come across an "explanation" of Codd's 12 RDBMS rules in a book appendix posted online that is mostly a set of rule regurgitations. Although the rules are no longer used to assess the relational fidelity of DBMS's, inquiries about them persist and they are still misunderstood.

With non-relational products such as NoSQL proliferating, there is value in understanding the rules' origins: the rules can still help expose persistent flaws of SQL implementations and the superiority of RDBMS's over non-relational products. So here are the book "explanations", followed by mine.

Sunday, September 13, 2015

Weekly Update

The Real Data Science: Tables--So What?

My Presentation to Silicon Valley SQL Server User Group
 

6:30 PM, Tuesday, September 15, 2015

Microsoft
1065 La Avenida, Building 1
Mountain View, CA


Free and open to the public (+ pizza)
For details and RSVP see Meetup.


1. Quote of the Week
You see, in Cassandra 1.x, the data model is centered around what Cassandra calls “column families”. A column family contains rows, which are identified by a row key. The row key is what you need to fetch the data from the row. The row can then have one or more columns, each of which has a name, value, and timestamp. (A value is also called a “cell”). Cassandra’s data model flexibility comes from the following facts:
* column names are defined per-row
* rows can be “wide” — that is, have hundreds, thousands, or even millions of columns
* columns can be sorted, and ranges of ordered columns can be selected efficiently using “slices”.
--http://blog.parsely.com/post/1928/cass/
Compare this to the RDM.
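
To make the comparison concrete, the structure the quote describes can be modeled in a few lines. This is a toy sketch of the description only, not Cassandra's API or storage engine; the class `ColumnFamily` and its methods `put` and `get_slice` are hypothetical names introduced for illustration.

```python
import bisect
import time


class ColumnFamily:
    """Toy model of the quoted description: row key -> named, timestamped columns."""

    def __init__(self):
        # row_key -> (column names kept sorted, {name: (value, timestamp)})
        self._rows = {}

    def put(self, row_key, name, value, timestamp=None):
        names, cells = self._rows.setdefault(row_key, ([], {}))
        if name not in cells:
            bisect.insort(names, name)  # keep column names ordered within the row
        cells[name] = (value, time.time() if timestamp is None else timestamp)

    def get_slice(self, row_key, start, end):
        """Return (name, value) pairs with start <= name <= end, in name order."""
        names, cells = self._rows.get(row_key, ([], {}))
        lo = bisect.bisect_left(names, start)
        hi = bisect.bisect_right(names, end)
        return [(n, cells[n][0]) for n in names[lo:hi]]


if __name__ == "__main__":
    cf = ColumnFamily()
    cf.put("user1", "email", "a@example.com")
    cf.put("user1", "age", 30)
    cf.put("user2", "2015-09-01", "login")  # a different set of columns per row
    print(cf.get_slice("user1", "a", "f"))
```

Nothing in this structure declares which columns a row must have, whereas a relation's heading fixes the attributes of every tuple and constraints can be declared against them.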

2. To Laugh or Cry?


3. Online Debunkings


4. Elsewhere


5. And now for something completely different


Sunday, August 30, 2015

Weekly Update

The Real Data Science: Tables--So What?

My Presentation to Silicon Valley SQL Server User Group
 

6:30 PM, Tuesday, September 15, 2015

Microsoft
1065 La Avenida, Building 1
Mountain View, CA


Free and open to the public (+ pizza)
For details and RSVP see Meetup


1. Quote of the Week

[Do] formalized languages need the definition of data types? Up to now I have not read strong arguments against my statement that for interpretation and operation on data the use of character strings is sufficient when
  • All data are expressed as character strings that are explicitly based in language communities, whereas the character strings denote concepts that are represented by UID's;
  • The denoted concepts are defined by their supertype concepts (among others);
  • Collections of allowed qualitative concepts (that are denoted by string values or value ranges) are defined to enable the specification of constraints; --LinkedIn.com
2. To Laugh or Cry?


3. Online Debunkings

4. Elsewhere
Which technologies emerge from the abyss
Why Big Data gets it Wrong

5. Housekeeping: Added to LINKS page:

  • Query Optimization
  • Relational Algebra
  • LEAP RDBMS
  • Relational
  • Relational Algebra Translator
  • System for Translating Relational Algebra Scripts into Microsoft SQL Server SQL Scripts
  • System for Translating Relational Algebra Scripts into Oracle SQL Scripts
6. And now for something completely different



Saturday, August 22, 2015

Silicon Valley SQL Server User Group Presentation


The Real Data Science: Tables -- So What? 


During the hyping of fads such as "Data Science", all you hear about is the "huge opportunities for enterprises to gain hitherto unimagined insights" and very little about the potential to tell enterprises really big lies, which can arise from 100% correct data in poorly designed databases. That's because what passes for "Data Science" is not science, let alone science of data.

Most data professionals know that relational databases consist of tables, but so what? Provably correct query results are guaranteed by the real data science--the RDM--if and only if tables are well-designed and properly constrained R-tables and the DBMS truly and fully supports it. Unfortunately, more often than not tables are neither, and SQL DBMS's don't. As a result, databases are harder to understand, queries don't always make sense, and results are hard to interpret or outright wrong.

You will learn: 
  • The Real Data Science
  • Relations and databases
  • 5NF R-tables
  • "Table arithmetic"
  • RDM and SQL

6:30 PM, Tuesday, September 15, 2015

Microsoft
1065 La Avenida, Building 1
Mountain View, CA
(map)


For details see Meetup.



Sunday, August 16, 2015

Weekly Update

1. Quote of the Week
Later, as use of RDBMS became more widespread, the complexity associated with design of a RDBMS was also well documented ... The associative database model is claimed to offer advantages over RDBMS ... “two fundamental data structures” as “'Items' and a set of 'Links' that connect them together ... Items, which have “a unique identifier, a name and a type” and Links, which have “a unique identifier, together with the unique identifiers of three other things, that represent the source, verb and target of a fact that is recorded about the source in the database ... “each of the three things identified by the source, verb and target may each be either a link or an item.”
--Homan, J. V. and Kovacs, P. J., A Comparison of the Relational Database Model and the Associative Database Model, Issues in Information Systems, Vol. X, No. 1, 2009.
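
The two structures in the quote are easy to write down, which helps in judging the claim. The following is a rough sketch of the quoted description only, not of any associative database product; the class names, field names, the helper `describe` and the sample fact are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    uid: int        # unique identifier
    name: str
    item_type: str


@dataclass(frozen=True)
class Link:
    uid: int        # unique identifier
    source: int     # uid of an Item or of another Link
    verb: int       # uid of an Item or of another Link
    target: int     # uid of an Item or of another Link


def describe(uid, things):
    """Spell out the fact identified by uid, following nested links recursively."""
    thing = things[uid]
    if isinstance(thing, Item):
        return thing.name
    parts = (thing.source, thing.verb, thing.target)
    return " ".join(describe(part, things) for part in parts)


if __name__ == "__main__":
    things = {
        1: Item(1, "Flight 123", "flight"),
        2: Item(2, "arrived at", "verb"),
        3: Item(3, "Heathrow", "airport"),
        4: Item(4, "on", "verb"),
        5: Item(5, "12-Aug", "date"),
        6: Link(6, source=1, verb=2, target=3),
        7: Link(7, source=6, verb=4, target=5),  # the source is itself a link
    }
    print(describe(7, things))
```

In this sketch every fact is reached by chasing identifiers from link to link, and nothing declares which combinations of source, verb and target are permissible.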
2. To Laugh or Cry?
A Comparison of the Relational Database Model and the Associative Database Model
3. Online Debunkings

4. Interesting Elsewhere
Why Domain Expertise is More Important than Algorithms
5. And now for something completely different

Thursday, August 13, 2015

Sunday, August 9, 2015

Surrogate Key Illusions

 Revised 7/26/16 (See Understanding Keys of 7/31/16 for a more in-depth discussion). 
"When defining a surrogate primary key for a table, two options are the most common: Integer and UniqueIdentifier (aka Globally Unique Identifiers, or GUID's) ... Historically, Integer has been the logical choice. It’s human-readable, requires minimal storage, and can be set as an identity (auto-incrementing) to prevent the need for additional application logic. UniqueIdentifier comes with significant disadvantages. The most immediately noticeable is that it’s user-unfriendly. You’ll never hear a user or developer ask you about record “A78383A3-4AB1-42CF-B3FC-A4A23AD10398”. With high availability and replication becoming highly prevalent, UniqueIdentifier is being chosen more often, but has caveats that mean it isn’t always the optimal solution."

Sunday, August 2, 2015

Weekly Update

1. Quote of the Week
I am designing a mySQL database. I created tables and added extra columns for future use. Will it affect performance?
--LinkedIn.com
2. To Laugh or Cry?
Why you should never, ever, ever use MongoDB
3. Online Debunkings
Fixing 7 common database design errors
4. From the industry
Amazon's MySQL database challenger Aurora exits preview
5. And now for something completely different

Saturday, July 18, 2015

Weekly Update

HOUSEKEEPING 

  • New Appendix to paper #3: While working on my book, I collected all comments by readers and replies by me (edited) and David McGoveran and added them as Appendix B. It further clarifies some of the aspects of the proposed relational/2VL solution to missing data. Those who ordered the paper in 2014 and 2015 should email me for a copy.
  • Added to LINKS page: 
Why even the most intelligent software architects don't understand the Relational Model

1. Quotes of the Week
In 15-20 years from now: Information will stay only in XML (no more tuples, no more objects). Imperative languages as we know them today (Java, C, C++, C#) will be gone. We will program with some extension of XQuery, or in any case a declarative dataflow/workflow language specially --Daniela Florescu, 2010 Interview
Exactly 20 years ago I wrote this article: "Storing and Querying XML Data using an RDMBS". I curse myself every day for doing so. I should be damned by the fires of hell for ever opening my mouth and letting people believe that one can REASONABLY use SQL to query hierarchical, complex structures like XML or JSON.  NO, PEOPLE. YOU CAN NOT! --Daniela Florescu, 2015, LinkedIn.com
2. To Laugh or Cry?
SQL Will Inevitably Come To NoSQL Databases
3. Online Debunkings
Data Scientists: The talent crunch (that isn't)
4. Interesting
5. And now for something completely different

Thursday, July 9, 2015

The First Half of Database Science for Analysts

 My July blog @All Analytics:

Database Fundamentals: The First Half of Database Science for Analysts

One would expect “data scientists” to be keen on the dual scientific foundation of database management -- the relational data model (RDM) -- but they know little beyond “related tables” and, in fact, complain that more often than not data “do not fit” into them. Much of that is the result of poor education and an almost exclusive focus on software tool training. Even the analyst intent on acquiring foundation knowledge is more likely to be misled than enlightened by published information.


Please comment there, not here!



 

Sunday, July 5, 2015

The SQL and NoSQL Effects: Will They Ever Learn? UPDATED

UPDATE: I refer readers to Apache Cassandra … What Happened Next. Note that this was an optimal use case for NoSQL. Read it with a focus on the simplicity of the data model and, particularly, physical data independence relative to the RDM.

In Oracle and the NoSQL Effect, Robin Schumacher (RS), a former "data god" DBA and MySQL executive now working for a NoSQL vendor, claims that Oracle’s recent fiscal Q4 miss--a fraction of what's to come--is due to its failure to recognize that
"... web apps ushered in a new model for development and distributed systems that ... [r]elational databases are fundamentally ill suited to handle ... Their master-slave architectures, methods for writing and reading data, and data distribution mechanisms simply cannot meet the key requirements of modern web, mobile and IoT applications. I tell you that not as an employee of a NoSQL company, but as a guy who has worked with RDBMS’s for over twenty-five years. In short, you simply can’t get there from here where relational technology is concerned, and that’s why NoSQL must be used for the applications we’re talking about."

Sunday, June 28, 2015

Weekly Update

1. Quote of the Week
My feeling is that the field of NoSQL was created EXACTLY so the data should not be normalized like in relational databases--which has the disadvantages that data needed for real time/online applications needed to be joined at runtime before being used by the application. Under the time constraints of an online system, this is unacceptable. Hence, application developers want to store persistently the data EXACTLY in the way application see it: pre-aggregated, potentially inconsistent, and potentially replicated. Bottom line, there is no "rule" of how you should store the data. Just look at your application needs. Not everyone has the same requirements as iTunes or Netflix, so you don't need to copy their design.
...
If this is a question for you... maybe you shouldn't be using a NoSQL database in the first place !? Why do you think you need one and good old relational databases aren't good for you? Just because it's "fashionable" ? My point is: if you knew exactly WHY you need a NoSQL database, you knew EXACTLY how to structure your data for it.
--LinkedIn.com
With consistency gone, whatever is left?

2. To Laugh or Cry?

Data Modeling in NoSQL
3. Online Debunkings 
4. Elsewhere 
5. Added to LINKS page:
6. And now for something completely different
 

Sunday, June 14, 2015

The Cookbook Approach to Data Management


Fifteen years ago I posted The Myth of Market Based Education @the old dbdebunk.com. Last week I deplored the substitution of tool training for education at an increasingly young age, which prevents independent and critical thinking rather than instilling it:
... a systemic problem that perpetuates itself without a solution and worsens rather than improves, particularly with Google, Facebook, Twitter and Microsoft getting involved in the school and academic systems.
Shortly thereafter
...the San Francisco School Board unanimously voted Tuesday to ensure every student in the district gets a computer science education, with coursework offered in every grade from preschool through high school, a first for a public school district. Tech companies, including Salesforce.com, as well as foundations and community groups, are expected to pitch in funding and other technical support to create the new coursework, equip schools and train staff to teach it.
Basic computer literacy, perhaps, but computer science for pre-schoolers? Tech companies have a unique notion of the "science"--witness "data science"--they want to impart to young children. This week's quote is a description of it by one of my readers as experienced by his son:

1. Quote of the Week

My son, who is a sophomore in high school, had a class in Microsoft Excel and Access this semester. This "class" was created and delivered online, in the classroom, by Microsoft for the school systems. His "instructor" is a baseball coach. Anyway, he asked me for some help with a portion of the Access module on queries. The "lesson," a set of step by step instructions with no explanations, instructs the student to use the "find duplicates" query wizard. Directly following that was the "find unmatched" (meaning in their terms rows in one table that should also be in another table but are not) query wizard. This is yet another example proving your point.
I rest my case. 
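
For readers who have not used those wizards, the two query patterns they stand for can be sketched as follows. This uses Python's built-in sqlite3 module with hypothetical `customers` and `orders` tables; it is an assumption-laden illustration of the general patterns, not the SQL that Access itself generates.

```python
import sqlite3


def demo():
    """Return (duplicate names with counts, order ids with no matching customer)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical tables, deliberately declared without keys or references.
    conn.executescript("""
        CREATE TABLE customers (id INTEGER, name TEXT);
        CREATE TABLE orders (order_id INTEGER, customer_id INTEGER);
        INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Ann');
        INSERT INTO orders VALUES (10, 1), (11, 4);
    """)
    # "Find duplicates": values that occur in more than one row.
    duplicates = conn.execute("""
        SELECT name, COUNT(*) FROM customers
        GROUP BY name HAVING COUNT(*) > 1
    """).fetchall()
    # "Find unmatched": rows whose reference has no counterpart in the other table.
    unmatched = conn.execute("""
        SELECT o.order_id FROM orders AS o
        LEFT JOIN customers AS c ON c.id = o.customer_id
        WHERE c.id IS NULL
    """).fetchall()
    conn.close()
    return duplicates, unmatched


if __name__ == "__main__":
    print(demo())
```

Both queries look for rows that declared key and referential constraints, if enforced by the DBMS, would not have admitted in the first place. That is the kind of explanation a step-by-step lesson leaves out.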

2. To Laugh or Cry?
Small Data - Too many relationships spoil the model...
3. Online Debunkings
Something doesn't make sense
4. Interesting Elsewhere
How to interview an Oracle DBA candidate (NOT)
5. And now for something completely different

Sunday, June 7, 2015

Forward to the Past: Sounds Familiar?

Working on a book of 2000-2006 material from the old dbdebunk.com, I came across the following 10/29/04 exchange. MySQL has probably improved, although adding features post hoc to products that were not explicitly designed for such upgrading is always problematic: the result is more complex and limited than necessary. However, education and foundation knowledge have become worse and, from a foundational perspective, so have products and practices.
JG: I fell asleep dreaming of column constraints. I woke up thinking of foreign keys. I've been married to MySQL for so long that I had no idea all of these other things were possible!

Using a database and not knowing about foreign keys? My immediate reaction was to be astounded. However, he just happens to have begun with the least-robust database product on the market, and his learning is (evidently) confined to whatever product he happens to be using.
Astounded? Nah, standard operating procedure.

Saturday, May 30, 2015

Weekly Update


1. Quote of the Week

In this paper we briefly review some of these issues and then concentrate on the problem of generalizing the formal framework of the relational data model to include null values. A basic problem with null values is that they have many plausible interpretations.
--Database Relations with Null Values, Bell Labs, 1983
No, that's not the basic problem.

2. To Laugh or Cry?

Relational table naming convention
3. Online Debunkings

4. Interesting Elsewhere

5. And now for something completely different

Tuesday, May 26, 2015

R-table Constraints and Data Science

My May post @All Analytics:
R-table Constraints and Data Science
If you comment, please do so at the article itself, not here or on LinkedIn!
Thanks.








New Versions of All 6 Papers


I have just posted descriptions of the new versions of all six papers in the PRACTICAL DATABASE FOUNDATIONS Series:

#1: Business Modeling for Database Design
#2: The Costly Illusion: Normalization, Integrity and Performance
#3: The Last NULL in the Coffin: A Relational Solution to Missing Data
#4: The Key to Keys: A Matter of Identity
#5: Truly Relational: What It Really Means
#6: Domains: The Database Glue

The changes are significant and there are a few error corrections.

Since these are new versions, not revisions, the following applies:

  • Those who ordered in 2015 get free copies.
  • Those who ordered in 2014 get a 50% discount.
Please email me with proof of purchase.

For more details and how to order see PAPERS page.









Sunday, May 17, 2015

Weekly Update

1. Quote of the Week
He started his SQL Server career when he debuted as an accidental DBA in 2005.  Seeing Reporting Services 2005 demoed for the first time sealed the deal, and it has been all data ever since, leaving the worlds of networking and systems admin behind. After being a full-time dev/operational DBA with everything since SQL 2000, he is now back to BI, as a Senior BI Engineer/Consultant. --Online Bio
2. To Laugh or Cry?

3. Online Debunkings


4. Interesting Elsewhere

Obfuscated SQL Contest Winners!
H/t Todd Everett.  

5. And now for something completely different

Saturday, May 9, 2015

On OO Relational "Extensions"

In a LinkedIn thread that followed my Comments On Stonebraker Interview, Erwin Smout mentioned David Maier's 1991 critique of the 1990 Third Generation Data Base System Manifesto (3GM), of which Stonebraker was one of the authors. I was aware of the 3GM, of course, but had not read it because, at the time, it did not benefit from favorable reviews. I considered The Third Manifesto by Date and Darwen more significant, in part because it was authored by relational experts and because it was backed up by a proposed fully computational language with a fully relational component. But when Erwin mentioned Maier's piece, I asked him if he had a copy and he found a scanned PDF copy online.

Having not read the 3GM, I am not in a position to comment on Maier's critique thereof, but I would like to comment on the general topics in his Preliminaries that attracted my attention.

Saturday, May 2, 2015

Weekly Update

Housekeeping: I have added the following:

1. Quote of the Week

I am new to this domain. Please guide me to choose which database to choose among the NOSQL databases. Also which OS the database supports and how to add data to the database(which language). The requirement is to store pictures and alpha numeric s in database. A web server would be designed to extract data from the database and display in web application. The important requirement is scalability so I explored and found that NoSQL database will best fit the requirement. --LinkedIn.com
CJ Date calls this "I don't know how to do my job and am looking for somebody to do it for me."

2. To Laugh or Cry?

Docbase, Graphbase, Colubase, Triplestore ,which better fo RDF triples
3. Online Debunkings
4. Interesting Elsewhere
5. And now for something completely different

Sunday, April 26, 2015

Comments on Stonebraker Interview (UPDATED)

UPDATE: My paraphrasing of David McGoveran was not entirely accurate and the paragraph was revised.


Interviewed about his Turing Award, Michael Stonebraker is "modest" about his jointly-with-others contribution:
... the Ingres database [sic] brought Codd’s lofty relational ideas into the realm of ordinary individuals ... turned [them] into constructs that could be manipulated by ordinary people ... it was argued at the time that RDBMS couldn’t perform, but we showed it could be efficient.
He gives most of the credit to "Ted" Codd:
What Ted proposed was radical ... a complete change from how things were being done in database [sic] ... he turned the problem of data management into one of relations. That dramatically simplified things ... The conventional wisdom was that you should build for the particulars of how the data is stored. He saw that made no sense ... he [moved] the actual manipulation of data away from assembly language programming of the time to higher levels of abstraction that would later become structured query language, or SQL ... He brought principles of encapsulation and abstraction to programming databases, like with a high-level-language in programming.

Saturday, April 18, 2015

Weekly Update

1. Quote of the Week
To clarify my point further, although M doesn't care about how it's implemented, the implementation has a strong influence on the logical structures that it's trying to implement. In a normalized or demoralized [sic] debate, a fully normalized physical schema is always good, when implemented on an infinite performance hardware. --LinkedIn.com

2. To Laugh or Cry?
I recently attended a presentation on Azure DocumentDB, Microsoft's NoSQL cloud product. I made the following notes:

  • Polyglot persistence: Wasn't this what the RDM was supposed to replace?
  • Hierarchy: Didn't we get rid of HDM decades ago?
  • NoSQL: No SQL, but a "SQL-like" language (it's barely relational and now it's used for documents?)
  • No integrity, data independence: Nothing learned from the past.
  • Cloud: At least mainframes were under each company's control.
Progress.

3. Online Debunkings

Comments on "Michael Stonebraker Explains Oracle’s Obsolescence, Facebook’s Enormous Challenge"
4. Interesting Elsewhere
Unskilled and Unaware of It
5. And now for something completely different

Saturday, April 11, 2015

David McGoveran Interview

DBDebunk readers should know of David McGoveran (see his bibliography under FUNDAMENTALS), whose work on relational theory and practice has appeared or been discussed on the old site and here over the years. On more than one occasion I mentioned the Principle of Orthogonal Design (POOD) identified by David, who several years ago published work he did on the subject with Chris Date. The POOD has relevance to updating relations, and particularly views, and led to Date's VIEW UPDATING AND RELATIONAL THEORY book.

I recently mentioned that David's and Date's understandings of the POOD have diverged since their joint effort--currently Date and Darwen reject the POOD as formulated then, and David has problems with Date's understanding of it and with their book, THE THIRD MANIFESTO (TTM).

David is working on a book tentatively titled LOGIC FOR SERIOUS DATABASE FOLKS, in which he will detail his views on the RDM in general, and the POOD and view updating in particular. In the meantime I asked him to publish an early draft of a chapter on the latter subject, which he did--Can All Relations Be Updated?--and which he has just revised.

He has asked me to post a clarification on the nature of the differences with Date and Darwen (see next) and I used the opportunity to interview him about his impressive career, which covers much more than database management. David provided written answers to questions.