TRULY RELATIONAL - WHAT IT REALLY MEANS
Fabian Pascal PDBF Paper #1v2 (June 2005)

 

 

 

ABSTRACT

 

In the first paragraph of his first published exposition of the relational idea, 36 years ago, Codd made clear three critical advantages of a relational model of data:

 

·   A scientific, and therefore sound, formal basis (logic and mathematics) for database management

·   Physical data independence

·   System-guaranteed integrity of data and query results.

 

He was also explicit about his intent to address deficiencies of the hierarchical and network (graph) approaches underlying commercial products at the time, which lacked those beneficial properties.

 

Yet from 1969 to date the industry has botched the concretization of Codd’s ideas by implementing SQL-based products that bear limited resemblance to them and violate relational principles left and right. Moreover, instead of correcting their mistakes, vendors, including IBM (where the relational model was invented) and Oracle (the first implementer of a SQL DBMS), are currently regressing to the same costly and unproductive technology made obsolete by Codd’s innovation more than thirty years ago.

 

This is mostly due to the utter failure of industry and users to educate themselves about the relational model, to understand it, and to appreciate the practical value of Codd’s contribution and the huge cost of ignoring it. Indeed, younger generations of practitioners are not formally introduced to the model and are instilled with the notion that SQL products are relational. Driven by products rather than principles, academia fails to provide the necessary knowledge.

 

It is therefore imperative—and proper for this series, intended to make data fundamentals accessible to practitioners—to revisit Codd’s original work, reassert those aspects that have been ignored, recall those that were missed, clarify those that are opaque, correct misinterpretations as well as original mistakes, and settle some current disagreements over what the relational model really is.

 

This paper covers Codd’s seminal first two papers: Derivability, Redundancy and Consistency of Relations Stored in Large Data Banks (1969) and A Relational Model of Data for Large Shared Data Banks (1970). The latter is an important public revision of the former, internal paper; it contains changes and introduces new material.

 

·         INTRODUCTION

·         RELATIONS ON DOMAINS

·         RELATION REPRESENTATION

·         TIME-VARYING RELATIONS VS. RELVARS

·         RELATION INTERPRETATION

·         DATA SUBLANGUAGE

·         ATOMICITY, NESTED RELATIONS, AND NORMALIZATION

·         FOREIGN KEYS AND (FIRST) NORMAL FORM

·         OPERATIONS ON RELATIONS

·         KINDS OF RELATIONS

·         DERIVABILITY, REDUNDANCY, CONSISTENCY

·         DEBUNKING MISCONCEPTIONS

·         CONCLUSION

·         REFERENCES

·         ADD-ON: DAVID MCGOVERAN ON THE 1969 RELATIONAL OPERATIONS

 

 

