THE FINAL NULL IN THE COFFIN*
A LOGICALLY CORRECT SOLUTION TO MISSING DATA

 

 

*Thanks to Hugh Darwen for the permission to use this title, which does not imply endorsement of the ideas expressed in this seminar.

 

 

 

OVERVIEW

 

As the volume of writing and the heat of the debate on the subject attest, the treatment of missing data is arguably one of the thorniest aspects of database management. Users are caught between a rock and a hard place. They can rely on SQL's problematic version of three-valued logic (3VL) based on NULLs, and risk complexity, unintuitive behavior, hard-to-interpret database answers, and hard-to-detect errors in integrity enforcement and query results. Or they can take on themselves the prohibitive burden of a complex database function that belongs in the DBMS, which is a lost cause.
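
As a small illustration of the kind of hard-to-interpret answer meant here, the sketch below uses Python's built-in sqlite3 module. The table emp, its columns, and the sample rows are hypothetical and are not taken from the seminar material. It shows only the general SQL behavior that a row whose salary is NULL satisfies neither a condition nor its negation.

```python
import sqlite3


def null_demo():
    """Count rows matching a predicate, its negation, and all rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table: Bob's salary is missing and stored as NULL.
    conn.execute("CREATE TABLE emp (name TEXT, salary INTEGER)")
    conn.executemany(
        "INSERT INTO emp VALUES (?, ?)",
        [("Ann", 50000), ("Bob", None), ("Cy", 70000)],
    )
    high = conn.execute(
        "SELECT COUNT(*) FROM emp WHERE salary > 60000"
    ).fetchone()[0]
    not_high = conn.execute(
        "SELECT COUNT(*) FROM emp WHERE NOT (salary > 60000)"
    ).fetchone()[0]
    total = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]
    conn.close()
    return high, not_high, total


if __name__ == "__main__":
    high, not_high, total = null_demo()
    # The two counts do not add up to the total: the NULL row is in neither.
    print(high, not_high, total)
```

Under two-valued logic a user would expect the two counts to sum to the total. Here they do not, because the comparison against NULL evaluates to unknown and the WHERE clause silently drops the row in both queries.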

 

 

OBJECTIVES

 

This seminar:

 

1. Summarizes

·          the drawbacks of the many-valued logic approach to missing data in general

·          SQL's problematic and poorly implemented flavor of three-valued logic via NULLs in particular

2. Proposes a solution that:

·          is logically correct

·          is within the relational framework

·          is fully supportable by a TRDBMS implemented via the TransRelational™ Model

·          does not require user intervention

 

 

OUTLINE

 

Ø       FUNDAMENTALS

·         Database Management

·         Levels of Representation

·         Interpretation

·         The Logic of the Real World

·         "Inapplicable" Values

·         Entity Supertype-Subtype

·         Missing Values

 

Ø       MANY-VALUED LOGIC

·         Codd's 4VL

·         The Information Principle

·         Realm Confusion

·         3VL vs. 2VL

·         3VL: Problems

·         3VL: Consequences

 

Ø       SQL NULLS

·         Representation

·         3VL and More

·         NULLs In Practice: Users

·         NULLs In Practice: Products

·         Caveat Emptor

 

Ø       A 2VL SOLUTION

·         Don't Assert Your Ignorance

·         Nonrelational "Union"

·         Correct But Incomplete

·         Missing Data

·         Whose Property?

·         Metadata

·         Known To Be Unknown

·         TRDBMS Support

§    Known To Be True

§    Known To Be Unknown

§    Simple Case

§    More Realistic Case

§    Manipulation: Projection

§    As Appropriate

§    Multi-R Results

§    App Presentation Option

§    Explicit Manipulation

·         Generality and Soundness

·         Advantages

 

Ø       2VL/RM vs. SQL

·         Natal DB: Conceptual Model

·         Logical Design for SQL

·         Complications

·         Consequences

 

Ø       THE TRANSRELATIONAL™ IMPLEMENTATION MODEL

·         Relvar Proliferation

·         Extenuating Factors

·         Full Data Independence

·         Bring Them On!

 

Ø       MISCONCEPTIONS DEBUNKED

 

Ø       CONCLUSION & RECOMMENDATIONS

·         Perfect vs. Imperfect Knowledge

·         Soundness & Implementability

·         Practicality

·         Recommendations

 

 

AUDIENCE  

 

Anybody involved in data management, technical or nontechnical. Some data management background may be helpful but is not assumed. The target audience includes (but is not limited to):

 

§   DBMS designers, implementers, and other vendor personnel

§   Database consultants

§   Data and database administrators

§   Product evaluators, acquirers and deployers

§   IT managers

§   Information modelers and database designers

§   Application developers and deployers

§   Data warehouse implementers

§   Members of the trade media covering data management

§   Academics specializing in data management topics

§   Students, graduate and undergraduate

 

 

DOCUMENTATION

 

Workbook containing the instructor’s slides and a copy of the PRACTICAL DATABASE FOUNDATIONS paper #8, which serves as text for the seminar.

 

 

INSTRUCTOR

 

Fabian Pascal has a national and international reputation as an independent technology analyst, consultant, author and lecturer specializing in data management. He was affiliated with Codd & Date and for 20 years held various analytical and management positions in the private and public sectors. He has taught and lectured at the business and academic levels, and has advised vendor and user organizations on data management technology, strategy and implementation. Clients include IBM, the Census Bureau, the CIA, Apple, Borland, Cognos, UCSF, and the IRS. He is founder, editor and publisher of DATABASE DEBUNKINGS, a web site dedicated to dispelling persistent fallacies, flaws, myths and misconceptions prevalent in the IT industry. Together with Chris Date he has recently launched the PRACTICAL DATABASE FOUNDATIONS series of papers, which also serve as text for his seminars. The author of three books, he has published extensively in most trade publications, including DM Review, Database Programming and Design, DBMS, Byte, InfoWorld and Computerworld. He is the author of the contrarian columns Against the Grain, Setting Matters Straight, and Test Your Foundation Knowledge.