THE FINAL NULL IN THE COFFIN*? OUTLINE OF A RELATIONAL SOLUTION TO MISSING DATA
Fabian Pascal PDBF Paper #4v2 (December 2005)

 

 

 

ABSTRACT

 


This is a thoroughly revised version of v. 1. It relies on terminology and concepts developed in the recently published Conceptual Modeling and Database Design: A Foundation Framework for Data Management (referred to as paper #2 for short), which is strongly recommended as a preamble.


 

As attested by the volume of writings and the heat of the debate on the subject (see references), the treatment of missing data has possibly been one of the thorniest aspects of database management. Users are left between a rock and a hard place: they can either rely on SQL's problematic version of three-valued logic based on NULLs, and risk hard-to-interpret database answers and/or hard-to-detect errors in integrity enforcement and query results, or take on themselves the prohibitive burden of a complex database function that belongs in the DBMS.
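To make the "hard-to-interpret answers" concrete, here is a minimal sketch using Python's standard sqlite3 module as a stand-in SQL DBMS. The emp table, its rows, and the count_rows helper are hypothetical illustrations, not taken from the paper. It shows two well-known effects of NULL comparisons evaluating to "unknown": a condition that looks like a tautology silently drops a row, and a NOT IN list containing NULL returns nothing.

```python
import sqlite3

def count_rows(where_clause):
    """Count rows of a small hypothetical emp table that satisfy where_clause."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp (name TEXT, salary INTEGER)")
    conn.executemany(
        "INSERT INTO emp VALUES (?, ?)",
        [("Ann", 50000), ("Bob", 60000), ("Cy", None)],  # Cy's salary is missing
    )
    (n,) = conn.execute(
        f"SELECT COUNT(*) FROM emp WHERE {where_clause}"
    ).fetchone()
    conn.close()
    return n

# All three rows exist.
total = count_rows("1 = 1")

# Looks like a tautology, but Cy's row is excluded: both comparisons
# against NULL evaluate to unknown, and WHERE keeps only true rows.
tautology = count_rows("salary = 50000 OR salary <> 50000")

# A NULL in the list makes every non-matching comparison unknown,
# so no row at all qualifies.
not_in = count_rows("salary NOT IN (50000, NULL)")

print(total, tautology, not_in)
```

Neither result is an error message; both are silently "valid" answers, which is what makes such discrepancies hard to detect in practice.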

 

This paper summarizes the drawbacks of the many-valued logic approach to missing data, and of SQL’s problematic and poorly implemented flavor of three-valued logic via NULLs, and proposes a possible solution within the two-valued logic/relational framework. The solution (a) separates unknown, and therefore missing, data from “inapplicable”, and therefore nonmissing, data, and provides proper design guidelines to avoid the latter; (b) treats missing data correctly as metadata; and (c) yields logically correct answers with respect to the real world, without the complications and problems of many-valued logic and SQL’s NULLs.

 

It is also argued that the TransRelational™ Model of implementation, which facilitates the design of high-performance, fully data-independent true RDBMSs, lends itself particularly well to the proposed missing data solution.

 

·   INTRODUCTION

·   THE LOGIC OF THE REAL WORLD

·   “INAPPLICABLE VALUES”: A RED HERRING

·   INTO THE UNKNOWN: THREE-VALUED LOGIC

·   NOT OF THIS WORLD: SQL NULLS

·   DON’T ASSERT WHAT YOU DON’T KNOW

·   MISSING DATA: DATA ABOUT DATA

·   2VL VS. SQL: A REAL-WORLD COMPARISON

·   “TOO MANY” R-TABLES?

·   THE TRANSRELATIONAL™ IMPLEMENTATION MODEL

·   SOME MISCONCEPTIONS DEBUNKED

·   CONCLUDING REMARKS

·   REFERENCES

 


This paper should be considered investigative in character. Further research is required at both the logical and implementation levels, but we believe that the idea is sound and implementable.


 

 
