ON WHAT A DATA MODEL IS *NOT*
with Fabian Pascal

 

 

 

From: Perry Valdez

Date: Jan 25 2006

 

I've read Dawn Wolthuis' latest blog post about what a data model really is, as the term is used in "relational data model".

 

She made the following points:

 

1. The implementation of a data model is a programming language.

 

She got that conclusion from her analysis of Chris Date's definition of a data model:

 

If we were to implement a data model what would we have? Let's take a look at a recent definition of data model from Date.

 

A data model is an abstract, self-contained, logical definition of the objects, operators, and so forth, that together constitute the abstract machine with which users interact. The objects allow us to model the structure of data. The operators allow us to model its behavior. (C. J. Date, AN INTRODUCTION TO DATABASE SYSTEMS, Addison Wesley, 8e, 2003, pp. 15-16)

 

I conclude from this that the implementation of a data model is a programming language, whether a general purpose programming language or not.

 

I'm not sure whether data languages (e.g., SQL and Tutorial D) qualify as programming languages. But maybe, in a broader sense, we can say that data languages are also programming languages, in that we use them to "program" (i.e., declare and manipulate) our data. So if only the relational data model had been implemented correctly, the industry would have produced better data languages (i.e., the D languages). Am I right?

 

2. The RM is not necessary. It is not necessary for developing software solutions, maintaining large shared databases, or any other purpose in the world of software development. Any software solutions that can be developed while employing the RM could be written without it, using other data models.

 

I don't know how she derived #2 from #1, and she did not state the consequences of not using RM. But she said that she will follow it up in a future post.

 

 

From: Fabian Pascal

 

You should stop paying attention to her. I made it clear what her problem is. Here’s more of her drivel [Ed. Note: I guess some thinking persons must learn the hard way they’re wasting their time.]

 

The reason Codd went for a data sublanguage was precisely because he wanted it to be based on First Order Logic—which is declarative and nonprocedural—and not involve the procedurality of programming languages. Tutorial D is supposed to be a computationally complete language, a deviation from Codd's intent.

 

Ed. Note: The computational completeness required of a programming language runs into self-referencing and Gödel’s undecidability. This is not a problem for a data language that does not have to be computationally complete.

 

It is possible to embed a declarative set-based data sublanguage into a general purpose computationally complete language, but extreme care must be taken to do it correctly (SQL is not it). It is also possible to have multiple data languages (syntax) concretizing the relational model (whatever a syntax is, it must be based on some data model). The objectives of RM can be achieved by the other two known data models (hierarchic and network), but in a much inferior way for all sorts of reasons. Which is why Codd invented RM in the first place!
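
To make the embedding point concrete, here is a minimal sketch in Python, offered as an illustration only and not as anything taken from Codd, Date, or a real DBMS. It assumes a relation body can be modeled as a frozenset of named tuples; the names Emp, EMPLOYEES, restrict, and project are hypothetical, and the operators are a tiny subset of what a true relational language would provide. It contrasts a set-level, declarative expression embedded in a general purpose host language with the record-at-a-time procedure that computes the same result.

```python
from collections import namedtuple

# Hypothetical relation heading, for illustration only.
Emp = namedtuple("Emp", ["name", "dept", "salary"])

# A relation body is a set of tuples: no duplicates, no ordering.
EMPLOYEES = frozenset({
    Emp("Ann", "R&D", 120),
    Emp("Bob", "R&D", 90),
    Emp("Cy", "Sales", 100),
})


def restrict(relation, predicate):
    """Set-level restriction: the tuples of `relation` that satisfy `predicate`."""
    return frozenset(t for t in relation if predicate(t))


def project(relation, attrs):
    """Set-level projection onto `attrs`; duplicates collapse because the result is a set."""
    return frozenset(tuple(getattr(t, a) for a in attrs) for t in relation)


# Declarative style: the host program states WHAT it wants as a single
# expression over whole relations; no iteration order is specified.
rd_names = project(restrict(EMPLOYEES, lambda t: t.dept == "R&D"), ["name"])


def rd_names_procedural(rows):
    """Procedural counterpart: HOW to build the result, one record at a time."""
    result = []
    for row in rows:
        candidate = (row.name,)
        # Duplicate elimination must be coded by hand here.
        if row.dept == "R&D" and candidate not in result:
            result.append(candidate)
    return result
```

Both forms yield the same set of names, but only the first leaves the access strategy unstated. Note also that the sketch borrows the host language's lambdas for predicates, which is exactly the kind of shortcut a carefully designed embedding would have to treat with more rigor.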

 

To realize this is not that hard, but one must have (a) sufficient knowledge of the subject matter and (b) some reasoning ability. Dawn has neither. She is a good example of what I call vociferous ignorami, a member of the Unskilled and Unaware of It Class that populates the IT industry, and American society in general.

 

 

Posted 3/17/06

© Fabian Pascal 2006 All Rights Reserved