DATABASE DEBUNKINGS

Wednesday, June 3, 2026

MEANING: IT’S NOT IN THE NAMES!

Follow @DBDebunk Follow @ThePostWest

Note: In MEANING AND THE DATA MODEL I mentioned a recent LinkedIn exchange that prompted that post. It also reminded me of an older post, which had triggered two exchanges: one @DBDebunk, and another @https://news.ycombinator.com/item?id=12437389&goto=news in which I participated at the time. I decided to update the older post, and I strongly recommend the YCombinator exchange, particularly the comments by catnaroek.

"Can you have 2 tables, VIEWS and DOWNLOADS, with identical structure in a good DB schema (item_id, user_id, time). Some of the records will be identical but their meaning will be different depending on which table they are in. The "views" table is updated any time a user views an item for the first time. The "downloads" table is updated any time a user downloads an item for the first time. Both of the tables can exist without the other."
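
The scenario in the question can be made concrete with a small sketch. It uses Python's built-in sqlite3 module with an in-memory database; the sample values and the choice of (item_id, user_id) as key are assumptions for illustration and are not part of the original question. The same row is inserted into both tables, which, per the question, is to be read as a different fact depending on the table it is in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two tables with identical structure, as described in the question.
# The composite key is an assumption: "for the first time" suggests at
# most one row per (item, user) pair in each table.
for table in ("views", "downloads"):
    conn.execute(
        f"""
        CREATE TABLE {table} (
            item_id INTEGER NOT NULL,
            user_id INTEGER NOT NULL,
            time    TEXT    NOT NULL,
            PRIMARY KEY (item_id, user_id)
        )
        """
    )

row = (42, 7, "2026-06-03T10:00:00")  # hypothetical sample values
conn.execute("INSERT INTO views VALUES (?, ?, ?)", row)
conn.execute("INSERT INTO downloads VALUES (?, ?, ?)", row)

# Rows that appear, value for value, in both tables.
shared = conn.execute(
    "SELECT item_id, user_id, time FROM views "
    "INTERSECT "
    "SELECT item_id, user_id, time FROM downloads"
).fetchall()
print(shared)
```

Nothing in the stored values distinguishes the two rows; only the table in which each appears differs.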

Saturday, May 30, 2026

PAPER REV. 2: PRIMARY KEYS - A NEW UNDERSTANDING

Table of Contents

Introduction

1. Entities, Properties, Names, and Identifiability

2. Relational Representation

3. Kinds of Keys

3.1. Candidate, Primary, and Alternate Keys

3.2. Natural and Surrogate Keys

4. Primary Formal Mandate

5. Primary Key Designation

6. Primary Keys and Performance

6.1. Indexes

Conclusion

Appendix: Primary Keys, Duplicates and SQL

Friday, May 22, 2026

MEANING AND THE DATA MODEL

Recently I participated in a LinkedIn exchange initiated by the following statement:

“First-Order Predicate Logic is quietly breaking enterprise semantics. FOPL assumes meaning can be expressed as predicates applied to entities:

subject → predicate → object

That works beautifully in theory. It fails in practice. Because the predicate is where ambiguity hides. So you can build vast semantic graphs where:
– entities carry multiple meanings
– predicates encode multiple dimensions
– relationships look precise but aren’t disjoint

It scales connections. Not clarity.” –Robert Vane

Wednesday, April 22, 2026

PAPER REV. 2: RELATIONAL DATABASE DOMAINS

Table of Contents

Introduction

1 Data Types

1.1 Abstract Data Types

2 Database Domains

2.1 Domain Adaptations

2.1.1 Attributes

2.1.2 Tuple States

2.2 Domain Definition

2.2.1 Type Specification

2.2.2 Domain Operators

3 Kinds of Domains

3.1 Base Domains

3.2 Primitive Domains

3.3 Atomic ("Simple") Domains

3.4 Derived ("Complex") Domains

4 DBMS Domain Support

Appendix 1: Complex Domain Example

Appendix 2: Type System

Appendix 3: Note on SQL Built-in Data Types

Tuesday, March 10, 2026

LOGICAL DESIGN: INTERPRETATION OF RDM SYMBOLIZED SETS

As we explain in Logical Database Design (forthcoming), LDD assigns the meaning of terms in conceptual models (CMs)—properties, entities, groups, multigroup—to non-logical symbols of a formal logic theory.

If the theory is RDM, the symbols stand for sets (domains/attributes, tuples, relations, database) adapted for database management. For each CM the theory acquires an interpretation, which produces an LM (application) of the theory for database representation and manipulation.

Here are the adapted sets symbolized in RDM which acquire the interpretation of the terms in CMs.

Sunday, March 1, 2026

SEMANTICS, DATABASE RELATIONS, AND TABLES

This was said years ago:

“Table (n.) – a collection of information (data?) describing a population of entities which possess some common characteristics, called attributes. -itis – “suffix denoting diseases characterized by inflammation, itself often caused by an infection.” ---------- from the Wikipedia Wiktionary.”

Tables are the building block of relational databases. Tables must generally be “normalized,” at least to 1NF. That may be an appropriate way to think of databases when implemented in a modern day DBMS. However, it is not the way the world thinks logically. People have no problem with commonly occurring phenomena such as:

· A multi-valued attribute, e.g., an Employee possesses multiple Skills.

· Many-to-many (M:N) relationships, e.g., as between Employees and Projects

· A relationship with attributes

even though our systems may. None of these situations can be handled directly in a relational database."
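
For reference, here is a minimal sketch of how the three situations listed in the quote are conventionally represented in SQL tables, using Python's built-in sqlite3 module. All table names, column names, and sample values are hypothetical. Whether this counts as handling them "directly" is the point in dispute; the sketch only shows the conventional representation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE employee (emp_id  INTEGER PRIMARY KEY, name  TEXT NOT NULL);
    CREATE TABLE project  (proj_id INTEGER PRIMARY KEY, title TEXT NOT NULL);

    -- Several skills per employee: one row per (employee, skill) pair.
    CREATE TABLE employee_skill (
        emp_id INTEGER NOT NULL REFERENCES employee (emp_id),
        skill  TEXT    NOT NULL,
        PRIMARY KEY (emp_id, skill)
    );

    -- M:N relationship between employees and projects, carrying an
    -- attribute (hours) that belongs to the relationship itself.
    CREATE TABLE assignment (
        emp_id  INTEGER NOT NULL REFERENCES employee (emp_id),
        proj_id INTEGER NOT NULL REFERENCES project (proj_id),
        hours   INTEGER NOT NULL,
        PRIMARY KEY (emp_id, proj_id)
    );

    INSERT INTO employee VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO project  VALUES (10, 'Payroll'), (20, 'Billing');
    INSERT INTO employee_skill VALUES (1, 'SQL'), (1, 'Python'), (2, 'SQL');
    INSERT INTO assignment VALUES (1, 10, 30), (1, 20, 10), (2, 10, 40);
    """
)

# All skills of employee 1.
skills = [r[0] for r in conn.execute(
    "SELECT skill FROM employee_skill WHERE emp_id = 1 ORDER BY skill")]

# Everyone assigned to project 10, with the hours of each assignment.
staff = conn.execute(
    "SELECT e.name, a.hours FROM assignment a "
    "JOIN employee e ON e.emp_id = a.emp_id "
    "WHERE a.proj_id = 10 ORDER BY e.name").fetchall()

print(skills)
print(staff)
```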

This just now, on LinkedIn (check out my comments).

“Putting to one side the argument that your data almost certainly didn't start out broken out in to tables, and it almost certainly isn't consumed that way either, here's the thing; MongoDB, if you squint, is essentially a relational database with an unorthodox take on first normal form and some great high availability and scalability features.” -- Graeme Robinson

Sunday, February 15, 2026

FACTS, ENTITIES, AND BUSINESS RULES

“... In ORM there is no concept of an entity record (tuple), although relational tables can be automatically generated from an ORM model (furthermore, guaranteed to be fully normalized).” --Online comment

“Object Role Modeling (ORM) is a ... fact-oriented modeling approach for specifying, transforming, and querying information at a conceptual level. Unlike [other modeling approaches] ... fact-oriented modeling is attribute-free, treating all elementary facts as relationships ... In practice, ORM data models often capture more business rules, and are easier to validate and evolve than data models in other approaches.” --ORM.net

Monday, February 2, 2026

DATA MUDDLING

Chris Date once published an article at the old DBDebunk titled “Models, Models, Everywhere, Nor Any Time to Think”. If you want to get a hold of what he meant then, you oughta do a search on the title now and see what you get.

The continuous proliferation of models is an indication and measure of the industry's disregard of, if not outright hostility to, sound theoretical foundations. It keeps reminding me of a decades-old piece I posted in response to David Hay's critique of Ron Ross's then-proposed “fact model” (yet another one) as an alternative to the data model. It is more relevant than ever, which is why I decided to bring it up to date. The problem is so entrenched and widespread that even those who try to address it fail to realize that they are victims of it too.

Hay correctly observed:

“In our industry, there is a strong desire to put names on things. This is natural enough, given the amount of information that we have to classify and deal with in our work. To give something a name is to gain control over it, and this is not necessarily a bad thing. The problem is when the name takes the place of true understanding of the thing named. Discourse tends to be the bantering of names, without true understanding of the concepts involved.”

In this industry, many of the names are just re-labeling, whether it fits or not. Here are a couple of exquisite examples of both cases:

“I was amused to read in [Ralph Kimball's] article that my own suppliers and parts database design was "a perfect, beautiful star schema!" When I first learned the term "star schema", my reaction was that a properly designed star schema would be neither more nor less than a properly designed schema per se (in other words, one that did obey those scientific principles of relational design that do exist). So to see RK say that my schema was in fact a star schema reminded me (I’m afraid) of Peter Chen’s original E/R paper, in which, among other things, he reinvented the concept of domains, but called them value sets, and then went on to analyze the relational model in terms of his own ideas and said ‘Look, domains are just value sets!’” --C. J. Date

Note: Kimball's "star schema" is, of course, not a relational schema, but rather an attempt to avoid one, which stems from a failure to distinguish application views of the database from the database schema.

Sunday, January 25, 2026

WHAT MEANING MEANS: BUSINESS RULES, PREDICATES, CONSTRAINTS, AND SEMANTIC CONSISTENCY

“If we step back and look at what RDBMS is, we’ll no doubt be able to conclude that, as its name suggests (i.e., Relational Database Management System), it is a system that specializes in managing the data in a relational fashion. Nothing more. Folks, it’s important to keep in mind that it manages the data, not the MEANING of the data! And if you really need a parallel, RDBMS is much more akin to a word processor than to an operating system. A word processor (such as the much maligned MS Word, or a much nicer WordPress, for example) specializes in managing words. It does not specialize in managing the meaning of the words ... So who is then responsible for managing the meaning of the words? It’s the author, who else? Why should we tolerate RDBMS opinions on our data? We’re the masters, RDBMS is the servant, it should shut up and serve. End of discussion.” --Alex Bunardzik, Should Database Manage The Meaning?

Monday, December 22, 2025

Tuesday, January 14, 2025

NEW PAPER: THE FINAL NULL IN THE COFFIN - A RELATIONAL SOLUTION TO MISSING DATA

Friday, December 27, 2024

Season's Greetings!

Monday, July 22, 2024

NEW PAPER: FIRST NORMAL FORM - A DEFINITIVE GUIDE

Sunday, July 14, 2024

NEW PAPER: UNDERSTANDING THE REAL RDM - E.F. CODD 1969-70 PAPERS PART 2

Table of Contents

Introduction
1. Logical Symmetric Access
2. Universal Data Sublanguage
2.1. FOPL vs. SOL
2.2. Relational Completeness
2.3. Computational Completeness and Hosting
3. Kinds of Relations
3.1. Expressible and Named Relations
3.2. Derived Relations
3.3. Data Storage
4. Derived Relations and Redundancy
4.1. Database Consistency
5. Database Catalog
Conclusion

Monday, July 8, 2024

NEW PAPER: UNDERSTANDING THE REAL RDM - E.F. CODD 1969-70 PAPERS PART 1

Table of Contents

Series Preface
Introduction
1. Interpretation of Database Relations
1.1. Attributes as Constrained Domains
1.2. Time-Varying Relations
2. Representation of Database Relations
2.1. Physical Data Independence
2.1.1. Uniquely Named Attributes
2.1.2. Primary Keys
2.1.3. Relations and R-tables
3. Normalization
3.1. First Normal Form and “Simple” Domains
3.2. Normalization and Non-simple Domains
3.2.1. Foreign Keys
Conclusion

Monday, June 17, 2024

SQL AT 50, OR WHY THERE ARE NO RDBMS'S

In "Codd Almighty! Has it been half a century of SQL already?" the Register's Lindsay Clark interviews "Donald Chamberlin, Michael Stonebraker and more" about the legendary programming [sic] language. Chamberlin and Raymond Boyce were the authors of "the 1974 paper SEQUEL: A structured English query language as a way of addressing data in IBM's newly proposed System R, the first database to embody Edgar Codd's paper describing the relational model for database management."

C. J. Date, who worked at IBM at the time, has often stated that the designers of SQL never understood RDM, and I expressed a similar stance in If You Liked SQL, You'll love XQuery. This has had an extremely detrimental effect on database technology--regress rather than progress--none of which transpires in the interview. So here is my reality check take on what you would not know from the interview.

Saturday, June 1, 2024

SMS: DOMAINS & SQL

I am working on entirely new papers (not re-writes) in the PRACTICAL DATABASE FOUNDATIONS series. I have already published two:

THE FIRST NORMAL FORM - A DEFINITIVE GUIDE
PRIMARY KEYS - A NEW UNDERSTANDING

available for ordering from the PAPERS page, and two more:

RELATIONAL DATABASE DOMAINS: A DEFINITIVE GUIDE
DATABASE RELATIONS: A DEFINITIVE GUIDE

are in progress and forthcoming, respectively.

In the process I am coming across common and entrenched industry "pearls" that I am using for my "Setting Matters Straight" (SMS) and "To Laugh or Cry" (TLC) posts on LinkedIn. I do those posts to help the few thinking database professionals left realize how scarce foundation knowledge is, and to illustrate fallacies that abound in the industry, of which they are unaware, and which the papers are intended to dispel.

Time permitting, I may expose and dispel some of those fallacies, treated in more depth in the papers, so that those thinking professionals can test their knowledge and decide whether the papers are a worthy educational investment.

Here's one.

“A domain in most SQL usage is essentially an alias name for an existing type + restrictions on an existing type that can be used in a column. As for an attribute, it's essentially a COLUMN in SQL, a field in other types of databases, etc.”

Can you identify the fallacies before you proceed?

Saturday, May 11, 2024

TLC: TABLES, DIMENSIONS & RDM

I am working on entirely new papers (not re-writes) in the PRACTICAL DATABASE FOUNDATIONS series. I have already published two:

THE FIRST NORMAL FORM - A DEFINITIVE GUIDE
PRIMARY KEYS - A NEW UNDERSTANDING

available for ordering from the PAPERS page, and two more:

RELATIONAL DATABASE DOMAINS: A DEFINITIVE GUIDE
DATABASE RELATIONS: A DEFINITIVE GUIDE

are in progress and forthcoming, respectively.

In the process, I am coming across common and entrenched industry "pearls" that I am using for my "Setting Matters Straight" (SMS) and "To Laugh or Cry" (TLC) posts on LinkedIn. I do those posts to help the few thinking database professionals left realize how scarce foundation knowledge is, and to illustrate fallacies that abound in the industry, of which they are unaware, and which the papers are intended to dispel.

Time permitting, I may expose and dispel some of those fallacies, treated in more depth in the papers, so that those thinking professionals can test their knowledge and decide whether the papers are a worthy educational investment.

Here's one.

“Data is stored in two-dimensional tables consisting of columns (fields) and rows (records). Multi-dimensional data is represented by a system of relationships among two-dimensional tables.”

Thursday, May 2, 2024

My April FTD, TLC & SMS LinkedIn Posts

SMS: PRIMARY KEYS & INDEXES

I am working on entirely new papers (not re-writes) in the PRACTICAL DATABASE FOUNDATIONS series. I have already published two:

THE FIRST NORMAL FORM - A DEFINITIVE GUIDE
PRIMARY KEYS - A NEW UNDERSTANDING

available for ordering from the PAPERS page, and two more:

RELATIONAL DATABASE DOMAINS: A DEFINITIVE GUIDE
DATABASE RELATIONS: A DEFINITIVE GUIDE

are in progress and forthcoming, respectively.

In the process I am coming across common and entrenched industry "pearls" that I am using for my "Setting Matters Straight" (SMS) and "To Laugh or Cry" (TLC) posts on LinkedIn. I do those posts to help the few thinking database professionals left realize how scarce foundation knowledge is, and to illustrate fallacies that abound in the industry, of which they are unaware, and which the papers are intended to dispel.

Time permitting, I may expose and dispel some of those fallacies, treated in more depth in the papers, so that those thinking professionals can test their knowledge and decide whether the papers are a worthy educational investment.

Here's one:

“There seems to be some confusion between what a Primary Key is, and what an Index is and how they are used. The Primary Key is a logical object. By that I mean that it simply defines a set of properties on one column or a set of columns to require that the columns which make up the primary key are unique and that none of them are null. Because they are unique and not null, these values (or value if your primary key is a single column) can then be used to identify a single row in the table every time. In most if not all database platforms the Primary Key will have an index created on it. An index on the other hand doesn’t define uniqueness. An index is used to more quickly find rows in the table based on the values which are part of the index. When you create an index within the database, you are creating a physical object which is being saved to disk.”
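
The behavior the quote describes can be observed in a small sketch, here with Python's built-in sqlite3 module. The table, column, and index names are hypothetical, and other SQL products may behave differently. It shows only what this one product does: a PRIMARY KEY constraint rejects a duplicate key value, a plain index accepts duplicate values, and the product keeps its own index for the key column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE part (part_no TEXT NOT NULL PRIMARY KEY, color TEXT NOT NULL)"
)
conn.execute("CREATE INDEX part_color_idx ON part (color)")  # plain index

conn.execute("INSERT INTO part VALUES ('P1', 'red')")
conn.execute("INSERT INTO part VALUES ('P2', 'red')")  # duplicate color is accepted

try:
    conn.execute("INSERT INTO part VALUES ('P1', 'blue')")  # duplicate key value
    duplicate_key_rejected = False
except sqlite3.IntegrityError:
    duplicate_key_rejected = True

# Each row of index_list starts with (seq, name, unique, ...); map every
# index SQLite keeps for the table to its "unique" flag.
indexes = {row[1]: bool(row[2]) for row in conn.execute("PRAGMA index_list(part)")}

print(duplicate_key_rejected)
print(indexes)
```

In this product the listing contains two indexes: the explicitly created one, flagged as non-unique, and one created automatically for the key, flagged as unique.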

Can you identify the fallacies before you proceed?
