Saturday, October 9, 2021

Relational and Referential Integrity

“Relational Data Integrity is like every other integrity constraint that checks that the relationships created between data using foreign keys has a consistency. This can be done by using ON UPDATE, ON DELETE constraints on the table.”
I recently quoted this as one of my To Laugh or Cry? items on LinkedIn, which initiated an exchange triggered by the following question:
“You have a better definition? What is it?”
In the exchange, the asker's interpretation seemed to be "referential constraints are constraints like any other constraints, so there is no problem". It is hard to recognize misconceptions without a proper understanding of the relational data model (RDM). Let us set aside the fact that the above is not really a definition and focus on debunking it.

Decades ago I wrote an article in DATABASE PROGRAMMING AND DESIGN carrying the double-meaning title Integrity Is Not Only Referential, in which I debunked Borland's claim that its Paradox file manager supported referential integrity (at the time no PC product did). As one component of the RDM, database integrity is, of course, a DBMS function, but Paradox relegated it to applications. Then, as now, one of the most common and entrenched misconceptions was that relational comes from "relationships between tables" and so relational integrity amounts to referential integrity (RI). RI is, of course, but one of several components of relational integrity -- it is necessary, but insufficient. While practitioners are familiar with referential and PK constraints, very few can say, if asked, what other constraints relational integrity comprises. Having enumerated them recently on LinkedIn, I asked this very question:
“... what other RELATIONAL constraints ARE there and what is their purpose? I recently posted a weekly truth and other items here that answer it.”
which went unanswered.

Data integrity is one of the three components of the RDM, together with data structure and manipulation. It consists of several categories of constraints, which I have detailed more than once, most recently in Understanding Relational Constraints, to which I referred the asker (can you give an example for each category?). Defining relational integrity means specifying all the constraint categories required by the RDM.

Consider now the above paragraph: it purports to define relational integrity, but it specifies functionality of referential integrity -- implying the old misconception I wrote about decades ago. The asker did not seem to comprehend the distinction:
“I can't see a problem here. Isn't it simply as follows? ... A *referential integrity constraint* ensures consistency between attributes of different entities - e.g. between primary and foreign keys of related entities (aka relational integrity). Isn't that what the definition says?”
Yes, it is the definition of referential integrity, but not of relational integrity -- there is more to the latter than the former. No matter in how many ways I tried to explain this, I was unable to convey it, because it's practically impossible in the absence of sufficient knowledge and understanding of the RDM.
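
To make the distinction a little more concrete, here is a minimal sketch, under stated assumptions, of a referential constraint declared alongside constraints that have nothing to do with foreign keys. SQLite, driven through Python's sqlite3 module, is used only as a convenient stand-in. The tables, columns, and the violates helper are hypothetical, and the labels in the comments are illustrative, not the category list from Understanding Relational Constraints.

```python
import sqlite3


def violates(conn, sql):
    """Return True if the DBMS itself rejects the statement (hypothetical helper)."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Autocommit mode, so each statement stands alone.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

conn.execute("""
    CREATE TABLE department (
        dept_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL UNIQUE
    )""")
conn.execute("""
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,                  -- key constraint
        salary  INTEGER NOT NULL CHECK (salary > 0),  -- single-column rule
        bonus   INTEGER NOT NULL,
        dept_id INTEGER NOT NULL
                REFERENCES department (dept_id)       -- referential constraint
                ON UPDATE CASCADE ON DELETE RESTRICT,
        CHECK (bonus <= salary)                       -- rule spanning two columns of a row
    )""")

conn.execute("INSERT INTO department VALUES (1, 'Sales')")
conn.execute("INSERT INTO employee VALUES (10, 5000, 500, 1)")

# One attempted violation per declared constraint; only the first and last
# involve a foreign key.
results = {
    "referential (no such department)":
        violates(conn, "INSERT INTO employee VALUES (11, 5000, 0, 99)"),
    "single-column rule (salary > 0)":
        violates(conn, "INSERT INTO employee VALUES (12, 0, 0, 1)"),
    "row rule (bonus <= salary)":
        violates(conn, "INSERT INTO employee VALUES (13, 1000, 2000, 1)"),
    "key (duplicate emp_id)":
        violates(conn, "INSERT INTO employee VALUES (10, 1000, 0, 1)"),
    "referential (delete a referenced department)":
        violates(conn, "DELETE FROM department WHERE dept_id = 1"),
}

for label, rejected in results.items():
    print(f"{label}: {'rejected' if rejected else 'accepted'}")
```

Only two of the five attempts are stopped by the foreign key; the other three are stopped by constraints that ON UPDATE and ON DELETE clauses cannot express. In every case it is the DBMS, not application code, that does the rejecting.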




