Sunday, April 21, 2013

Un-muddling Modeling: Relations & Relationships Part II

This is a 6/21/17 rewrite of a 4/21/13 post, to bring it in line with McGoveran's interpretation of the true RDM envisioned by Codd[1]. It is the second part of a debunking of a LinkedIn thread (the first part of which was debunked two weeks ago).

Here's what's wrong with last week's picture, namely:

"A conceptual model has no rigorous definition? It is like a sketch of a picture yet to be completed? Or like an outline to a paper to be written or fleshed out? And once the model is rigorously defined, the ad hoc, informal model must be precisely consistent with the underlying model in all its semantics. Are you suggesting that a conceptual model is a precursor to a defined logical (relational) model? Then after the relational model is defined, the conceptual model needs to be a consistent abstraction of the formal logical model. 

What other type(s) of relationships can be explicitly and formally defined in a relational data model? Of course there are many other relationships which can be inferred, such as between an attribute and an entity identifier. Please give me a precise reference to where Codd spoke of relationships [differently than i]n his 1985 piece published in ComputerWorld, [where] he said that the only way to represent a relationship (between entity tables or relations) was through explicitly stored values (i.e., attributes, foreign keys).

What do you mean "Attributes are subsets of domains"? An attribute only exists in the context of a relationship. Something (a domain) is a descriptor of (i.e., is related to) something else (another domain).

What is an "R-table"? What do you mean by a "PICTURE [of a relation]"? There are things and there are views or manifestations/presentations of things. There is the model, and there are various presentations of that model. Is that what you are getting at?"--Gordon Everest,

"Do you mean that...relations are defined over types (also known as domains); a type is basically a conceptual pool of values from which actual attributes in actual relations take their actual values. (taken from the SQL AND RELATIONAL THEORY [2009] by Chris Date). I am also not sure about "pointers". Can I define a domain of pointers? There might be an interesting relation over such domain. In addition, what will happen if I define a relation over a set of types, each of which is (another) relation? Lets say that a relation is either defined over types (domains), or defined over a "heading" (or a "definition") of other relations ... and I also try to eliminate identifiers completely". --AT,

Nothing But Relationships

A conceptual model consists of business rules expressed in natural language in real world informal terms--objects, properties, object groups. While it must be developed as systematically, completely and consistently as possible, it is informal and driven by pragmatic perceptions of reality--it is not rigorous in the formal sense. That is why its database representation requires formalization as a logical model, to which it is, indeed, a precursor. So it's the other way around: the logical model must be a consistent abstraction of the conceptual model, which is the interpretation--meaning--of the logical model. 

The misconception that the RDM represents relationships of only one type--referential--most likely originates with the E/R conceptual modeling approach, which assumes an "absolute" distinction between entities (objects) and relationships. The distinction, however, is in the "eye of the modeler": objects, properties and object groups are all, in fact, relationships labeled differently as a matter of subjective, pragmatic convenience. All the relationships expressed as business rules in a conceptual model are expressible in a relationally complete FOPL-based data language as integrity constraints, which an RDBMS can enforce to keep the database consistent with the rules. That neither SQL nor any other current data language can express all of them--nor can the DBMSs based on those languages enforce them--is a deficiency of the implementations, not an RDM weakness.

  • (Facts about) objects are (1) relationships among property values, represented by tuples;
  • Object groups are relationships (2) among properties and (3) among objects within the groups, represented by tuple and multi-tuple constraints, respectively;
  • A group of object groups is a set of (4) relationships between members of distinct groups, represented in the database by database (multi-relation) constraints.

Referential (FK) constraints are just one type of database constraint representing cross-group relationships in relational databases. That neither SQL nor any of the current non-relational data languages goes beyond PK and FK constraints is their own deficiency, not an RDM weakness.
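
The constraint kinds listed above can be sketched in Python. This is only an illustration under stated assumptions: the EMPLOYEES and DEPARTMENTS relations, their attributes and the three rules are hypothetical, and plain Python predicates stand in for what a relationally complete data language would declare. The database constraint shown is deliberately not a referential one.

```python
from collections import namedtuple

# Hypothetical relations, invented for illustration only.
Employee = namedtuple("Employee", ["emp_id", "dept_id", "salary"])
Department = namedtuple("Department", ["dept_id", "budget"])

employees = {
    Employee(1, 10, 5000),
    Employee(2, 10, 4000),
    Employee(3, 20, 6000),
}
departments = {Department(10, 12000), Department(20, 6000)}


def tuple_constraint(e):
    """Relationship among the property values of one object:
    a salary must be positive."""
    return e.salary > 0


def multi_tuple_constraint(emps):
    """Relationship among all members of a group:
    emp_id values must be unique across the relation."""
    ids = [e.emp_id for e in emps]
    return len(ids) == len(set(ids))


def budget_constraint(emps, depts):
    """Database (multi-relation) constraint that is not referential:
    total salaries per department may not exceed its budget.
    Assumption: a department with no budget tuple has budget 0."""
    totals = {}
    for e in emps:
        totals[e.dept_id] = totals.get(e.dept_id, 0) + e.salary
    budgets = {d.dept_id: d.budget for d in depts}
    return all(total <= budgets.get(dept, 0) for dept, total in totals.items())


def database_is_consistent(emps, depts):
    """The database is consistent only if every constraint holds."""
    return (
        all(tuple_constraint(e) for e in emps)
        and multi_tuple_constraint(emps)
        and budget_constraint(emps, depts)
    )
```

Adding a tuple such as Employee(4, 20, 1) would satisfy the tuple and multi-tuple constraints yet violate the budget constraint, a cross-group relationship that no PK or FK declaration captures.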

Note: As I already explained, (2), (3) and (4) relationships give rise to second, third and fourth order properties[2].

Everest's interpretation of Codd above is an example of logical-physical confusion (LPC). Codd "decreed", in fact, the exact opposite: the Information Principle--his Rule 1--mandates that all information in a relational database--including relationships between relations--be represented not physically, by pointer paths between records, but logically, by values. But it's not the FK that represents the cross-group relationship--the FK is an attribute representing a property in context--it's the referential constraint, which constrains the FK values in the referencing relation to match PK values in the referenced relation.
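
A minimal Python sketch of that distinction, assuming hypothetical CUSTOMERS and ORDERS relations (the names and values are invented for illustration). The tuples hold only values, the cross-group relationship is stated by a separate constraint, and it is resolved by comparing values rather than by following stored pointers.

```python
# Hypothetical relations: tuples are plain values, no record pointers.
customers = {("C1", "Ann"), ("C2", "Bob")}        # (customer_id, name)
orders = {(101, "C1"), (102, "C2"), (103, "C1")}  # (order_id, customer_id)


def referential_constraint(referencing, fk_index, referenced, pk_index):
    """Every FK value in the referencing relation must match
    some PK value in the referenced relation."""
    pk_values = {t[pk_index] for t in referenced}
    return all(t[fk_index] in pk_values for t in referencing)


def orders_with_customer_names(orders, customers):
    """Resolve the relationship logically, by value comparison (a join)."""
    return {
        (order_id, name)
        for (order_id, fk_customer_id) in orders
        for (customer_id, name) in customers
        if fk_customer_id == customer_id
    }
```

The customer_id values in orders are ordinary attribute values. It is the constraint function, not the column, that says every order must belong to an existing customer.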

Attributes Are Constrained Domains

A mathematical relation is an abstraction devoid of any real world meaning--it is fixed in time and can have arbitrary values. A database relation is adapted to represent an object group in the real world. Its data are constrained by the real world context, one component of which is time. Database relations are time-varying, i.e., they represent object groups at various points in time--we distinguish between a relation type and the time-specific relation instances of that type at different times. Domains represent all the possible (valid) values consistent with the properties they represent; attributes represent domains constrained by time and other context factors.
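
As a rough Python illustration of a domain versus an attribute defined on it: the HIRE_DATE attribute, the founding date and the "as of" parameter below are all hypothetical, chosen only to show a domain being narrowed by context, including time.

```python
from datetime import date

# Hypothetical context fact, invented for illustration.
COMPANY_FOUNDED = date(2000, 1, 1)


def in_date_domain(value):
    """Domain: every valid calendar date."""
    return isinstance(value, date)


def in_hire_date_attribute(value, as_of):
    """Attribute: the date domain constrained by context.
    A hire date cannot precede the company's founding,
    nor lie in the future relative to the time `as_of`."""
    return in_date_domain(value) and COMPANY_FOUNDED <= value <= as_of
```

A value such as date(1999, 6, 1) belongs to the domain but not to the attribute, and date(2014, 1, 1) is rejected as of 2013 yet accepted as of 2015, which is the sense in which the attribute's valid values vary with time.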

Relations and R-tables

A relation is a set of logical relationships among attributes and among tuples--all the relationships that a relational database represents must be satisfied at all times. It can be visualized on some physical medium--screen or paper--as an R-table: tuples display as rows, attributes as columns. But the arrangement of rows and columns on the medium (e.g., their order) is insignificant--i.e., not part of the relation specification. An R-table obeys the mathematical set discipline of the relation it displays: unique, unordered rows without missing values and uniquely named, unordered columns.
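
One way to picture the difference between a relation and its R-table display is the Python sketch below. The function name and the sample data are hypothetical, and None is used here as an assumed stand-in for a missing value. Two displays that differ only in row and column order yield the same relation.

```python
def relation_from_display(columns, rows):
    """Build a relation (heading, body) from one tabular display.
    The heading is a set of column names; each tuple is a set of
    (name, value) pairs, so display order carries no meaning."""
    if len(set(columns)) != len(columns):
        raise ValueError("column names must be unique")
    body = set()
    for row in rows:
        if len(row) != len(columns) or any(v is None for v in row):
            raise ValueError("every row needs a value for every column")
        t = frozenset(zip(columns, row))
        if t in body:
            raise ValueError("rows must be unique")
        body.add(t)
    return frozenset(columns), frozenset(body)


# Two displays of the same relation: rows and columns in different order.
display_a = relation_from_display(["id", "name"], [(1, "Ann"), (2, "Bob")])
display_b = relation_from_display(["name", "id"], [("Bob", 2), ("Ann", 1)])
same_relation = display_a == display_b
```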

Domains Vs. Data Types

Date and Darwen equate domains with programming data types, but we follow Codd in considering them distinct[3]. Domains (1) represent real world properties, (2) are types constrained by users to be consistent with those properties and (3) are database objects under DBMS control, while data types need not represent properties and are application objects under programmer control.

Domains can be relation-valued (RVD), but are not relations. As we have repeatedly explained, non-simple domains, including RVDs, require data languages based on logic of a higher order than first order predicate logic (FOPL), which would defeat core advantages of the RDM: declarativity, decidability and physical independence (PI). One of the defining properties of the set of members of an object group as a whole--distinguishability--arises from a relationship among all the members. It is represented in the database by a uniqueness constraint on the PK attribute(s), which represent the object identifier that AT tries to "eliminate". So, on the one hand, the RDM is erroneously criticized for an inability to express relationships, while on the other hand, the very means of representing relationships that are not only expressible, but actually mandatory, is to be "eliminated". That's lack of foundation knowledge for you.


[1] McGoveran, D., LOGIC FOR SERIOUS DATABASE FOLK, forthcoming.

[2] Pascal, F., What Meaning Means: Business Rules, Predicates, Constraints, Integrity Constraints and Database Consistency.

