Sunday, March 26, 2023

RELATIONSHIPS AND THE RDM V2 PART 1: RELATIONS & DATABASE RELATIONS



with David McGoveran

 

Note: This a multi-part re-write of a previous series intended, when completed, to replace it. In the meantime you can consult the old version -- there is nothing wrong with it.

No matter how much I demonstrate the absence of foundation knowledge in the IT industry, the mountain of evidence and reasoning is dismissed -- how can everybody be wrong and only Fabian Pascal be right?

Well, consider the following (from a college prof no less):
“... we are not modeling objects/entities/attribute ... at all in the relational model, [but] a bunch of relationships ... hence perhaps Codd was correct in calling it a "relation", a bunch of relationships ... Interesting that most people think of relationships as being the distinguishing characteristic of a relational model and it is not ... [it] has no relationships since Codd decreed that all relationships must be represented by foreign keys, which are exactly the same as "attributes ... What [other] type(s) of relationships can be explicitly and formally defined in a relational data model? Of course there are many other relationships which can be inferred, such as between an attribute and an entity identifier. Please give me a precise reference to where Codd spoke of relationships [differently than i]n his 1985 piece published in ComputerWorld, [where] he said that the only way to represent a relationship (between relations) was through explicitly stored values (i.e., attributes, foreign keys) ... In my personal understanding, a relation is defined as a set of tuples. Then ... "in the relational model every relation represents a relationship". And then a quote from Chen: "each tuple of entities ... is a relationship". If I use the first and the second statements - I can say that a relationship is a set of tuples. The third statement says that a relationship is a tuple. So far, is a relationship a set of an element of a set? (Or may be a set of sets?) ... I argue that there is essentially no difference between relationships between entities of distinct classes and between properties of the same class. They both represent relationships. A property can represent a relationship between entities of distinct classes. If such relationships are represented by foreign keys and the relations representing the classes must be in 1NF, then relational databases can represent only M:1 relationships, a very unnecessary limitation when modeling some reality of interest ... The entity-relationship model is essentially a directed graph model, where relationships are prominent residents. Not so in the relational model (despite the name), where relationships (between relations, mind you) are not visible and in the SQL implementations is reduced to constraints. Relationships are about structure, which is as important as meaning (the semantics of the terms used in the universe being modeled).” --LinkedIn.com

The amount of nonsense squeezed into this rambling is mind boggling. No understanding of the RDM, confusion, abysmal reasoning and misuse of terms -- debunking it in its entirety would be practically impossible (believe me, I tried). Instead, I focus on a critical aspect of the RDM of which there is little grasp in the industry: I convey the fundamentals and leave it to the motivated reader to try their own debunking -- the most effective way I know to learn.

------------------------------------------------------------------------------------------------------------------

SUPPORT THIS SITE
DBDebunk was maintained and kept free with the proceeds from my @AllAnalitics column. The site was discontinued in 2018. The content here is not available anywhere else, so if you deem it useful, particularly if you are a regular reader, please help upkeep it by purchasing publications, or donating. On-site seminars and consulting are available.Thank you.

LATEST POSTS

03/26 RELATIONSHIPS AND THE RDM V2 PART 1: RELATIONS & DATABASE RELATIONS

03/04 ON NORMALIZATION AND THE SCIENTIFIC METHOD (t&n)

02/04 CONCEPTUAL MODELING, LOGICAL DATABASE DESIGN AND PHYSICAL IMPLEMENTATION (sms)

UPDATES

03/26 Added Modeling of Integrity Constraints Dependencies

03/14 Added Russell’s On Denoting to LINKS page

03/14 Added Russell’s Paradox to LINKS page.

LATEST PUBLICATIONS (order from PAPERS and BOOKS pages)

08/19 Logical Symmetric Access, Data Sub-language, Kinds of Relations, Database Redundancy and Consistency, paper #2 in the new UNDERSTANDING THE REAL RDM series.
02/18 The Key to Relational Keys: A New Understanding, a new edition of paper #4 in the PRACTICAL DATABASE FOUNDATIONS series.
04/17 Interpretation and Representation of Database Relations, paper #1 in the new UNDERSTANDING THE REAL RDM series.
10/16 THE DBDEBUNK GUIDE TO MISCONCEPTIONS ABOUT DATA FUNDAMENTALS, my latest book (reviewed by Craig Mullins, Todd Everett, Toon Koppelaars, Davide Mauri).

USING THIS SITE
- To work around Blogger limitations, the labels are mostly abbreviations or acronyms of the terms listed on the
FUNDAMENTALS page. For detailed instructions on how to understand and use the labels in conjunction with that page, see the ABOUT page. The 2017 and 2016 posts, including earlier posts rewritten in 2017 were relabeled accordingly. As other older posts are rewritten, they will also be relabeled. For all other older posts use Blogger search.
- The links to my AllAnalytics columns no longer work. I re-published only the 2017 columns @dbdebunk, and within them links to sources external to AllAnalytics may or may not work.

SOCIAL MEDIA
I deleted my Facebook account. You can follow me @DBDdebunk on Twitter: will link to new posts to this site, as well as To Laugh or Cry? and What's Wrong with This Picture? posts, and my exchanges on LinkedIn.
-----------------------------------------------------------------------------------------------------------------
-

Relationships and Relations

 In large part reinforced by Chen's E/RM and the variety of its distortions circulating in the industry, the notion that what distinguishes the RDM is "relationships between tables" using FKs is one of the oldest and most common in the industry. 'Relational' is even attributed to those relationships that are, in fact, between database relations (not tables), with which the term actually originates. What is more, many who cling to that notion claim that that's the only type of relationship the RDM can express (do a Web search on 'database' + 'relationships' and see what you get). If you ask those practitioners  exactly what other relationships are there at the conceptual level that the RDM cannot express at the logical level, you will get blank stares, or more babble. The author of the above alleges one such relationship, namely that "between an attribute and an entity identifier" (actually, should be attributes), but are there others? Can you specify them?

Note: Confusion of levels of representation inhibits understanding of the RDM and contaminates the above quote. To avoid conceptual-logical conflation (CLC) we prefer, inasmuch as possible, distinct terminology at each level. The RDM does have a distinct term at the logical level for relationships at the conceptual level, but as we shall see, it's not possible to avoid using 'relationship' at both levels, so herein we prepend the term with 'c-' and 'l-'.

Relations and Database Relations

 

RDM is an adaptation of mathematical relation theory (MRT) to database management.

Note: We prepend domains, relations and tuples with 'database' to distinguish the adapted RDM concepts from their MRT mathematical counterparts from which they derive.

One of the few approximately correct statements in the quote is:

 “...[A] relation is defined as a set of tuples ... in the relational model every [database] relation represents a relationship”.
The first part is correct insofar as it goes, but it leaves out much that is essential, because not every set of tuples qualifies as a relation.

In modern expositions of MRT, a relation is a set of n-tuples that represents an abstract (i.e., uninterpreted) relationship between n sets of elements such that the mth element of each tuple is from mth set. In other words, a relation can be defined as any subset of the Cartesian product (Cp) of n sets, formed either by removing tuples or by uniformly removing one or more elements of all tuples of the Cp.  For example, given the three domains (n = 3):

{1,2}
{x,y}
{A,B}


we obtain the following Cp of the sets, itself a relation:

<1,x,A>*
<1,x,B>
<1,y,A>*
<1,y,B>
<2,x,A>
<2,x,B>
<2,y,A>
<2,y,B>*


and the removal of the tuples not marked (by "*") defines the derived relation:
 
<1,x,A>
<1,y,A>
<2,y,B>


Removing the second element from each tuple in this relation defines yet another derived relation:

<1,A>
<2,B>


For the purposes of this discussion note carefully that in MRT:
  • Tuples and, thus, relations are abstract, mathematically arbitrary (i.e., having no intrinsic meaning) sets of elements;
  • Tuples are ordered sets of elements (order is significant) and are unique by mathematical definition (no duplicates);
  • Domains are implicitly defined by defining (exhibiting) a relation (i.e., as n sets of elements);
  • No concept of attributes and keys and exist.

Note: In MRT sets are abstract and fixed (there is no mechanism for expressing time dependence). The mth domain is just the set of mth values of all the n-tuples in a relation -- the concept of attributes distinct from domains makes no sense. It is the adaptation of the theory to time-varying database relations with adherence to relational algebra (RA) that distinguishes attributes from domains. (see The Interpretation and Representation of Database Relations and Domains: The Database Glue.)

In elementary algebra, abstract symbols and expressions involving them can be assigned interpretations and still be manipulated algebraically (i.e., mechanically, without concern for meaning differences). So can relations and expressions involving them in MRT. As an adaptation of MRT, RDM inherits this feature -- specifically, using simple set theory and most of first order predicate logic (SST/FOPL) -- for database management: it provides a partial interpretation of MRT that enables each relational database to represent some conceptual model of reality without loss of set-mathematical properties. 

Note: The following is very helpful before you proceed:
The Interpretation and Representation of Database Relations
Understanding Conceptual vs. Data Modeling Part 4: Properties-object Modeling

The following table shows the differences between relations and database relations adapted from them.


Note, in particular, that constraints are necessary to ensure that databases are consistent with the conceptual models they are intended to represent. They provide the partial interpretation (i.e., a semantics) for RDM symbols and expressions. There are several types of semantic constraint:

  • Domain constraints;
  • Relation constraints

- attribute constraints;
- tuple constraints;
- multi-tuple constraints;

  • Multi-relation (a.k.a. database) constraints.

Note: Multi-relation constraints include referential constraints that represent inter-relation relationships using FKs,  but here we focus on the other types of relationship. FKs are discussed in Foreign Keys series and Fourth Order Properties: Association Relations vs. Foreign Keys.

RDM is, thus, not "just theory" -- the adaptation from MRT renders it applied theory-- applicable to database management.

As we shall see, some of the features added by the adaptation represent at the logical level  relationships at the conceptual level. Practitioners miss them because, as part of poor grasp of RDM, they are mostly unaware of the adaptation and hence fall prey to the fallacy that the RDM cannot express them.

We now have the fundamentals necessary to tackle relationships in RDM. We will use a simple example that  makes everything clear, so stay tuned for Part 2.

 

 

 

No comments:

Post a Comment

View My Stats