Sunday, May 20, 2018

Relationships and the RDM Part 2: Integrity Constraints


Note: This is the second of a three-part rewrite of two earlier posts, to bring them into line with McGoveran's formalization and re-interpretation of Codd's true RDM[1]. For the in-depth treatment see [2,3].


(Continued from Part 1)
 
In Part 1 I referred to an approach to conceptual modeling[2] inspired by McGoveran's recent work formalizing and re-interpreting Codd's work[1] that, unlike conventional entity-relationship modeling (E/RM)[4], recognizes rather than obscures several types of relationships among (1) individual entity properties, and (2) members of entity groups, subgroups, and multigroups:

  • Individual property relationships;
  • Group member relationships:
- Uniqueness;
- Name-properties relationships;
- Aggregate relationships.
  • Entity Supertype-subtypes relationships;
  • Group relationships:
- Many-to-one (M:1) (of which one-to-one (1:1) are a special case);
- Many-to-many (M:N).

The modeling includes formulation of business rules (BR) that specify the defining properties of the identified objects. In addition to individual property rules, there are three categories of BRs that specify defining properties that arise from relationships:

  • Entity rules;
  • Multi-entity rules;
  • Multi-group rules.

For how to formulate BRs see [2].


Sunday, May 13, 2018

Relationships and the RDM Part 1: Kinds of Relationships


Note: This is a multi-part rewrite of two earlier posts, to bring them into line with McGoveran's formalization and interpretation of Codd's true RDM[1]. For the in-depth treatment see [2,3].

Whenever I demonstrate that data practitioners do not understand data and relational fundamentals, it is not the ample evidence that I present, but my claim that is dismissed as nonsense: how could everybody be wrong and only Fabian Pascal be right?

I dare you, then, to make sense of the following:

“William Kent confesses (in my words) that he can not distinguish between "relationships" and "attributes" ... the later might be completely redundant ... the notion of an attribute presumes a relationship, so we must define that first ... All of this is handled explicitly and correctly in ORM -- we model objects (each one appears only once in a data model diagram) and relationships. There are no attributes ... an attribute is an object playing a role in a relationship with another object.”
“... we are not modeling objects/entities/attribute ... at all in the relational model, [but] a bunch of relationships ... hence perhaps Codd was correct in calling it a "relation", a bunch of relationships ... Interesting that most people think of relationships as being the distinguishing characteristic of a relational model and it is not ... [it] has no relationships since Codd decreed that all relationships must be represented by foreign keys, which are exactly the same as "attributes ...”
“... isn't it funny, that the term relation is implicitly mapped (in our minds) to a table of a database? If (loosely speaking) a relationship in our conventional data modeling is represented by a foreign key in a table (and combining both points together) -- should a table (relation) consists only of foreign keys? ... What [other] type(s) of relationships can be explicitly and formally defined in a relational data model? Of course there are many other relationships which can be inferred, such as between an attribute and an entity identifier. Please give me a precise reference to where Codd spoke of relationships [differently than i]n his 1985 piece published in ComputerWorld, [where] he said that the only way to represent a relationship (between relations) was through explicitly stored values (i.e., attributes, foreign keys).”
“... we are not modeling objects/entities/attribute ... at all in the relational model, [but] a bunch of relationships ... hence perhaps Codd was correct in calling it a "relation", a bunch of relationships.”
“In my personal understanding, a relation is defined as a set of tuples. Then ... "in the relational model every relation represents a relationship". And then a quote from Chen: "each tuple of entities ... is a relationship". If I use the first and the second statements - I can say that a relationship is a set of tuples. The third statement says that a relationship is a tuple. So far, is a relationship a set of an element of a set? (Or may be a set of sets?).”
“I argue that there is essentially no difference between relationships between entity (type tables) and between an entity and its attributes. They both represent relationships between two populations of things. Something is an attribute by virtue of there being a relationship. If relationships are represented by foreign keys and the entity tables must be in 1NF, as in the relational model, then all relationships must be at most Many-to-One (a very unnecessary limitation when modeling some user domain).”
“I argue that there is essentially no difference between relationships between entities of distinct classes and between properties of the same class. They both represent relationships. A property can represent a relationship between entities of distinct classes. If such relationships are represented by foreign keys and the relations representing the classes must be in 1NF, then relational databases can represent only M:1 relationships, a very unnecessary limitation when modeling some reality of interest.”
“The entity-relationship model is essentially a directed graph model, where relationships are prominent residents. Not so in the relational model (despite the name), where relationships (between relations, mind you) are not visible and in the SQL implementations is reduced to constraints. Relationships are about structure, which is as important as meaning (the semantics of the terms used in the universe being modeled).”
There is so much wrong squeezed in these paragraphs that, as we shall see, it takes a several fold longer, multi-part series to debunk, which is really impossible without foundation knowledge. So let's have some first.

Sunday, May 6, 2018

Meaning Criteria and Entity Supertype-Subtypes


Note: This is a re-write of an earlier post to bring it in line with the McGoveran formalization and interpretation[1] of Codd's RDM.

"I have a database for a school ... [with] numerous tables obviously, but consider these:
CONTACT - all contacts (students, faculty): has fields such as LAST, FIRST, ADDR, CITY, STATE, ZIP, EMAIL;
FACULTY - hire info, login/password, foreign key to CONTACT;
STUDENT - medical comments, current grade, foreign key to CONTACT."
"Do you think it is a good idea to have a single table hold such info? Or, would you have had the tables FACULTY and STUDENT store LAST, FIRST, ADDR and other fields? At what point do you denormalize for the sake of being more practical? What would you do when you want to close out one year and start a new year? If you had stand-alone student and faculty tables then you could archive them easily, have a school semester and year attached to them. However, as you go from one year to the next information about a student or faculty may change. Like their address and phone for example. The database model now is not very good because it doesn’t maintain a history. If Student A was in school last year as well but lived somewhere else would you have 2 contact rows? 2 student rows?  Or do you have just one of each and have a change log. Which is best?"
I will ignore the flawed terminology (relations, not tables; attributes, not fields), and why would somebody who does not know past, or new requirements, modeling, and database design, mess with a working database just because "he heard something about (insert your favorite fad here)", and consider the replies.