Sunday, April 30, 2023

RELATIONSHIPS AND THE RDM V2 Part 3: SEMANTIC CONSTRAINTS



Note: This is a multipart re-write of a previous series that, when completed, is intended to replace it.

In Part 1 we documented the differences between mathematical and database relations (see table in Part 1). We attributed the fallacy that the RDM can express only one type of relationship -- between relations using FKs -- to the industry being unaware of the adaptation of math relations for database management. We intimated that some of the additional features of database relations express relationships other than between relations.

In Part 2 we identified the intra-group c-relationships (and the corresponding within-relation l-relationships) in our approach to conceptual modeling:

  • Properties-entities relationships

- general dependencies

  • Properties Relationships
  • Entities Relationships

- entity uniqueness
- functional dependencies (FD)
- entity supertype-subtypes relationships

and used a simple conceptual model (CM) of six entity groups to illustrate them:

Customers (cID, cname, FICO, discount)
Products (pID, pname, price)
Salesmen (sID, sname, sales, salary, commission)
Orders (oID, pID, cID, sID, date, amount)
Order Items (oID, iID, pID, quantity)

Database design is the use of a data model (DM) (here, RDM) to formalize conceptual models (CM) -- including c-relationships -- as logical models (LM) for database representation, so it must be able to convert the business rules (BR) that express those relationships in specialized natural language at the conceptual level to formal constraints in a FOPL-based data sublanguage at the logical level.

Our intention is to demonstrate that the RDM can express all c-these relationships, but we face a difficulty.

------------------------------------------------------------------------------------------------------------------

SUPPORT THIS SITE
DBDebunk was maintained and kept free with the proceeds from my @AllAnalitics column. The site was discontinued in 2018. The content here is not available anywhere else, so if you deem it useful, particularly if you are a regular reader, please help upkeep it by purchasing publications, or donating. On-site seminars and consulting are available.Thank you.

LATEST POSTS

04/23 THE DENORMALIZATION ILLUSION (t&n)

04/16 RELATIONSHIPS & THE RDM V2 PART 2: INTRA-GROUP RELATIONSHIPS

04/08 MISSING DATA: RDM VS SQL -- A REAL WORLD COMPARISON (t&n)

UPDATES

04/23 Added The Story of Mathematical Proof to LINKS page

04/03 Added First OrderLogic to LINKS page

04/03 Added Mathematical Logic - Reasoning in First Order Logic to LINKS page

03/26 Added Modeling of Integrity Constraints Dependencies to LINKS page

03/14 Added Russell’s On Denoting to LINKS page

03/14 Added Russell’s Paradox to LINKS page.

LATEST PUBLICATIONS (order from PAPERS and BOOKS pages)

08/19 Logical Symmetric Access, Data Sub-language, Kinds of Relations, Database Redundancy and Consistency, paper #2 in the new UNDERSTANDING THE REAL RDM series.
02/18 The Key to Relational Keys: A New Understanding, a new edition of paper #4 in the PRACTICAL DATABASE FOUNDATIONS series.
04/17 Interpretation and Representation of Database Relations, paper #1 in the new UNDERSTANDING THE REAL RDM series.
10/16 THE DBDEBUNK GUIDE TO MISCONCEPTIONS ABOUT DATA FUNDAMENTALS, my latest book (reviewed by Craig Mullins, Todd Everett, Toon Koppelaars, Davide Mauri).

USING THIS SITE
- To work around Blogger limitations, the labels are mostly abbreviations or acronyms of the terms listed on the
FUNDAMENTALS page. For detailed instructions on how to understand and use the labels in conjunction with that page, see the ABOUT page. The 2017 and 2016 posts, including earlier posts rewritten in 2017 were relabeled accordingly. As other older posts are rewritten, they will also be relabeled. For all other older posts use Blogger search.
- The links to my AllAnalytics columns no longer work. I re-published only the 2017 columns @dbdebunk, and within them links to sources external to AllAnalytics may or may not work.

SOCIAL MEDIA
I deleted my Facebook account. You can follow me @DBDdebunk on Twitter: will link to new posts to this site, as well as To Laugh or Cry? and What's Wrong with This Picture? posts, and my exchanges on LinkedIn.
------------------------------------------------------------------------------------------------------------------

Semantic Constraints

We have seen in Part 1 that as part of the RDM adaptation of math relations for database management, individual database relations and the database as a whole are constrained for semantic consistency with the conceptual model they are intended to represent, which includes c-relationships. We identified several categories of semantic constraints in RDM for that purpose:

  • Domain constraints;
  • Relation constraints:

- attribute constraints;
- tuple constraints;
- multituple constraints.

excluding multirelation (aka database) constraints for reasons already explained.

Domain and attribute constraints constrain relations for consistency with properties. We have intimated that tuple and multituple constraints ensure consistency with intra-group c-relationships. So all we have to do is formalize BRs that express the relationships using a FOPL-based relational data sublanguage. And herein lies the difficulty: the industry has not produced such a language. Misconceptions notwithstanding, SQL:

  • Is not truly/fully relational;
  • Is not just the data sublanguage that it ought to be;
  • Is a poorly designed language;
  • It is not well-defined, especially wrt constraint expression.

and it would be limiting and possibly misleading to use for learning purposes. 

About the only option is to use FOPL directly, but it is inaccessible to most practitioners and they would not be able to use it in practice anyway, which will only invite more claims of "theory and, therefore, not practical". This would be, of course, fallacious -- it's precisely because constraints are expressible in FOPL that the theory (RDM) is practical -- it just needs a language that is adapted for database management, which precisely what a relational data sublanguage is supposed to be. Be that as it may, what to do in its absence? We will attempt to express all the constraints in FOPL in the forthcoming re-write of our paper, Conceptual Modeling for Relational Database Design. What we can do within the limits of this post, is illustrate multituple constraints, by taking advantage of a neat constraint simplification in RDM.

Multituple Constraints

We have identified in Part 2 three types of intra-group c-relationship that must be formalized as constraints:

  • General dependencies (GD) of every 1OP on the name;
  • Name uniqueness; and
  • Functional dependencies (FD) of every 1OP on the name.

where name stands-in for entities. Because name uniqueness and FDs are relationships between all entities of a group, they are collective 3OPs of the groups that formalize as multituple constraints. 1OPs are represented by attributes and name by a PK.

A database relation represents a group of entities of a single type and is by definition and design in 5NF. The only dependencies that hold in it are FDs of non-key attributes on the PK. So, for the PRODUCTS relation in our running example, the FDs would be expressed as follows in FOPL:

∀pID ∀pname ∀price ∀pname ′∀price′ PRODUCTS (pID,price,pname) ∧ PRODUCTS(pid,pname′,price′) → pname = pname′ ∧ price = price′
which in "English" reads:
Forall pIDs forall pnames forall prices forall pname′s forall price′s pname and price are functionally dependent on pID and pname′ and price′ are functionally dependent on pID implies pname = pname′ and price = price′.
which a data sublanguage would have to express and probably simplify in some specific syntax.

But we have seen that in a group of entities of a single type, the conjunction of name uniqueness and general dependencies of defining 1OPs on the name is equivalent to FDs of defining 1OPs on the name. Since the PK is mandatory in a database relation, the mandate satisfies the simple dependencies, so the only constraint that needs to be explicitly declared is the name uniqueness constraint! This constraint can be expressed declaratively similarly to just like SQL does with the PRIMARY KEY clause in the CREATE TABLE statement. Compare this to denormalized relations, which "bundle" multiple entity types and add dependencies for which complex constraints must be declared explicitly and enforced (see FOL Modeling of Integrity Constraints (Dependencies).

We started with the quotes in Part 1 that show that among the many misconceptions about the RDM, one is that the only relationships it can express is among entity members of different groups (i.e., those represented in relational databases by FK (referential) constraints). We have shown here that not only it supports at least three intra-group relationships, but it does so having to declare and enforce only one of them explicitly. But in the absence of foundation knowledge, the industry is chockful of fallacies and misconceptions.

For the rest of the relationships stay tuned for the re-write of the paper Conceptual Modeling for Relational Database Design.








No comments:

Post a Comment

View My Stats