Sunday, March 6, 2022

TYFK: Relationships -- Uniqueness and Attribute Constraints



Note: Each "Test Your Foundation Knowledge" post presents one or more misconceptions about data fundamentals. To test your knowledge, first try to detect them, then proceed to read our debunking, reflecting the current understanding of the RDM, distinct from whatever has passed for it in the industry to date. If there isn't a match, you can review references -- reflecting the current understanding of the RDM, distinct from whatever has passed for it in the industry to date -- which explain and correct the misconceptions. You can acquire further knowledge by checking out our POSTS, BOOKS, PAPERS, LINKS (or, better, organize one of our on-site SEMINARS, which can be customized to specific needs).

 

“A unique constraint is a type of column restriction within a table, which dictates that all values in that column must be unique [and] allows null values ... a null is the complete absence of a value (not a zero or space). Thus, it is not possible to say that the value in that null field is not unique, as nothing is stored in that field.”
--Techopedia
This is one of my recent "What's Wrong with this database picture" posts on LinkedIn.

Misconceptions

In the RDM a uniqueness constraint:

  • Should not be viewed solely as a "column restriction within a table'.
  • Does NOT allow SQL "NULLs" (not "NULL values"), which have nothing to do with storage.

------------------------------------------------------------------------------------------------------------------

SUPPORT THIS SITE
DBDebunk was maintained and kept free with the proceeds from my @AllAnalitics column. The site was discontinued in 2018. The content here is not available anywhere else, so if you deem it useful, particularly if you are a regular reader, please help upkeep it by purchasing publications, or donating. On-site seminars and consulting are available.Thank you.

LATEST POSTS

02/19 OBG No Understanding without Foundation Knowledge Part 6: Debunking an Online Exchange 5

02/04 OBG No Understanding without Foundation Knowledge Part 5: Debunking an Online Exchange 4

01/30 TYFK Nobody Understands What a Data Model Is

01/21 Read My Lips: If There's NULLs, It's Not Relational

01/16 OBG: No Understanding without Foundation Knowledge Part 4 -- Debunking an Online Exchange 3

LATEST PUBLICATIONS (order from PAPERS and BOOKS pages)
- 08/19 Logical Symmetric Access, Data Sub-language, Kinds of Relations, Database Redundancy and Consistency, paper #2 in the new UNDERSTANDING THE REAL RDM series.
- 02/18 The Key to Relational Keys: A New Understanding, a new edition of paper #4 in the PRACTICAL DATABASE FOUNDATIONS series.
- 04/17 Interpretation and Representation of Database Relations, paper #1 in the new UNDERSTANDING THE REAL RDM series.
- 10/16 THE DBDEBUNK GUIDE TO MISCONCEPTIONS ABOUT DATA FUNDAMENTALS, my latest book (reviewed by Craig Mullins, Todd Everett, Toon Koppelaars, Davide Mauri).

USING THIS SITE
- To work around Blogger limitations, the labels are mostly abbreviations or acronyms of the terms listed on the
FUNDAMENTALS page. For detailed instructions on how to understand and use the labels in conjunction with the that page, see the ABOUT page. The 2017 and 2016 posts, including earlier posts rewritten in 2017 were relabeled accordingly. As other older posts are rewritten, they will also be relabeled. For all other older posts use Blogger search.
- The links to my columns there no longer work. I moved only the 2017 columns to dbdebunk, within which only links to sources external to AllAnalytics may work or not.

SOCIAL MEDIA
I deleted my Facebook account. You can follow me @DBDdebunk on Twitter: will link to new posts to this site, as well as To Laugh or Cry? and What's Wrong with This Picture? posts, and my exchanges on LinkedIn.
------------------------------------------------------------------------------------------------------------------

Debunking

Most practitioners are familiar with referential constraints; the fact that they still believe, wrongly, that they are the origin of 'relational' indicates realization that they represent inter-group relationships. But few of them, if any, are aware of the intra-group relationships supported by the RDM as relational constraints, to the point of claiming, ridiculously, that the RDM "does not support relationships" (if the RDM is about anything, it is about relationships).

A relational database is intended to represent a conceptual model of the real world, specifically a multigroup -- a collection of related entity groups. A database relation is a mathematical relation constrained to represent an entity group, while preserving its mathematical properties. Constraints are, thus, the formal representation at the database logical level of some of the semantics (meaning) of the data expressed as business rules at the conceptual level: if the database is properly designed and constraints are declared to and enforced by the DBMS, the database is semantically consistent (i.e., a faithful representation of the conceptual model -- the multigroup). Some of the constraints represent intra-group and inter-group relationships, which are collective properties of the groups and the multigroup, respectively.

As we explained elsewhere, there are four categories of relational constraints:

  • A domain constraint ensures that a database domain represents a property of one or more various types of entities;
  • An attribute constraint ensures that a database attribute represents a property of entity members of a specific group;
  • A tuple constraint ensures that a database tuple represents a fact about (i.e., individual properties of) an entity member of a group;
  • A multi-tuple (or relation) constraint ensures that a database relation represents a relationship among all entity members of a group that is a collective property of the group;
  • A multi-relation (or database) constraint ensures that the database represents a relationship among the group members of a multigroup and is a collective property of the multigroup.

There are one or more types of constraints in each of the categories and unless (the data in) all relations, individually and collectively, satisfy all the constraints, semantic consistency is not guaranteed.

Practitioners are also familiar with PKs and commonly apply uniqueness constraints because they are supported (albeit often poorly) in SQL, but few of them understand the relational context, which leads to confusion and poor practices. For example, they are not familiar with the categories above, do not know the types of constraints in each and which represent what kind of relationship.

Now that you know the categories, in which would you place uniqueness constraints? Without understanding of the RDM, viewing it solely as a "restriction on an attribute" (not column!), chances are you would consider it an attribute constraint, in which case you would be wrong. Consider the simple relation ORDERS:

ORDERS
========================

O_INV#       O_DATE     O_AMT
-======----------------------


O_INV# is the PK of relation ORDERS and, thus, there is a logical uniqueness constraint on the relation(distinct from the physical index on the attribute with which it is usually implemented). It represents the distinguishability of entities within the group of orders, which -- this escapes to most -- is an intra-group relationship among all orders in the group. The DBMS enforces it by comparing each new tuple with all tuples in ORDERS and rejects it if it duplicates an existing O_INV# value -- hence a multi-tuple constraint.

What, then, is an attribute constraint?

There are many types of entities that, like orders, have properties measured in currency or date (e.g., balance due and card expiration of customers). Assuming that domains $AMOUNT and DATE represent properties "in the abstract", O_AMT and O_DATE attributes are properties in a specific group context, namely, of order entities. 

Note very carefully that while attributes draw their values from domains, they can and usually are subsets of their corresponding domains -- their value ranges are narrower. That is why there are not just domain constraints, but also attribute constraints. 

The difference between a uniqueness and attribute constraint should be obvious, particularly that the former represents an intra-group relationship and the latter does not represent a relationship.

As to NULL, if it means the absence of a value (which is why "NULL value" is a contradiction in terms),
neither uniqueness or non-uniqueness can be determined, which is why SQL NULLs are NOT allowed in keys, be they PKs or CKs. Be that as it may, NULLs violate the RDM -- for a relational solution to missing data without NULLs see The Last NULL in the Coffin.http://dbdebunk.blogspot.com/p/papers_3.html

Further Reading

Relationships and the RDM series

Understanding Relational Constraints

Semantics, Relations and the Missed Link

Levels of Representation Relationships, Rules, Relations and Constraints



No comments:

Post a Comment

View My Stats