Thursday, August 5, 2021

TYFK: Facts, Properties, Relationships, Domains, Relations, Tuples

Note: Each "Test Your Foundation Knowledge" post presents one or more misconceptions about data fundamentals. To test your knowledge, first try to detect them, then proceed to read our debunking, reflecting the current understanding of the RDM, distinct from whatever has passed for it in the industry to date. If there isn't a match, you can review references, which explain and correct the misconceptions. You can acquire further knowledge by checking out our POSTS, BOOKS, PAPERS, LINKS (or, better, organize one of our on-site SEMINARS, which can be customized to specific needs).

A statement from a 1986 book that "Data are facts represented by values -- numbers, character strings, or symbols -- which carry meaning in a certain context" triggered the following response on LinkedIn:
“...In contrast, Date and Darwen (2000) say:
  • Domains are the things that we can talk about.
  • Relations are the truths we utter about those things.
Thus, the declarative sentence "Fred is in the kitchen." is a fact that links the domains Person[s] and Place[s] with the predicate "is in". The complete relation might be made up of three facts:
  • Fred is in the kitchen.
  • Mary is in the garden.
  • Arthur is in the garden.
This seems to be more precise than the 1986 statement.”
To which the book author responded:
“...back then we did not have the refinement, clarity, nor precision from people like Sjir Nijssen and Terry Halpin regarding facts, or elementary fact sentences, which today you and I know are the bedrock of data modeling. Facts are expressed in sentences (with domains and predicates).”

Unfortunately, none of this is sufficiently clear and precise to prevent confusion, and it inhibits understanding of the RDM.



Propositions and Predicates

A fact is a proposition -- a statement that is unequivocally either true or false. It has an object as a subject (an entity) with properties -- identifying (assigned names) and descriptive -- and specifies the property values for a specific entity. In our example:
Person named Fred is located in the kitchen.
where the person is an entity, name and location are properties, and 'Fred' and 'in the kitchen' are their values.

Note very carefully, however, that an object is what, for convenience, we call a combination of observable properties (i.e., name + location = person).

Entities that share the same properties are of the same type, by virtue of which they form a group. The sharing -- a relationship among properties -- is itself a property of the group as a whole, expressed as a predicate -- a "group form" of propositions:
Person named (name) is located in (location).
that specifies the relationship as a property of the group. We thus distinguish between properties of individual entities (primitive objects) and a collective property of the group thereof (compound object) -- the relationship among those properties. When property values for specific entities are "plugged" into the predicate, it is instantiated as (reduces to) the corresponding facts.
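As a rough illustration of instantiation, the predicate can be treated as a sentence template with placeholders. This is only a sketch: the names PREDICATE, instantiate and property_values are hypothetical, chosen for this example, and a string template is merely a stand-in for a predicate.

```python
# Illustrative sketch only: the predicate is modelled as a sentence template.
# PREDICATE, instantiate and property_values are hypothetical names.
PREDICATE = "Person named {name} is located in the {location}."


def instantiate(predicate: str, **values: str) -> str:
    """Substitute property values for the placeholders, yielding a proposition."""
    return predicate.format(**values)


# Property values of the three entities from the example.
property_values = [
    {"name": "Fred", "location": "kitchen"},
    {"name": "Mary", "location": "garden"},
    {"name": "Arthur", "location": "garden"},
]

# One proposition (fact) per entity of the group.
propositions = [instantiate(PREDICATE, **v) for v in property_values]

for p in propositions:
    print(p)
```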


Note that we expressed propositions and predicates in natural language. If we computerized them as is, we would produce a textbase, not a database, that would not be very useful for making inferences (querying). Moreover, computers do not understand what properties, entities and relationships are -- they can only manipulate abstract symbols mathematically. We therefore formalize natural language expressions for database representation (i.e., express them in a symbolic formal data sublanguage).

Domains and Relations

For relational databases we use a data sublanguage based on the RDM (simple set theory and first order predicate logic, SST/FOPL, adapted and applied to database management). Expressions in natural language, understood semantically by users, are formalized symbolically in a relational data sublanguage "understood" algorithmically by a DBMS (domains/attributes, tuples, constraints). In the RDM properties formalize as domains (sets of values) which, when applied to specific entity groups, formalize as attributes (i.e., attributes represent properties in specific group contexts); propositions formalize as tuples (sets of attribute values), a set of which forms a relation. It follows that the relationship among properties formalizes as a relation on the domains, namely a subset of their cross-product. Each relation is associated with a predicate that expresses the relationship and is, therefore, its real world meaning. Thus, the relation PERSONS is a relationship between two domains, NAME and LOCATION (representing properties), where P_NAME and P_LOCATION are attributes drawing their values from the domains. Its tuples (sets of attribute values) represent propositions (facts) about the entities that form the group. We can visualize the relation PERSONS on the screen as an R-table:

 PERSONS
 P_NAME    P_LOCATION
 ------    ----------
 Fred      kitchen
 Mary      garden
 Arthur    garden

Only the body of the R-table (the tuples) is data; the table header is metadata (symbols). The picture of a relation should not, however, be confused with the relation itself.
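The set-theoretic reading of a relation can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how a DBMS implements relations: the domain contents (including the extra value "hall") and the helper make_tuple are made up for the example, and each tuple is modelled as an unordered set of (attribute, value) pairs.

```python
from itertools import product

# Domains: sets of values (contents assumed for illustration).
NAME = {"Fred", "Mary", "Arthur"}
LOCATION = {"kitchen", "garden", "hall"}


def make_tuple(**attrs):
    """Model a tuple as an unordered set of (attribute, value) pairs."""
    return frozenset(attrs.items())


# Body of the relation PERSONS: a set of tuples.
PERSONS = {
    make_tuple(P_NAME="Fred", P_LOCATION="kitchen"),
    make_tuple(P_NAME="Mary", P_LOCATION="garden"),
    make_tuple(P_NAME="Arthur", P_LOCATION="garden"),
}

# Cross-product of the two domains: every possible (name, location) pairing.
CROSS_PRODUCT = {
    make_tuple(P_NAME=n, P_LOCATION=loc) for n, loc in product(NAME, LOCATION)
}

# The relation is a subset of the cross-product of its domains.
print(PERSONS <= CROSS_PRODUCT)

# Attribute order within a tuple is immaterial.
print(make_tuple(P_LOCATION="kitchen", P_NAME="Fred") in PERSONS)
```

Both checks print True. Of the nine possible pairings, only the three in PERSONS represent propositions asserted to be true.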

Note: Because users often understand the meaning of the symbols used (e.g., NAME, LOCATION), they tend to miss that those symbols mean nothing to a DBMS. They look at the above R-table and see persons with properties, while the DBMS sees abstract sets of X and Y values.
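One way to see this is to run the same operation on two structurally identical sets of tuples, one with familiar symbols and one with opaque ones. The sketch below is hypothetical throughout: tup, restrict, the relation R and the symbols X, Y, a1, b1 and so on are invented for the example, and restrict is a simplified stand-in for a relational restriction.

```python
def tup(**attrs):
    """Model a tuple as an unordered set of (attribute, value) pairs."""
    return frozenset(attrs.items())


def restrict(relation, attribute, value):
    """Return the tuples whose `attribute` has the given `value`."""
    return {t for t in relation if dict(t)[attribute] == value}


# Symbols that are meaningful to a human reader.
PERSONS = {
    tup(P_NAME="Fred", P_LOCATION="kitchen"),
    tup(P_NAME="Mary", P_LOCATION="garden"),
    tup(P_NAME="Arthur", P_LOCATION="garden"),
}

# The same structure with opaque symbols.
R = {
    tup(X="a1", Y="b1"),
    tup(X="a2", Y="b2"),
    tup(X="a3", Y="b2"),
}

# The operation is purely symbolic: it behaves identically on both.
in_garden = restrict(PERSONS, "P_LOCATION", "garden")
y_is_b2 = restrict(R, "Y", "b2")

print(len(in_garden), len(y_is_b2))
```

The code never consults what "garden" or "b2" means; it only compares symbols, which is the point of the note above.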

Let's now consider the LinkedIn exchange.

Are, as the book claims, data "facts represented by values -- numbers, character strings, or symbols -- which carry meaning in a certain context"? As we have seen, data (tuples) are not facts; they are values that represent facts (i.e., propositions) and carry the meaning of the facts in group contexts.

D&D's quotes are somewhat vague.

  • “Domains are the things that we can talk about.”: Domains represent properties. Strictly speaking, entities are the "things we talk about", but entities are just named combinations of properties, so in that sense we can say that attributes defined on domains represent properties in group contexts that are the things we talk about.
  • “Relations are the truths we utter about those things.”: Relations are sets of tuples that represent facts about a group's entities -- the uttered truths -- that comprise a relationship among properties as a collective group property.
  • “Thus, the declarative sentence "Fred is in the kitchen" is a fact that links the domains Person[s] and Place[s] with the predicate "is in"”: It is one of the three facts that jointly link the properties Name[s] and Place[s] represented by domains -- a relationship expressible as a predicate.

As to the author's reply that "Facts are expressed in sentences (with domains and predicates) ...", as we have seen, facts (D&D's utterances) are propositions that specify property values of individual entities and are instantiations of -- not "expressed with" -- predicates expressing relationships among those properties and expressed with attributes defined on domains.




