Friday, February 14, 2020

TYFK:What Is a Relational Schema?

Note: About TYFK posts (Test Your Foundation Knowledge) see the post insert below.
“A relational database stores information in a structured format called a schema. This schema is defined according the rules of database normalization. These rules are meant to ensure the integrity of the data. The schema for a database is broken up into the objects such as tables and constraints. Tables hold your data and are broken down into rows. each row represents a single entity such as a person and has columns which define the attributes of the entity such as age. Constraints define limitations around the data. For example a check constraint might limit the range of valid dates in a datetime column. From there queries can be run to extract data from the database. These queries will often join multiple tables to pull data from them.”
First try to detect the misconceptions, then check against our debunking. If there isn't a match, you can acquire the necessary foundation knowledge in our POSTS, BOOKS, PAPERS, LINKS or, better, organize one of our on-site SEMINARS, which can be customized to specific needs.


DBDebunk was maintained and kept free with the proceeds from my @AllAnalitics column. The site was discontinued in 2018. The content here is not available anywhere else, so if you deem it useful, particularly if you are a regular reader, please help upkeep it by purchasing publications, or donating. On-site seminars and consulting are available.Thank you.

-12/24/20: Added 2021 to the
POSTS page

-12/26/20: Added “Mathematics, machine learning and Wittgenstein to LINKS page

- 08/19 Logical Symmetric Access, Data Sub-language, Kinds of Relations, Database Redundancy and Consistency, paper #2 in the new UNDERSTANDING THE REAL RDM series.
- 02/18 The Key to Relational Keys: A New Understanding, a new edition of paper #4 in the PRACTICAL DATABASE FOUNDATIONS series.
- 04/17 Interpretation and Representation of Database Relations, paper #1 in the new UNDERSTANDING THE REAL RDM series.
- 10/16 THE DBDEBUNK GUIDE TO MISCONCEPTIONS ABOUT DATA FUNDAMENTALS, my latest book (reviewed by Craig Mullins, Todd Everett, Toon Koppelaars, Davide Mauri).

- To work around Blogger limitations, the labels are mostly abbreviations or acronyms of the terms listed on the
FUNDAMENTALS page. For detailed instructions on how to understand and use the labels in conjunction with the that page, see the ABOUT page. The 2017 and 2016 posts, including earlier posts rewritten in 2017 were relabeled accordingly. As other older posts are rewritten, they will also be relabeled. For all other older posts use Blogger search.
- The links to my columns there no longer work. I moved only the 2017 columns to dbdebunk, within which only links to sources external to AllAnalytics may work or not.

I deleted my Facebook account. You can follow me:
- @DBDdebunk on Twitter: will link to new posts to this site, as well as To Laugh or Cry? and What's Wrong with This Picture? posts, and my exchanges on LinkedIn.
- The PostWest blog for monthly samples of global Antisemitism – the only universally acceptable hatred left – as the (traditional) response to the existential crisis of decadence and decline of Western  civilization (including the US).
- @ThePostWest on Twitter where I comment on global #Antisemitism/#AntiZionism and the Arab-Israeli conflict.


The Misconceptions

  • The data in a relational database is not physically stored in a schema -- a schema does not contain data.
  • Explicit normalization (to 1NF) and further normalization (to 5NF) (which should not be confused) are necessary only to repair incorrectly designed databases, and data integrity -- as the term commonly understood -- is not ensured by the rules thereof.
  • Tables, rows, and columns are not elements of a schema.
  • SQL check constraints are only one type of constraints, are not the intended relational constraints, and SQL does not support all of those.
  • Join is but one operation of the relational algebra (RA) that must be supported by a RDBMS.
  • "Define limitations around the data" is an opaque description of what constraints do.

The Correct Answer

  • A relational schema is a formal logical representation of a conceptual model of reality consisting of a collection of related entity groups (multigroup) -- each represented by database relations (not to be confused with mathematical relations) -- and relationships within and among the groups -- represented by semantic constraints[1]. Every database relation and the database as a whole are constrained to be consistent with the within- and among-groups relationships in the conceptual model[2]. Relations are sets of tuples -- representing facts about entities -- and have attributes -- representing properties in context -- defined on simple domains[3] -- representing properties[4].
  • The schema (not to be confused with the graphical visualization thereof on some medium, which may not display all schema elements) does not contain user data. A relational database consists of the schema recorded in the catalog -- itself a system relational database of metadata -- and the data (tuples) recorded in the user database, both physically stored in any of a number of ways, independently of the logical schema[5].
  • Relations can be visualized on some medium as R-tables, in which case tuples display as rows and attributes as columns, but these play no relational role[4].
  • According to a new understanding, database relations are normalized (in 1NF) and fully normalized (in 5NF) by definition[6], otherwise they are not relations and the database is not relational[7].
  • Database design must adhere to three fundamental principles:
The Principle of Expressive Completeness (POEC);
The Principle of Representational Parsimony (PORP);
The Principle of Orthogonal Design (POOD);
which jointly imply the Principle of Full Normalization (POFN), but not vice-versa[8]. This means that adherence to the three core principles implicity produces relational databases (i.e., consisting of relations) -- explicit application of the "rules of normalization" is necessary only to repair incorrectly designed databases due to failure to adhere to the three principles[9].

Note: I will not publish or respond to anonymous comments. If you have something to say, stand behind it. Otherwise don't bother, it'll be ignored.


[1] Pascal, F., Conceptual Modeling for Database Design: Formalizing the Informal

[2] Pascal, F., Relationships and the RDM series

[3] Pascal, F., Simple Domains and Value Atomicity

[4] Pascal, F., Understanding Relations series

[5] Pascal, F., Physical Independence series

[6] Pascal, F., Normalization and Further Normalization series

[7] Pascal, F., What Relations Really Are and Why They Are Important

[8] Pascal, F., Database Design What It Is and Isn't

[9] Pascal, F., The Costly Illusion: Normalization, Integrity, and Performance

No comments:

Post a Comment

View My Stats