Thursday, January 30, 2020

TYFK: What Is a Relational Database?



“RDBMS stands for Relational Database Management System. RDBMS is the basis for SQL, and for all modern database systems like MS SQL Server, IBM DB2, Oracle, MySQL, and Microsoft Access. RDBMS store the data into collection of tables, which might be related by common fields (database table columns). RDBMS also provide relational operators to manipulate the data stored into the database tables. An important feature of RDBMS is that a single database can be spread across several tables. This differs from flat-file databases, in which each database is self-contained in a single table. The most popular data model in DBMS is the Relational Model. It is more scientific a model than others. This model is based on first-order predicate logic and defines a table as an n-ary relation. The main highlights of this model are:
  • Data is stored in tables called relations.
  • Relations can be normalized.
  • In normalized relations, values saved are atomic values.
  • Each row in a relation contains a unique value.
  • Each column in a relation contains values from a same domain.”

The question got 18 answers online, but none came even close to being correct. This is the only one that merits debunking -- the rest will be posted on LinkedIn as "To laugh or cry?".


Note: While the question is about database, due to routine interchangeable use of database and DBMS, we suspect the intention was DBMS. Our debunking applies to database, and our correct answer makes the proper distinction.

First try to detect the misconceptions, then check against our debunking. If there isn't a match, you can acquire the necessary foundation knowledge in our posts, BOOKS, PAPERS or, better, organize one of our on-site SEMINARS, which can be customized to specific needs.



------------------------------------------------------------------------------------------------------------------
SUPPORT THIS SITE

Up to 2018, DBDebunk was maintained and kept free with the proceeds from my @AllAnalitics column. In 2018 that website was discontinued. The content of this site is not available anywhere else, so if you deem it useful, particularly if you are a regular reader, please help upkeep it by purchasing publications, or donating. Thank you.

DATA FUNDAMENTALS 
 
The industry is chockful of misconceptions due to lack of foundation knowledge. Corrections them are dismissed as "theory that is not practical", misinterpreted as "ad-hominem attacks", or ignored altogether, regardless of the amount and quality of reasoning and supporting evidence. Most practitioners -- be it user or vendor personnel -- cannot discern fallacies and do not realize the practical implications thereof and, thus, cannot associate problems with their real causes., hence the industry's "cookbook approach" and succession of fads.
What about you? Are you just a practitioner, or a thinking professional? 
TYFK (Test Your Foundation Knowledge) posts will each present and debunk a pronouncement containing one or more misconceptions. First try to detect them, then check against our debunking. If there isn't a match, you can acquire the necessary foundation knowledge in our POSTS, BOOKS, PAPERS, LINKS or, better, organize one of our on-site SEMINARS, which can be customized to specific needs.

NEW
  • 01/14/20 Updated the LINKS page
  • 01/04/20 Updated the POSTS page with the 2020 posts
  • 12/08/19 Added two educational references on set theory to the LINKS page.

LATEST PUBLICATIONS (order PAPERS and BOOKS)

USING THIS SITE
  • To work around Blogger limitations, the labels are mostly abbreviations or acronyms of the terms listed on the FUNDAMENTALS page. For detailed instructions on how to understand and use the labels in conjunction with the that page, see the ABOUT page. The 2017 and 2016 posts, including earlier posts rewritten in 2017 were relabeled accordingly. As other older posts are rewritten, they will also be relabeled. For all other older posts use Blogger search. 
  • Following the discontinuation of AllAnalytics site, the links to my columns there no longer work. I moved only the 2017 columns to dbdebunk, within which only links to sources external to AllAnalytics may work or not.

SOCIAL MEDIA  

I deleted my Facebook account. You can follow me:
  • @DBDdebunk on Twitter: will link to new posts to this site, as well as To Laugh or Cry? and What's Wrong with This Picture? posts, and my exchanges on LinkedIn.
  • @The PostWest blog: Evidence for Antisemitism/AntiZionism – the only universally acceptable hatred – as the (traditional) response to the existential crisis of decadence and decline of Western (including the US)
  • @ThePostWest Twitter page where I comment on global #Antisemitism/#AntiZionism and the Arab-Israeli conflict.

------------------------------------------------------------------------------------------------------------------

The Misconceptions


  • While SQL was initially intended as a relational data sublanguage, it is riddled with violations of the RDM, it is not just a data sublanguage, and is a poorly designed language. SQL DBMSs are not, therefore, RDBMSs.
  • True relational databases consist of relations -- a special kind of mathematical set adapted for and applied to database management; database relations can be visualized on physical media as R-tables (tuples and attributes as rows and columns) -- not to be confused with SQL tables -- that play no part in the RDM. Neither RDBMSs, nor SQL DBMSs store data as tables.
  • Database relations are logical model objects, "flat files" are physical implementation objects. Using them interchangeably confuses levels of representation referred to as logical-physical confusion (LPC). The RDM supports physical independence (PI): a database relation can be implemented as one or more multiple file(s), multiple relations can be implemented as one file.
  • Relational databases represent several types of real world relationships as constraints on single and multiple relations. The conventional wisdom to date was that one type -- M:1 relationships among entity groups -- are represented by referential constraints (not "common columns"), while M:N relationships are represented by association relations and referential constraints on both referencing and referenced relations. According to a new understanding of the RDM, all such relationships -- both M:1 and M:N, 1:1 being just a special case thereof) -- are represented by association relations and corresponding referential constraints.
  • The RDM is an adaptation of simple set theory (SST) expressible in first order predicate logic (FOPL) applied to database management, and it is this formal theoretical foundation that confers scientific status. Among multiple practical advantages is system-guaranteed logical validity and by-design semantic consistency.
  • The data model concept is poorly understood in the industry. The RDM is the only formally well defined and complete data model, but it is routinely confused with conceptual, logical, and physical models, all of which are referred to as "data models". When practitioners refer to RDM, they mean SQL, which has no formally well-defined, complete, or relational data model.
  • As already mentioned, the RDM does not "define a table as a n-ary relation". According to the new understanding of the RDM database relations are both normalized (in 1NF) and fully normalized (in 5NF) by definition, otherwise they are not relations. For all the relational advantages to materialize database design must adhere to three formal design principles that jointly imply full normalization. Explicit normalization and full normalization are necessary only to repair poor designs due to failure to adhere to these principles.
  • 1NF means that relations have attributes defined on simple domains, the values of which are treated by a relational data sublanguage as atomic. Each attribute is a simple function (1:1 mapping) of the domain.
  • Each tuple of a relation is a set of values -- each drawn from a domain -- at least one of which is a PK value that uniquely identifies the tuple within the relation.


The Correct Answer


A relational database is one designed to adhere to the three core design principles required by the RDM [1] that jointly imply full normalization [2]. 


For relational advantages -- foremost among them correctness (system-guaranteed logical validity and by-design semantic consistency) [3] to materialize, relational databases must be managed by a RDBMS that implements correctly and fully the RDM introduced by Codd as re-interpreted and formalized by McGoveran [4,5]. 

A relational system is comprised of a RDBMS and the relational databases it manages.



Note: I will not publish or respond to anonymous comments. If you have something to say, stand behind it. Otherwise don't bother, it'll be ignored.


References

[1] Pascal, F., Database Design: What It Is and Isn't

[2] Pascal, F., Normalization and Further Normalization, Parts 1-2

[3] Pascal, F., Logical Validity and Semantic Correctness

[4] McGoveran, D., LOGIC FOR SERIOUS DATABASE FOLK (draft chapters), forthcoming

[5] Pascal, F., Re-interpreting Codd



No comments:

Post a Comment

View My Stats