Wednesday, January 22, 2020

TYFK: “Why is a relational database so powerful?”




 Note: About TYFK posts (Test Your Foundation Knowledge) see the post insert below.
“...the theoretical awesomeness of relational algebra is kinda hard to intuitively relate back to your payroll or audit-log tables - the real power is the computed join ... it lets you dynamically fetch sets of data in the exact format that you need ... with any group of tables in the dataset. Unlike other data models, where the things you can fetch are typically fixed when you define your elements, and where relationships between data - if any - are statically defined in advance ... joins let you specify the relationships between objects (rows and tables in SQL-based relational databases) ... create queries and run them on your data without needing to write a lot of extra code beyond the SQL itself. This Ad Hoc Query capability ends up being hugely valuable when doing "secondary" business tasks in a big data world such as doing reporting and analytics, and is often hugely difficult to do in non-relational environments without a lot of extra code and often a specialized reporting schema.”

“Relational theory as applied to databases is that all data is connected to each other, keyed to each piece ... And with a SQL query [you can] create anything that can exist, as output.”

“...main so called power of RDBMs lies within ACID compliance. A transaction in a RDB is Atomic, Consistent, isolated, durable ... makes a database useful, unique, or I suppose powerful ... let's say you update or insert some new record, right in the middle power goes out, due to ACID compliance your transaction will not go through ... either the operation will complete or fail, nothing in between. [And RDBMSs are] tried, tested and true for almost 50 years.”

First try to detect the misconceptions then check against our debunking (some of the above is correct per se but not directly relevant to relational power, some only partially correct, and some is wrong. Can you discern which is which?) If there isn't a match, you can acquire the necessary foundation knowledge in our POSTS, BOOKS, PAPERS, LINKS or, better, organize one of our on-site SEMINARS, which can be customized to specific needs.
------------------------------------------------------------------------------------------------------------------

SUPPORT THIS SITE 

Up to 2018, DBDebunk was maintained and kept free with the proceeds from my @AllAnalitics column. In 2018 that website was discontinued. The content of this site is not available anywhere else, so if you deem it useful, particularly if you are a regular reader, please help upkeep it by purchasing publications, or donating. Thank you. 

DATA FUNDAMENTALS 

The industry is chockful of misconceptions due to lack of foundation knowledge. Corrections them are dismissed as "theory that is not practical", misinterpreted as "ad-hominem attacks", or ignored altogether, regardless of the amount and quality of reasoning and supporting evidence. Most practitioners -- be it user or vendor personnel -- cannot discern fallacies and do not realize the practical implications thereof and, thus, cannot associate problems with their real causes., hence the industry's "cookbook approach" and succession of fads.

What about you? Are you just a practitioner, or a thinking professional? 

TYFK (Test Your Foundation Knowledge) posts will each present and debunk a pronouncement containing one or more misconceptions. First try to detect them, then check against our debunking. If there isn't a match, you can acquire the necessary foundation knowledge in our POSTS, BOOKS, PAPERS, LINKS or, better, organize one of our on-site SEMINARS, which can be customized to specific needs. 

NEW

  • 01/14/20 Updated the LINKS page
  • 01/04/20 Updated the POSTS page with the 2020 posts
  • 12/08/19 Added two educational references on set theory to the LINKS page.

LATEST PUBLICATIONS (order PAPERS and BOOKS) 


USING THIS SITE

  • To work around Blogger limitations, the labels are mostly abbreviations or acronyms of the terms listed on the FUNDAMENTALS page. For detailed instructions on how to understand and use the labels in conjunction with the that page, see the ABOUT page. The 2017 and 2016 posts, including earlier posts rewritten in 2017 were relabeled accordingly. As other older posts are rewritten, they will also be relabeled. For all other older posts use Blogger search. 
  • Following the discontinuation of AllAnalytics site, the links to my columns there no longer work. I moved only the 2017 columns to dbdebunk, within which only links to sources external to AllAnalytics may work or not.

SOCIAL MEDIA 

I deleted my Facebook account. You can follow me:

  • @DBDdebunk on Twitter: will link to new posts to this site, as well as To Laugh or Cry? and What's Wrong with This Picture? posts, and my exchanges on LinkedIn.
  • @The PostWest blog: Evidence for Antisemitism/AntiZionism – the only universally acceptable hatred – as the (traditional) response to the existential crisis of decadence and decline of Western (including the US)
  • @ThePostWest Twitter page where I comment on global #Antisemitism/#AntiZionism and the Arab-Israeli conflict.
 ------------------------------------------------------------------------------------------------------------------

The Misconceptions

  • Databases have no power, DBMSs that manage them do.
  • The RDM is an adaptation and application of mathematical relation theory to database management -- its applicability to database practice is intuitive only with proper education on data fundamentals, but practitioners learn relational jargon from tool experience and without understanding of what it means and the practical implications.
  • The power of the relational algebra (RA) comes from its set operations -- not just join -- and their nestability due to relational closure.
  • A relational database consists of relations, R-tables are just visualizations of relations on a physical medium and play no part in the RDM (SQL tables are not relations, one reason SQL DBMSs and databases are not relational).
  • To the extent that the only other model claimed to be a data model in the sense in which the RDM is one -- the "graph data model" (GDM) -- can be considered such, it is a database solution for conceptual models focused on relationships individual entities (e.g., social networks).

Note: The GDM (hierarchic is a special case of network) has not yet been fully formalized as the RDM -- what are (precisely, please!) its structural-integrity components? (there is no such thing as a NoSQL data model). 


  • Be that as it may, in these "data models the things you can fetch" are not "fixed when you define your elements", and "relationships between data, if any" are not "statically defined in advance"; rather, there is complexity and rigidity to change due to procedurality and lack of data independence.
  • "Without needing to write a lot of extra code beyond the SQL itself" is a poor expression of a critical aspect of the RDM: declarative relationally complete data sublanguages hosted by computationally complete programming languages; SQL was intended as such, but is not exactly that.
  • Relational database objects are relations (with attributes defined on simple domains) and tuples, not "tables and rows".
  • Relationships in a conceptual model understood by users are represented by constraints in relational databases and enforced by the DBMS in the database.
  • There are several types of relationships and, hence, of constraints "understood" algorithmically and enforced by the DBMS, which guarantees correctness of the data and query results - logical validity and semantic consistency ("all data is connected to each other, keyed to each piece" is a poor expression of this, and not very enlightening).
  • Transaction management with ACID compliance is a DBMS function that is not part of, and is orthogonal to the RDM -- witness support by non-relational SQL DBMSs.

Note: There are reasons for which true RDBMSs are more amenable transaction support that are beyond the scope of this discussion.

The Answer


RDBMS power comes from the dual formal theoretical foundation (SST/FOPL) of the RDM, and specifically:
  • Nestable set operations of the RA [1];
  • Declarativity of relational data sublanguages [1];
  • Data independence [2,3];
  • System-guaranteed provable correctness [4]. 


Note: I will not publish or respond to anonymous comments. If you have something to say, stand behind it. Otherwise don't bother, it'll be ignored.


References

[1] Pascal, F., Data Sublanguage Parts 1-4

[2] Pascal, F., Physical Independence Parts 1-4

[3] Pascal, F., On View Updating 

[4] Pascal, F., Logical Validity and Semantic Consistency



No comments:

Post a Comment

View My Stats