Monday, July 20, 2020

Oldies But Goodies: Data Independence and "Physical Denormalization"

Note: I am re-publishing some of the articles and reader exchanges from the old DBDebunk (2000-06). How well do they hold up -- have industry knowledge and practice progressed? Judge for yourself and appreciate the difference between a sound foundation and the fad-driven cookbook approach.

January 2, 2001

ML: ... one of the "4 great lies" is "I denormalize for performance." You state that normalization is a logical concept and, since performance is a physical concept, denormalization for performance reasons is impossible (i.e., it doesn't make sense). What term would you use to describe changing the physical database design to be different from the logical design to enhance performance? Because normalization is a logical concept, you imply that this is not called denormalization.


DBDebunk was maintained and kept free with the proceeds from my @AllAnalitics column. The site was discontinued in 2018. The content here is not available anywhere else, so if you deem it useful, particularly if you are a regular reader, please help upkeep it by purchasing publications, or donating. On-site seminars and consulting are available.Thank you.

- 01/14/20 Updated the LINKS page
- 01/04/20 Updated the POSTS page with the 2020 posts
- 12/08/19 Added two educational references on set theory to the LINKS page.
- 08/19 Logical Symmetric Access, Data Sub-language, Kinds of Relations, Database Redundancy and Consistency, paper #2 in the new UNDERSTANDING THE REAL RDM series.
- 02/18 The Key to Relational Keys: A New Understanding, a new edition of paper #4 in the PRACTICAL DATABASE FOUNDATIONS series.
- 04/17 Interpretation and Representation of Database Relations, paper #1 in the new UNDERSTANDING THE REAL RDM series.
- 10/16 THE DBDEBUNK GUIDE TO MISCONCEPTIONS ABOUT DATA FUNDAMENTALS, my latest book (reviewed by Craig Mullins, Todd Everett, Toon Koppelaars, Davide Mauri).
- To work around Blogger limitations, the labels are mostly abbreviations or acronyms of the terms listed on the FUNDAMENTALS page. For detailed instructions on how to understand and use the labels in conjunction with the that page, see the ABOUT page. The 2017 and 2016 posts, including earlier posts rewritten in 2017 were relabeled accordingly. As other older posts are rewritten, they will also be relabeled. For all other older posts use Blogger search.
- The links to my columns there no longer work. I moved only the 2017 columns to dbdebunk, within which only links to sources external to AllAnalytics may work or not.
I deleted my Facebook account. You can follow me:
- @DBDdebunk on Twitter: will link to new posts to this site, as well as To Laugh or Cry? and What's Wrong with This Picture? posts, and my exchanges on LinkedIn.
- @The PostWest blog: Evidence for Antisemitism/AntiZionism – the only universally acceptable hatred – as the (traditional) response to the existential crisis of decadence and decline of Western (including the US)
- @ThePostWest Twitter page where I comment on global #Antisemitism/#AntiZionism and the Arab-Israeli conflict.

DBDebunk: What I think you are referring to is data independence: it's not that the logical design is "different" than the physical -- it is trivially so by definition -- but rather that the logical is insulated of [and thus independent of] the physical. That means that it should be possible to make any physical changes deemed necessary for performance without affecting what users see at the logical level.

Chris Date calls [those physical changes] "physical denormalization". While I understand what he means, I don't particularly like the term because full normalization is a set of [logical] design principles based on column dependencies ... I call such changes simply physical [re-design/re-implementation].

ML: Aha! I now understand what data independence means. Perhaps this is where views come in ... views facilitate the insulation of logical from physical. I have never used views before, nor have I had a notion of their practicality other than security (until now).

DBDebunk: Well, yes and no. Views insulate [users and] applications from [certain] non-loss logical changes made to base tables [schema changes]...

Comments on Republication

  • Normalization (to 1NF) refers to elimination of nested relations; further normalization beyond 1NF (to 5NF) refers to ensuring that all non-key attributes are functionally dependent on the primary key (i.e., a database relation represents (facts about) entities of a single type. A relation is in 5NF by definition, otherwise it is not a relation.
  • "Physical denormalization" would contribute to logical-physical confusion (LPC).
  • First generation SQL DBMSs had a "direct image" implementation: data of each SQL base table (which are not relations!) was stored in a single physical file and it was in this sense that the physical was considered "the same as" the logical. Subsequently this 1:1 correspondence was eliminated, hence physical "different" from the logical.
  • I do no longer substitute R-table, rows and columns for relation, tuples and attributes and explicitly state that the former are only visualizations of the latter that do not play any role in the RDM.
  • Security should be achieved via the security mechanism, not views.

No comments:

Post a Comment

View My Stats