THE COSTLY ILLUSION
NORMALIZATION, INTEGRITY AND PERFORMANCE

 

 

 

OVERVIEW

 

Relational database design is one of the most egregiously abused aspects of data management. Despite having been repeatedly debunked, “denormalization for performance” arguments continue to sway practitioners, experienced and novice alike. This is an illusion that costs dearly. It reveals a poor understanding of normalization, its advantages, and its practical purposes, as well as unawareness of the severe costs of undernormalization, which are almost completely ignored even by those who profess to be experts. Poor knowledge of such data fundamentals is both a major cause and a consequence of SQL DBMS deficiencies and of technology regressions such as ODBMS, OLAP, and XML that have come to haunt data management.

 

Even if current products (which are far from being relational or good DBMS implementations) did perform better with denormalized databases, denormalization would still be hard to justify: once its integrity consequences are taken into account, they cancel out the performance gains, if any.

 

The focus of this seminar is, by contrast, on understanding normalization, its practical benefits, and the heavy costs of violating it for no good reason.
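
As a small illustration of the integrity cost of denormalization mentioned above, the following sketch uses Python's built-in sqlite3 module. The tables (orders, customers), columns, and data are hypothetical and are not taken from the seminar materials, and SQLite is used only because it ships with Python. In the denormalized design the customer's city is repeated in every order row, so an update that touches only one row leaves the database contradicting itself. In the normalized design the city is recorded once, so the same change cannot produce the contradiction.

```python
import sqlite3


def cities_after_update_denormalized():
    """Customer city is stored redundantly in every order row (hypothetical design)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        " order_id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL,"
        " customer_city TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, 10, "Oslo"), (2, 10, "Oslo")],
    )
    # The update reaches only one of the two redundant copies.
    conn.execute("UPDATE orders SET customer_city = 'Bergen' WHERE order_id = 1")
    rows = conn.execute(
        "SELECT DISTINCT customer_city FROM orders"
        " WHERE customer_id = 10 ORDER BY customer_city"
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


def cities_after_update_normalized():
    """Customer city is stored exactly once, in its own table."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE customers ("
        " customer_id INTEGER PRIMARY KEY,"
        " city TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        " order_id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL REFERENCES customers(customer_id))"
    )
    conn.execute("INSERT INTO customers VALUES (10, 'Oslo')")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10), (2, 10)])
    # A single update changes the one and only recorded city.
    conn.execute("UPDATE customers SET city = 'Bergen' WHERE customer_id = 10")
    rows = conn.execute(
        "SELECT DISTINCT c.city FROM orders AS o"
        " JOIN customers AS c ON c.customer_id = o.customer_id"
        " WHERE o.customer_id = 10 ORDER BY c.city"
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print("denormalized:", cities_after_update_denormalized())
    print("normalized:  ", cities_after_update_normalized())
```

To rule out the contradiction in the denormalized design, an additional constraint would have to be declared and checked on every update. That checking is the redundancy-control cost that “denormalization for performance” arguments tend to leave out.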

 

 

OBJECTIVES

 

·   Explains normal forms

·   Documents costs of undernormalization

·   Exposes “denormalization for performance” as a dangerous illusion, explains why it is sometimes imposed by DBMS products, and outlines the real solution to performance maximization

·   Debunks some prevalent misconceptions, to demonstrate how a proper understanding of data fundamentals helps see through industry practices and pronouncements and avoid costly mistakes

·   Offers practical recommendations

 

 

OUTLINE

 

Ø       INTRODUCTION

 

Ø       SOME FUNDAMENTALS

·         First Normal Form

·         Further Normalization

 

Ø       UNDERSTANDING NORMALIZATION

·         What Meaning Means

·         Functional Dependencies

·         Full Normalization

·         Implications

·         An Informal View

·         Undernormalization Costs

·         “Unbundling”

·         Design Repair

·         Dependency Redundancy

 

Ø       FURTHER NORMALIZATION

·         Functional Dependencies (2NF-BCNF)

·   The "Whole Key" and 2NF

·   "Nothing But the Key" and 3NF

·   BCNF

·         Generalized Dependencies (4NF-5NF)

·   Multi-determined Dependencies

·   Characteristics

·   Multivalued Dependencies and 4NF

·   Join Dependencies and 5NF

·         A Way to Remember

 

Ø       DATABASE DESIGN

·         Rules of Thumb

·         Good Design

 

Ø       "DENORMALIZATION FOR PERFORMANCE"

·         The Argument

·         The Logical-Physical Confusion

·         Assessing the Claim

·   Database Bias

·   Redundancy Control

·         The Real Problem & Solution

 

Ø       MISCONCEPTIONS DEBUNKED

 

Ø       CONCLUSION AND RECOMMENDATIONS

·         Poor Foundation Knowledge

·         Get Terminology Straight

·         Don’t Misplace Blame

·         One Design Option

·         Stop the Illusion

 

 

AUDIENCE  

 

Anyone involved in data management, technical or non-technical. Some data management background is helpful but not required. The target audience includes (but is not limited to):

§   DBMS designers, implementers, and other vendor personnel

§   Database consultants

§   Data and database administrators

§   Product evaluators, acquirers and deployers

§   IT managers

§   Information modelers and database designers

§   Application developers and deployers

§   Data warehouse implementers

§   Members of the trade media covering data management

§   Academics specializing in data management topics

§   Students, graduate and undergraduate

 

 

DOCUMENTATION

 

Workbook containing the instructor’s slides and the DATABASE FOUNDATIONS papers that serve as text for this seminar.

 

 

INSTRUCTOR

 

Fabian Pascal has a national and international reputation as an independent technology analyst, consultant, author, and instructor of seminars, specializing in data management. He was affiliated with Codd & Date, and for 20 years he held various analytical and management positions in the private and public sectors, taught and lectured at the business and academic levels, and advised vendor and user organizations on data management technology, strategy, and implementation. Clients have included IBM, Census Bureau, CIA, Apple, Borland, Cognos, UCSF, and IRS. He is founder, editor, and publisher of DATABASE DEBUNKINGS, a web site dedicated to dispelling persistent fallacies, flaws, myths, and misconceptions prevalent in the IT industry. The site publishes his own and C. J. Date’s papers, as well as the recently re-launched PRACTICAL DATABASE FOUNDATIONS series, dedicated to explaining the fundamentals of data management to IT practitioners. Author of three books, he has published extensively in publications including DM Review, Database Programming and Design, DBMS, Byte, InfoWorld, and Computerworld. He is author of the contrarian columns Against the Grain, Setting Matters Straight, and Test Your Foundation Knowledge, as well as a column for the Dutch DB/M magazine.