Relational database design is one of the most egregiously
abused aspects of data management. Despite having been repeatedly debunked,
“denormalization for performance” arguments continue to sway practitioners,
experienced and novice alike. This is an illusion that costs dearly. It reveals
a poor understanding of normalization, its advantages and practical purposes, and
unawareness of the severe costs of undernormalization, which are almost
completely ignored, even by those who profess to be experts. Poor knowledge of
such data fundamentals is both a major reason for and a consequence of SQL DBMS
deficiencies and of technology regressions such as ODBMS, OLAP, and XML that have
come to haunt data management.
Even if current products, which are far from being relational
or good DBMS implementations, did perform better with denormalized databases,
denormalization would still be hard to justify: once its integrity
consequences are taken into account, they cancel out the
performance gains, if any.
The focus of this seminar is, by contrast, on understanding
normalization, its practical benefits, and the heavy costs of violating it for
no good reason.
OBJECTIVES
· Explains normal forms
· Documents the costs of undernormalization
· Exposes “denormalization for performance” as a
dangerous illusion, explains why it is sometimes imposed by DBMS products, and
outlines the real solution to performance maximization
· Debunks some prevalent misconceptions to demonstrate
how a proper understanding of data fundamentals helps see through industry
practices and pronouncements and avoid costly mistakes
· Offers practical recommendations
OUTLINE
· INTRODUCTION
· SOME FUNDAMENTALS
    · First Normal Form
    · Further Normalization
· UNDERSTANDING NORMALIZATION
    · What Meaning Means
    · Functional Dependencies
    · Full Normalization
    · Implications
    · An Informal View
    · Undernormalization Costs
    · “Unbundling”
    · Design Repair
    · Dependency Redundancy
· FURTHER NORMALIZATION
    · Functional Dependencies (2NF-BCNF)
    · The "Whole Key" and 2NF
    · "Nothing But The Key" and 3NF
    · BCNF
    · Generalized Dependencies (4NF-5NF)
    · Multi-determined Dependencies
    · Characteristics
    · Multivalued Dependencies and 4NF
    · Join Dependencies and 5NF
    · A Way to Remember
· DATABASE DESIGN
    · Rules of Thumb
    · Good Design
· "DENORMALIZATION FOR PERFORMANCE"
    · The Argument
    · The Logical-Physical Confusion
    · Assessing the Claim
    · Database Bias
    · Redundancy Control
    · The Real Problem & Solution
· MISCONCEPTIONS DEBUNKED
· CONCLUSION AND RECOMMENDATIONS
    · Poor Foundation Knowledge
    · Get Terminology Straight
    · Don’t Misplace Blame
    · One Design Option
    · Stop the Illusion
AUDIENCE
Anybody involved in data management, technical or
non-technical. Some data management background may be helpful.
The target audience includes (but is not limited to):
· DBMS designers, implementers, and other vendor
personnel
· Database consultants
· Data and database administrators
· Product evaluators, acquirers and deployers
· IT managers
· Information modelers and database designers
· Application developers and deployers
· Data warehouse implementers
· Members of the trade media covering data management
· Academics specializing in data management topics
· Students, graduate and undergraduate
DOCUMENTATION
Workbook containing the instructor’s slides and the DATABASE FOUNDATIONS papers that serve
as text for this seminar.
INSTRUCTOR
Fabian Pascal has a national and international reputation as an independent
technology analyst, consultant, author, and instructor of seminars, specializing
in data management. He was affiliated with Codd & Date and for 20 years
held various analytical and management positions in the private and public
sectors, has taught and lectured at the business and academic levels, and
advised vendor and user organizations on data management technology, strategy
and implementation. Clients included IBM, Census Bureau, CIA, Apple, Borland,
Cognos, UCSF, and IRS. He is founder, editor and publisher of DATABASE DEBUNKINGS, a web site
dedicated to dispelling persistent fallacies, flaws, myths and misconceptions
prevalent in the IT industry. The site publishes his own and C. J. Date’s
papers, as well as the recently re-launched PRACTICAL
DATABASE FOUNDATIONS series, dedicated to explaining the
fundamentals of data management to IT practitioners. Author of three books, he has published
extensively in outlets including DM Review, Database Programming and Design,
DBMS, Byte, Infoworld and Computerworld. He is
author of the contrarian columns Against the Grain, Setting Matters Straight, and Test Your Foundation Knowledge, as
well as a column for the Dutch DB/M magazine.