From: PA
To: Editor
I find [your article
in DM Review] to contradict your stated devotion to scientific methods and the
value of theory. You present a single example of denormalization, then proceed
to draw a conclusion about denormalization in general. In addition, the example chosen is not
typical of real world denormalizations.
In order to be
halfway consistent with your own ideals, you would need to present at a
minimum an exhaustive list of the types of denormalizations used in practice,
along with an objective list of the pros and cons of each. I would expect that if this were
undertaken,
you would end up with a more balanced view, and some exceptions to your
black-and-white conclusions.
Of course, to prove
your point scientifically would require far more effort than this, if indeed it
were at all possible to prove or disprove your statements. This brings me to my key point: if your
contention is not falsifiable, it does not belong in the realm of true science
at all; instead, it belongs in the domain of mere opinion and belief.
Please tell us how you
have proved your propositions, or else refrain from claiming that you are
working from a sound scientific foundation and everyone else is somehow
misguided.
Relational algebra has
nothing to say about real-world performance.
From: Fabian Pascal
To: PA
You are confusing formal theory with empirical
theory. In the case of normalization, the theory is formal, not empirical.
To realize your error, please provide any one example of denormalization
for which my arguments in the article do not apply logically.
From: PA
No problem.
I want users to be
able to quickly retrieve total monthly sales for product A. They do this hundreds
of times a month. I create a table
keyed on Year, Month and Product, to hold the total sales. I then update the total as orders are
processed. In a completely normalized
database, the query to get the total would have to read thousands of rows of order
lines, and would be orders of magnitude slower.
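To make the example concrete, here is a minimal sketch of the design described above, using Python's built-in sqlite3 module. The table names (order_lines, monthly_sales), the column names, and the helper functions are hypothetical and chosen only for illustration; amounts are stored as whole numbers to keep the arithmetic exact.

```python
import sqlite3

# In-memory database; all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_lines (
        order_id INTEGER PRIMARY KEY,
        year     INTEGER NOT NULL,
        month    INTEGER NOT NULL,
        product  TEXT    NOT NULL,
        amount   INTEGER NOT NULL
    );
    -- Summary table keyed on Year, Month and Product, holding the running total.
    CREATE TABLE monthly_sales (
        year    INTEGER NOT NULL,
        month   INTEGER NOT NULL,
        product TEXT    NOT NULL,
        total   INTEGER NOT NULL,
        PRIMARY KEY (year, month, product)
    );
""")


def process_order_line(order_id, year, month, product, amount):
    """Record an order line and update the stored total in one transaction."""
    key = (year, month, product)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO order_lines VALUES (?, ?, ?, ?, ?)",
            (order_id, year, month, product, amount),
        )
        cur = conn.execute(
            "UPDATE monthly_sales SET total = total + ? "
            "WHERE year = ? AND month = ? AND product = ?",
            (amount, *key),
        )
        if cur.rowcount == 0:  # first sale for this key: create the row
            conn.execute(
                "INSERT INTO monthly_sales VALUES (?, ?, ?, ?)", (*key, amount)
            )


def total_from_order_lines(year, month, product):
    """Derive the total by reading every matching order line."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM order_lines "
        "WHERE year = ? AND month = ? AND product = ?",
        (year, month, product),
    ).fetchone()
    return row[0]


def total_from_summary(year, month, product):
    """Read the stored total with a single primary-key lookup."""
    row = conn.execute(
        "SELECT total FROM monthly_sales "
        "WHERE year = ? AND month = ? AND product = ?",
        (year, month, product),
    ).fetchone()
    return row[0] if row else 0


process_order_line(1, 2002, 8, "A", 100)
process_order_line(2, 2002, 8, "A", 250)
process_order_line(3, 2002, 8, "B", 40)
```

Both functions return the same figure as long as every change to order_lines goes through process_order_line; the second one simply reads one row instead of many.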
Also, I would
appreciate it if you could explain how my arguments do not apply to formal
theory.
From: Fabian Pascal
First of all, as I guessed, like so many practitioners, you
do not understand what normalization is. Your example is one of storing derived
data--a form of redundancy different from redundancy due to
denormalization. If you read the chapter on redundancy in my book, you
will see that I have separate sections for denormalization and for derived
data. What is more, your example seems to refer to historic data, which
are not updated, and hence redundancy is not an issue.
My arguments apply logically to any redundancy
of data that is being updated, including your example. The only
reason you may get better performance is because you trade integrity off for
it and ignore the risk of inconsistency. Now, if practitioners knew
and understood this, and consciously decided to give that up for
performance, I would still worry, but if that's their choice, fine. The problem
is that the vast majority is completely unaware of the integrity risk and
ignores it when they denormalize, thinking that they get performance for free.
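The integrity risk can be illustrated with a minimal sketch, assuming a hypothetical order_lines table and a stored-total table monthly_sales keyed on year, month and product (all names here are illustrative, not taken from any real system). One update that bypasses the stored total is enough to leave the two silently in disagreement; the DBMS raises no error unless a constraint or a check such as the query below is put in place.

```python
import sqlite3

# Hypothetical schema: detail rows plus a redundantly stored monthly total.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_lines (
        order_id INTEGER PRIMARY KEY,
        year INTEGER, month INTEGER, product TEXT, amount INTEGER
    );
    CREATE TABLE monthly_sales (
        year INTEGER, month INTEGER, product TEXT, total INTEGER,
        PRIMARY KEY (year, month, product)
    );
    INSERT INTO order_lines VALUES (1, 2002, 8, 'A', 100), (2, 2002, 8, 'A', 250);
    INSERT INTO monthly_sales VALUES (2002, 8, 'A', 350);
""")


def inconsistent_rows():
    """Return stored totals that disagree with the totals derived from order_lines."""
    return conn.execute("""
        SELECT s.year, s.month, s.product, s.total,
               COALESCE(SUM(o.amount), 0) AS derived
        FROM monthly_sales AS s
        LEFT JOIN order_lines AS o
               ON o.year = s.year AND o.month = s.month AND o.product = s.product
        GROUP BY s.year, s.month, s.product, s.total
        HAVING s.total <> COALESCE(SUM(o.amount), 0)
    """).fetchall()


before = inconsistent_rows()  # stored and derived totals agree

# A correction applied to the detail row only; the stored total is not touched.
conn.execute("UPDATE order_lines SET amount = 200 WHERE order_id = 2")

after = inconsistent_rows()  # the stored total (350) no longer matches the data (300)
```

Nothing in this sketch stops the second state from persisting; keeping the redundant total correct is left entirely to whoever writes the update code.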
Regarding your statement that "Relational algebra has
nothing to say about real-world performance", my article says exactly
that: normalization/denormalization are logical and cannot
possibly affect performance, which is physical by definition.
What this means is that if you get bad performance, it is not due to your
logical design, but to the physical implementation of your database and
DBMS, as well as other implementation factors. Your problem is that, like so
many, you confuse logical and physical levels and this is so entrenched in your
mind that even an article which makes every effort to disabuse you of such
confusion cannot get through.
The distinction between empirical and formal theory goes well
beyond databases and computers--it requires an understanding of science, and
the difference between the two kinds of theory is not something that can be
explained and learned via email. If this is of interest to you, I suggest you
educate yourself on the subject, particularly if you want to engage in public
discussion on it.
Posted 08/23/02