From: OR
To: Editor
Date: 09/29/2003
Chris,
I have been reading your 8th edition (haven’t read previous
editions). You want to change C in ACID to mean correctness instead of
consistency. The only place I have found where you try to justify this is on
page 449 where you say "we want correctness, not mere consistency".
In chapter 9 you say, "The system cannot enforce truth,
only consistency" I assume truth here means correctness. So I do not
understand how would you want to change the C to correctness which is clearly
out of reach of DBMS. Especially when consistency is so well and elegantly
defined as being a state that obeys all the integrity constraints.
To me it seems that the consistency property is exactly
violated inside a transaction and anywhere else everything is kept consistence.
So I would rather change ACID to AMIID (Atomic, momentarily inconsistent,
isolated, durable). Truth and correctness have no clear meaning without the
context of the discourse.
As a side note I would not emphasize the correspondence of
propositions in database to the state of the real world because there is no
consensus about that state. Plus the "real world" seems to be in a
state of flux. I think the correspondence theory of truth has quite
fundamentalist (in the religious sense) roots. "Real world" could be
replaced for example with "world of discourse".
How does multiple assignment operator differ from a
transaction? Let's take an example:
Assume that column a in table tb1 has to be between 1 - 100
BEGIN TRANSACTION;
UPDATE tbl SET a =
250 where id = 1;
INSERT INTO tbl-2
(select a from tbl where id = 1);
UPDATE tbl SET a =
90 where id = 1;
COMMIT;
UPDATE tbl SET a = 250 where id = 1,
INSERT INTO tbl-2 (select a from tbl where id = 1),
UPDATE tbl SET a = 90 where id = 1;
The transaction would pass because at the transaction
boundary everything is consistent.
How would the multiple assignment operation behave? What
state would the select inside the insert see?
Chris Date Responds:
Thanks for your kind remarks! As
you more or less state in your message, your questions all have to do with
topics that are discussed at some length in my book AN INTRODUCTION TO DATABASE
SYSTEMS, 8th Ed. (referred to hereinafter for convenience as just
"the book"). For convenience,
I summarize the most important page references from that book here:
·
Correctness vs. consistency: pp 263-265
·
Immediate constraint checking: pp 261,486-488
·
Multiple assignment: pp 124,158,450
Regarding the C in ACID, I want it to stand for correctness,
not consistency, because--as I explain in the book--consistency is trivial.* Of course, I do
understand that correctness
is only a desideratum (it can't be guaranteed), but to say that the C stands
for consistency is, to my mind, to say nothing interesting at all. We've all been hoodwinked over this one for
many years! (Actually I think we've
been hoodwinked over the A and the I and the D properties, too, but that's a
topic for another day.)
* In a logical sense, that is. I recognize there might be problems in
implementation, but we're concerned here with the model, not the
implementation.
You say, "the consistency property is exactly violated
inside a transaction." This is the
classic position, of course, but the book explains why I think the classic
position has to be rejected. It's the statement,
not the transaction, that has to be "the unit of integrity." (That's why, if the C stands for
consistency, it's trivial.) Here's your
sample transaction:
BEGIN TRANSACTION;
UPDATE tbl SET a = 250 WHERE id = 1;
INSERT INTO tbl-2
(select a from tbl where id = 1);
UPDATE tbl SET a = 90
WHERE id = 1;
COMMIT;
Given the constraint that values of column a in table tb1
have to be "between 1 - 100" (by the way, what exactly does
"between" mean here?), I would say the first UPDATE must fail, and
therefore the overall transaction must fail as well. (I know that's not how DBMSs typically behave today, but
it's
precisely part of my argument that today's DBMSs are logically flawed in this
area.)
You go on to ask what would happen if we replaced the UPDATE
- INSERT - UPDATE sequence (i.e., three separate statements) by the following
single statement (a multiple assignment, as you rightly characterize it):
UPDATE tbl SET a = 250 WHERE id = 1,
INSERT INTO tbl-2 (select a from tbl where id = 1),
UPDATE tbl SET a = 90 WHERE id = 1;
To cut a long story short, the "a" value that gets
inserted into tbl-2 here is whatever it happened to be before the multiple
assignment is executed. I don't know
what that value is, but it must be between 1 and 100--whatever that means--and
thus certainly can't be 250. (For the
record, the final "a" value for id = 1 in table tb1 will be 90.) The following, by contrast, will
get that
250 into tbl-2:
UPDATE tbl SET a = 250 WHERE id = 1,
INSERT INTO tbl-2 (select 250 from tbl where id = 1),
UPDATE tbl SET a = 90 WHERE id = 1;
So, perhaps more realistically, will this two-statement
sequence:
SET x = 250;
UPDATE tbl SET a = x WHERE id = 1,
INSERT INTO tbl-2 x,
UPDATE tbl SET a = 90 WHERE id = 1;
With my colleague Hugh Darwen, I am currently drafting a
paper on this whole topic of multiple assignment: why it's needed and why it
has to have the semantics I'm explaining (briefly) here. We hope to have that paper ready for
publication soon.
One last point: You suggest replacing references to "the
real world" by references to "the world of discourse." I'm very sympathetic to this
suggestion! However, "the real world" is one of those terms
(like "data independence," for another example) that have become established by long usage, despite
the fact that they are sometimes not very apt. Although I'm always fighting for good terminology over bad, I
think this particular battle was probably lost years ago.
Posted
12/05/03
[ABOUT]
[QUOTES]
[LINKS]