ON ‘ACID’ AND MULTIPLE ASSIGNMENT
with Chris Date

 

 

 

From: OR

To: Editor

Date: 09/29/2003

 

 

Chris,

 

I have been reading your 8th edition (haven’t read previous editions). You want to change C in ACID to mean correctness instead of consistency. The only place I have found where you try to justify this is on page 449 where you say "we want correctness, not mere consistency".

 

In chapter 9 you say, "The system cannot enforce truth, only consistency" I assume truth here means correctness. So I do not understand how would you want to change the C to correctness which is clearly out of reach of DBMS. Especially when consistency is so well and elegantly defined as being a state that obeys all the integrity constraints.

 

To me it seems that the consistency property is exactly violated inside a transaction and anywhere else everything is kept consistence. So I would rather change ACID to AMIID (Atomic, momentarily inconsistent, isolated, durable). Truth and correctness have no clear meaning without the context of the discourse.

 

As a side note I would not emphasize the correspondence of propositions in database to the state of the real world because there is no consensus about that state. Plus the "real world" seems to be in a state of flux. I think the correspondence theory of truth has quite fundamentalist (in the religious sense) roots. "Real world" could be replaced for example with "world of discourse".

 

How does multiple assignment operator differ from a transaction? Let's take an example:

 

Assume that column a in table tb1 has to be between 1 - 100

 

BEGIN TRANSACTION;

      UPDATE tbl SET a = 250 where id = 1;

      INSERT INTO tbl-2 (select a from tbl where id = 1);

      UPDATE tbl SET a = 90 where id = 1;

COMMIT;

UPDATE tbl SET a = 250 where id = 1,

INSERT INTO tbl-2 (select a from tbl where id = 1),

UPDATE tbl SET a = 90 where id = 1;

 

The transaction would pass because at the transaction boundary everything is consistent.

 

How would the multiple assignment operation behave? What state would the select inside the insert see?

 

Chris Date Responds:  Thanks for your kind remarks!  As you more or less state in your message, your questions all have to do with topics that are discussed at some length in my book AN INTRODUCTION TO DATABASE SYSTEMS, 8th Ed. (referred to hereinafter for convenience as just "the book").  For convenience, I summarize the most important page references from that book here: 

 

·         Correctness vs. consistency: pp 263-265

·         Immediate constraint checking: pp 261,486-488

·         Multiple assignment: pp 124,158,450

 

Regarding the C in ACID, I want it to stand for correctness, not consistency, because--as I explain in the book--consistency is trivial.*  Of course, I do understand that correctness is only a desideratum (it can't be guaranteed), but to say that the C stands for consistency is, to my mind, to say nothing interesting at all.  We've all been hoodwinked over this one for many years!  (Actually I think we've been hoodwinked over the A and the I and the D properties, too, but that's a topic for another day.) 

 

* In a logical sense, that is.  I recognize there might be problems in implementation, but we're concerned here with the model, not the implementation. 

 

You say, "the consistency property is exactly violated inside a transaction."  This is the classic position, of course, but the book explains why I think the classic position has to be rejected.  It's the statement, not the transaction, that has to be "the unit of integrity."  (That's why, if the C stands for consistency, it's trivial.)  Here's your sample transaction: 

 

BEGIN TRANSACTION;

   UPDATE tbl SET a = 250 WHERE id = 1;

   INSERT INTO tbl-2 (select a from tbl where id = 1);

   UPDATE tbl SET a = 90 WHERE id = 1;

COMMIT;

 

Given the constraint that values of column a in table tb1 have to be "between 1 - 100" (by the way, what exactly does "between" mean here?), I would say the first UPDATE must fail, and therefore the overall transaction must fail as well.  (I know that's not how DBMSs typically behave today, but it's precisely part of my argument that today's DBMSs are logically flawed in this area.) 

 

You go on to ask what would happen if we replaced the UPDATE - INSERT - UPDATE sequence (i.e., three separate statements) by the following single statement (a multiple assignment, as you rightly characterize it): 

 

UPDATE tbl SET a = 250 WHERE id = 1,

INSERT INTO tbl-2 (select a from tbl where id = 1),

UPDATE tbl SET a = 90 WHERE id = 1;

 

To cut a long story short, the "a" value that gets inserted into tbl-2 here is whatever it happened to be before the multiple assignment is executed.  I don't know what that value is, but it must be between 1 and 100--whatever that means--and thus certainly can't be 250.  (For the record, the final "a" value for id = 1 in table tb1 will be 90.)  The following, by contrast, will get that 250 into tbl-2: 

 

UPDATE tbl SET a = 250 WHERE id = 1,

INSERT INTO tbl-2 (select 250 from tbl where id = 1),

UPDATE tbl SET a = 90 WHERE id = 1;

 

So, perhaps more realistically, will this two-statement sequence: 

 

SET x = 250;

UPDATE tbl SET a = x WHERE id = 1,

INSERT INTO tbl-2 x,

UPDATE tbl SET a = 90 WHERE id = 1;

 

With my colleague Hugh Darwen, I am currently drafting a paper on this whole topic of multiple assignment: why it's needed and why it has to have the semantics I'm explaining (briefly) here.  We hope to have that paper ready for publication soon. 

 

One last point: You suggest replacing references to "the real world" by references to "the world of discourse."  I'm very sympathetic to this suggestion!  However, "the real world" is one of those terms (like "data independence," for another example) that have become established by long usage, despite the fact that they are sometimes not very apt. Although I'm always fighting for good terminology over bad, I think this particular battle was probably lost years ago.

 

 

Posted 12/05/03

 

 

 

[ABOUT] [QUOTES] [LINKS]