UPDATING VIEWS PART 5: JOINS AND OTHER VIEWS
by Chris Date and David McGoveran

 

 

 

(Continued from Part 3)

 

 

5. Updating Joins

 

Most previous treatments of the view update problem--including those by one of the present authors (Date)--have argued that the updatability or otherwise of a given join depends, at least in part, on whether the join is one-to-one, one-to-many, or many-to-many.  By contrast, it is the contention of the present paper that joins are always updatable.  Moreover, the update rules are identical in all three cases, and are essentially quite straightforward.

 

What makes this claim plausible--startling though it might seem at first sight--is the new perspective on the problem that is afforded by adoption of the fundamental principle stated at the beginning of the "Preliminaries" section in this series.  Broadly speaking, the overall objective of support for the view mechanism has always been to make views look as much like base tables as possible, and this objective is indeed a laudable one.  With regard to view update specifically, however:

 

Ø       It is usually assumed (implicitly) that it is always possible to update an individual row of a base table independently of all the other rows in that base table.

 

Ø       By contrast, it is manifestly not always possible to update an individual row of a view independently of all the other rows in that view.  For example, Codd shows that it is not possible to delete just one row from a certain join, because the effect would be to leave a table that "is not the join of any two tables whatsoever" (which means that the result could not possibly satisfy the table predicate for the view).  And the approach to such view updates historically has always been to reject them altogether, on the grounds that it is impossible to make them look completely like base table updates. 

 

Our approach is rather different.  We recognize the fact that even with a base table, it is not always possible to update individual rows independently of all the rest.  (Consider what would happen, for example, if the suppliers base table were subject to the constraint that either suppliers S1 and S4 both appear or neither of them does.)  Typically, therefore, we accept those view updates that have historically been rejected, interpreting them in an obvious and logically correct way to apply to the underlying table(s); we accept them, moreover, in full recognition of the fact that updating those underlying tables might well have side-effects on the view--side-effects that are, however, required in order to avoid the possibility that the view might violate its own table predicate.

 

With that preamble out of the way, let us now get down to detail.  In what follows, we first define our terms.  Then we present the update rules for joins.  Then we consider the implications of those rules for each of the three cases (one-to-one, one-to-many, many-to-many) in turn.

 

First of all, then, we take the term "join" to mean natural join specifically.  Let the columns of table A be partitioned into two disjoint groups, X and Y say.  Likewise, let the columns of table B be partitioned into two disjoint groups, Y and Z say.  Now suppose that the columns of Y (only) are common to the two tables, so that the columns of X are "the other columns" of A and the columns of Z are "the other columns" of B.  Suppose also that corresponding columns of Y (i.e., columns with the same name) are defined on the same domain.  Finally, regard each of X, Y, and Z as a single composite column.  Then the expression

 

A JOIN B

 

yields a table with columns {X,Y,Z} consisting of all rows <x,y,z> such that the row <x,y> appears in A and the row <y,z> appears in B.  The table predicate PJ for J = A JOIN B is thus

 

PA ( a ) AND PB ( b )

 

where for a given row j of the join, a is the "A-portion" of j (i.e., the row that is derived from j by projecting away the value j.Z) and b is the "B-portion" of j (i.e., the row that is derived from j by projecting away the value j.X).  In other words:

 

"Every row in the join is such that the A-portion satisfies PA and the B-portion satisfies PB."

 

For example, the table predicate for the join of tables S and SP over S# is as follows:

 

"Every row <s#,sname,status,city,p#,qty> in the join is such that the row <s#,sname,status,city> satisfies the table predicate for S and the row <s#,p#,qty> satisfies the table predicate for SP."

 

Here then are the update rules for J = A JOIN B:

 

Ø       INSERT:  The new row j must satisfy PJ.  If the A-portion of j does not appear in A, it is inserted into A (Note that this INSERT might have the side-effect of inserting the B-portion into B also, as with INSERTs on, e.g., unions or intersections. Analogous remarks apply to the DELETE and UPDATE rules also; for brevity, we do not bother to spell out all such possibilities here. If the B-portion of j does not appear in B, it is inserted into B.

 

Note:  The specific procedural manner in which the foregoing rule is stated ("insert into A, then insert into B") should be understood purely as a pedagogical device; it should not be taken to mean that the DBMS would execute exactly that procedure in practice.  Indeed, the principle of symmetry--No. 5 from the "Preliminaries" section--implies as much, because neither A nor B has precedence over the other.  Analogous remarks apply to all of the other rules below.

 

Ø       DELETE:  The A-portion of the row to be deleted is deleted from A and the B-portion is deleted from B.

 

Ø       UPDATE:  The row to be updated must be such that the updated version satisfies PJ.  The A-portion is deleted from A, without performing any triggered actions or table predicate checks, and the B-portion is deleted from B, again without performing any triggered actions or table predicate checks.  Then, if the A-portion of the updated version of the row does not appear in A, it is inserted into A; if the B-portion does not appear in B, it is inserted into B.

 

Let us now examine the implications of these rules for the three different cases. 

 

Case 1 (one-to-one):

 

The term "one-to-one" here would more accurately be "(one-or-zero)-to-(one-or-zero)."  In other words, there is a DBMS-known integrity constraint in effect that guarantees that for each row of A there is at most one matching row in B and vice versa.  More precisely, the set of columns Y over which the join is performed must include a subset (not necessarily a proper subset) K, say, such that K is a candidate key for A and a candidate key for B.

 

Examples: 

 

Ø       For a first example, the reader is invited to consider the effect of the foregoing rules on the join of the suppliers table S to itself over supplier numbers (only).

 

Ø       By way of a second example, suppose the suppliers-and-parts database includes another base table, SR { S#, REST }, where S# identifies a supplier and REST identifies that supplier's favorite restaurant.  Assume that not all suppliers in table S are represented in table SR.  The reader is invited to consider the effect of the foregoing rules on the join of tables S and SR (over S#).  What difference would it make if a given supplier could be represented in table SR and not in table S?

 

Case 2 (one-to-many):

 

The term "one-to-many" here would more accurately be "(zero-or-one)-to-(zero-or-more)."  In other words, there is a DBMS-known integrity constraint in effect that guarantees that for each row of B there is at most one matching row in A.  Typically, what this means is that the set of "common columns" Y over which the join is performed must include a subset (not necessarily a proper subset) K, say, such that K is a candidate key for A and a matching foreign key for B (If this is in fact the case, and if (as we would strongly recommend) that foreign key has "nulls not allowed," we can replace the phrase "zero or one" by "exactly one." 

 

Examples: 

 

Let view SSP be defined as

 

S JOIN SP

 

(this is a foreign-key-to-matching-candidate-key join, of course).  Sample values are shown in Fig. 6.

 

SSP

+-----------------------------------------+

¦ S# ¦ SNAME ¦ STATUS ¦ CITY   ¦ P# ¦ QTY ¦

+----+-------+--------+--------+----+-----¦

¦ S1 ¦ Smith ¦     20 ¦ London ¦ P1 ¦ 300 ¦

¦ S1 ¦ Smith ¦     20 ¦ London ¦ P2 ¦ 200 ¦

¦ S1 ¦ Smith ¦     20 ¦ London ¦ P3 ¦ 400 ¦

¦ S1 ¦ Smith ¦     20 ¦ London ¦ P4 ¦ 200 ¦

¦ S1 ¦ Smith ¦     20 ¦ London ¦ P5 ¦ 100 ¦

¦ S1 ¦ Smith ¦     20 ¦ London ¦ P6 ¦ 100 ¦

¦ S2 ¦ Jones ¦     10 ¦ Paris  ¦ P1 ¦ 300 ¦

¦ S2 ¦ Jones ¦     10 ¦ Paris  ¦ P2 ¦ 400 ¦

¦ S3 ¦ Blake ¦     30 ¦ Paris  ¦ P2 ¦ 200 ¦

¦ S4 ¦ Clark ¦     20 ¦ London ¦ P2 ¦ 200 ¦

¦ S4 ¦ Clark ¦     20 ¦ London ¦ P4 ¦ 300 ¦

¦ S4 ¦ Clark ¦     20 ¦ London ¦ P5 ¦ 400 ¦

+-----------------------------------------+

Fig. 6: View SSP (sample values)

 

 

An attempt to insert the row <S4,Clark,20,London,P6,100> into SSP will succeed, and will have the effect of inserting the row <S4,P6,100> into table SP (thereby adding a row to the view).

 

An attempt to insert the row <S5,Adams,30,Athens,P6,100> into SSP will succeed, and will have the effect of inserting the row <S5,P6,100> into table SP (thereby adding a row to the view).

 

An attempt to insert the row <S6,Green,20,London,P6,100> into SSP will succeed, and will have the effect of inserting the row <S6,Green,20,London> into table S and the row <S6,P6,100> into table SP (thereby adding a row to the view).

 

Note:  Suppose for the moment that it is possible for SP rows to exist without a corresponding S row.  Suppose moreover that table SP already includes some rows with supplier number S6 (but not one with supplier number S6 and part number P1).  Then the INSERT in the example just discussed will have the effect of inserting some additional rows into the view--namely, the join of the row <S6,Green,20,London> and those previously existing SP rows for supplier S6. 

 

·   An attempt to insert the row <S4,Clark,20,Athens,P6,100> into SSP will fail (why?).

·   An attempt to insert the row <S5,Adams,30,London,P6,100> into SSP will fail (why?).

·   An attempt to insert the row <S1,Smith,20,London,P1,400> into SSP will fail (why?).

·   An attempt to delete the row <S3,Blake,30,Paris,P2,200> from SSP will succeed, and will have the effect of deleting the row <S3,Blake,30,Paris> from table S and the row <S3,P2,200> from table SP.

·   An attempt to delete the row <S1,Smith,20,London,P1,300> from SSP will "succeed" (see the note below) and will have the effect of deleting the row <S1,Smith,20,London> from table S and the row <S1,P1,300> from table SP.

 

Note:  Actually the overall effect of this attempted DELETE will depend on the foreign key delete rule from SP.S# to S.S#.  If the rule is RESTRICT the overall operation will fail.  If it is CASCADE it will have the side-effect of deleting all other SP rows for supplier S1 as well.  Other possibilities are left as an exercise for the reader. 

 

·   An attempt to update the SSP row <S1,Smith,20,London,P1,300> to <S1,Smith,20,London,P1,400> will succeed, and will have the effect of updating the SP row <S1,P1,300> to <S1,P1,400>.

·   An attempt to update the SSP row <S1,Smith,20,London,P1,300> to <S1,Smith,20,Athens,P1,400> will succeed, and will have the effect of updating the S row <S1,Smith,20,London> to <S1,Smith,20,Athens> and the SP row <S1,P1,300> to <S1,P1,400>.

·   An attempt to update the SSP row <S1,Smith,20,London,P1,300> to <S6,Smith,20,London,P1,300> will "succeed" (see the note below) and will have the effect of updating the S row <S1,Smith,20,London> to <S6,Smith,20,London> and the SP row <S1,P1,300> to <S6,P1,300>.

 

Note:  Actually, the overall effect of this attempted update will depend on the foreign key update rule from SP.S# to S.S#.  The details are left as another exercise for the reader.

 

Case 3 (many-to-many):

 

The term "many-to-many" here would more accurately be "(zero-or-more)-to-(zero-or-more)."  In other words, there is no DBMS-known integrity constraint in effect that guarantees that we are really dealing with a Case 1 or Case 2 situation. 

 

Examples: 

 

Let view SCP be defined as

 

S JOIN P

 

(join of S and P over CITY--a many-to-many join).  Sample values are shown in Fig. 7.

 

SC

+------------------------------------------------------------+

| S# ¦ SNAME ¦ STATUS ¦ CITY   ¦ P# ¦ PNAME ¦ COLOR ¦ WEIGHT ¦

+----+-------+--------+--------+----+-------+-------+--------¦

¦ S1 ¦ Smith ¦     20 ¦ London ¦ P1 ¦ Nut   ¦ Red   ¦     12 ¦

¦ S1 ¦ Smith ¦     20 ¦ London ¦ P4 ¦ Screw ¦ Red   ¦     14 ¦

¦ S1 ¦ Smith ¦     20 ¦ London ¦ P6 ¦ Cog   ¦ Red   ¦     19 ¦

¦ S2 ¦ Jones ¦     10 ¦ Paris  ¦ P2 ¦ Bolt  ¦ Green ¦     17 ¦

¦ S2 ¦ Jones ¦     10 ¦ Paris  ¦ P5 ¦ Cam   ¦ Blue  ¦     12 ¦

¦ S3 ¦ Blake ¦     30 ¦ Paris  ¦ P2 ¦ Bolt  ¦ Green ¦     17 ¦

¦ S3 ¦ Blake ¦     30 ¦ Paris  ¦ P5 ¦ Cam   ¦ Blue  ¦     12 ¦

¦ S4 ¦ Clark ¦     20 ¦ London ¦ P1 ¦ Nut   ¦ Red   ¦     12 ¦

¦ S4 ¦ Clark ¦     20 ¦ London ¦ P4 ¦ Screw ¦ Red   ¦     14 ¦

¦ S4 ¦ Clark ¦     20 ¦ London ¦ P6 ¦ Cog   ¦ Red   ¦     19 ¦

+------------------------------------------------------------+

Fig. 7: The join of S and P over CITY

 

 

·   Inserting the row <S7,Brown,15,Oslo,P8,Wheel,White,25> will succeed, and will have the effect of inserting the row <S7,Brown,15,Oslo> into table S and the row <P8,Wheel,White,25,Oslo> into table P (thereby adding the specified row to the view).

·   Inserting the row <S1,Smith,20,London,P7,Washer,Red,5> will succeed, and will have the effect of inserting the row <P7,Washer,Red,5,London> into table P (thereby adding two rows to the view--the row <S1,Smith,20,London,P7,Washer,Red,5> (as specified) and the row <S4,Clark,20,London,P7,Washer,Red,5>).

·   Inserting the row <S6,Green,20,London,P7,Washer,Red,5> will succeed, and will have the effect of inserting the row <S6,Green,20,London> into table S and the row <P7,Washer,Red,5,London> into table P (thereby adding six rows to the view).

·   Deleting the row <S1,Smith,20,London,P1,Nut,Red,12> will succeed, and will have the effect of deleting the row <S1,Smith,20,London> from table S and the row <P1,Nut,Red,12,London> from table P (thereby deleting four rows from the view).

 

Further examples are left as an exercise for the reader.

 

 

(The authors would like to thank Nagraj Alur, Hugh Darwen, Fabian Pascal, and Paul Winsberg for their helpful comments on earlier drafts of this article.)

 

 

Comments On Republication: Originally published in Database Programming & Design 7, No. 6 (June 1994) and published as a two-part article in RELATIONAL DATABASE WRITINGS 1991-94. It is republished here by permission of David McGoveran, Miller Freeman Inc. and Pearson Education, Inc. © All rights reserved by C.J. Date. Research has shown that certain detail level corrections might be needed, which we may undertake in the future. However, we still believe strongly that the overall approach is sound.

 

(Continued in Part 6)

 

 

Posted 01/10/03

 

 

 

[ABOUT] [QUOTES] [LINKS]