"I just (almost) finished reading AN INTRODUCTION TO DATABASE SYSTEMS, 8th Ed. and I have a bit of a problem with view updatability. I'll try and explain with an example and I would be pleased if I could get some clarifications.
Let R1 be a base relation (NAME,QUALITY), with the semantics that every occurring tuple indicates that the person with the name in question has the quality in question. That is, the entity rule in the corresponding relation predicate (RP) is something like "Person identified by name (NAME) has quality (QUALITY)."
Let R2 be a similar base relation (NAME, FUNCTION), with the corresponding rule "Person identified by name (NAME) has function (FUNCTION)."Tuples in R1 could indicate that "Chris Date has smartness.", or "George Bush has linguistic difficulties." Tuples in R2 could indicate that "Chris Date has the function database lecturer", or "George Bush has the function of President of the United States.".
These two relations can be joined, yielding a view whose rule would be something along the lines of "Person identified by name (NAME) has quality (QUALITY) and has function (FUNCTION)." One tuple's proposition in this view might be that "Chris Date has smartness and the function of database lecturer."
Now suppose that Chris Date is tired of being misunderstood, and decides to retire. Implying that Chris Date no longer has a function whatsoever. A user might now observe that the proposition contained in his database (view), is no longer true. He might want to correct this and (if the update is done through this join view) he can obviously only do so by deleting the join tuple that is now false. Now, if I understand your proposed join-delete algorithm correctly, then the system should react by removing both the assertion regarding Chris Date's function *as well as* the assertion regarding Chris Date's quality? Frankly, I can hardly believe this is for real ... Stop giving lectures does not imply stop being smart and vice-versa.
More generally: deleting a tuple from a join (or intersect or difference for that matter) view, means in fact that the user is saying that "it is not true that both P(A) and P(B) are true."--where P(A) and P(B) are both propositions related to the tuples of the base relations that are joined in the view. Nothing more and nothing less. To deduce from this fact that "both P(A) and P(B) are false" (which is indeed what we mean when we delete "the A part from A and the B part from B"), simply looks like "false reasoning" to me (well, at least in my school days it was labeled that way). I therefore have come to believe that "view updatability" should be allowed to happen IF AND ONLY IF the system has PRECISELY ONE WAY to "resolve" the user's request. You state that this is "highly desirable", but I believe that's not good enough. I believe it should be a sine qua non."
C. J. Date
First of all, let me say that I no longer regard view updating as a fully solved problem. A year ago or so I thought it was--but then Hugh Darwen started to ask some hard questions and I realized I was wrong. (David McGoveran will probably disagree with me here.) That said, I remain optimistic that the problem is solvable.
That said, I remain optimistic that the problem is solvable. The discussion in my 8th Ed. is generally along the right lines, though it gets some of the details wrong. One important piece of the puzzle that wasn't mentioned in the 8th edition is what I and Darwen introduced as The Assignment Principle (Ed. Note: For CJD's current position see).
Now let me address some of the specific points you raise. The first has to do with updating joins. For reasons that should be obvious I don't want to use your specific example here; let me switch to the well-known suppliers-and-parts example. Let SSP be the join of relvars S and SP over S# (a one-to-many join; your example was many-to-many, and I'll get to that case in a moment). Consider an attempt to delete just the tuple for supplier S1 and part P1 from SSP. The rules say: Delete the tuple for S1 from relvar S and delete the tuple for S1 and P1 from relvar SP. So what happens? There are three possibilities:
- If there are no other shipments for supplier S1, the overall operation succeeds.
- If there are some other shipments for supplier S1 and no other updates are done (i.e., there are no side effects), the overall operation fails on a referential integrity violation.
- If there are some other shipments for supplier S1 and deleting supplier S1 from relvar S "cascades" to delete those shipments from relvar SP, the overall operation fails on a violation of The Assignment Principle.
DELETE ssp WHERE s# = s#('s1') AND p# = p#('p1');is inherently unsafe, since we presumably don't know, in general, whether there are any other shipments for supplier S1. The following operation, by contrast, is safe:
DELETE ssp WHERE s# = s#('s1');(it will delete supplier S1 from S and all shipments for supplier S1 from SP).
Now, you say, correctly (though I paraphrase), that to infer (NOT PA) AND (NOT PB) from NOT (PA AND PB) is "false reasoning." I agree. But I never claimed--at least, I don't think I ever claimed!--that the rule for deleting from a join was based purely on logical inference. However, I do claim that (a) it's a simple rule, (b) it's a reasonable rule, (c) it gives useful and expected results in the majority of cases, and (d) it gives predictable results in all cases. I also claim that the rule is both (e) symmetric in itself and (f) symmetric with respect to the corresponding insert rule. Finally, I claim (g) it's desirable and indeed advantageous to have a rule that doesn't depend on whether the join is one-to-one, one-to-many, or many-to-many. (Which takes care of the many-to-many case as promised.)
Without writing a paper, I do want to make a few comments in response to the objections to view updatability, at least insofar as it addresses the approach proposed by Chris Date and myself.
Let me note up front that Chris and I may not see eye-to-eye regarding view updatability if his published expositions are indicative. I suspect this divergence is largely a matter of some details of the approach which Chris more or less downplays, and which then permit many confusions (and objections) to arise whether by Hugh Darwen, the recent correspondent, or others. Unfortunately I have not been in a position to pursue clarification with Chris, so please do not take this any kind of indictment.
View updatability is of crucial importance to the relational model. If there is no solution, then there is likewise no possibility of data independence (both physical and logical!). That would be tantamount to disclaiming the relational model itself, insofar as data independence was the stated initial motivation of Dr. Codd's relational model research program.
I believe that a solution to the view updatability "problem" is not only possible, but that it is a solved problem. I do not believe that the problem has been formulated well in the literature, nor do I believe that the solution has been well and completely explained. (I'm committed to getting this done before my "function" is defunct, hence much focused work on the forthcoming book. Furthermore, I believe that the consequences of understanding and applying the solution are wide-reaching, so much so that we might have to acknowledge that we never truly understood the relational model heretofore.
A complete explanation of the solution will be offered in my forthcoming book . Here I will have to be satisfied with providing a few principles, recommended conceptualizations, or guideposts. Throughout, I've placed quotes around certain words or phrases in lower case, because they could use some further explanation, might be over-interpreted, or otherwise are imprecise. I apologize in advance for not being able to do better by this important subject here. A detour at this time into a partial explanation of view updating would mean a significant delay in writing the more complete explanation in the book, a delay I am not willing to entertain at this juncture.
- The "Date-McGoveran view update approach" (or, in more concrete form, the algorithm) provides a set of rules for applying updates to any relation.
- These rules are not to be applied in isolation. They are, in a sense, context dependent, where the context is supplied by (1) constraints accessible to the DBMS and (2) the current user's intent as expressible through (a certain strictly constructive and strictly finite) first order predicate logic. The simple expression of the rules only tell us what to do in the absence of further contextual information.
- A DBMS that supports the relational model cannot be "magic". It cannot compensate for ambiguity, intuit withheld information or assumptions, or correct expressions that violate the designer's or user's intent. If the designer fails to capture information in database design, if information is hidden from the user, or if the users incompetently fail to express their intent [Ed. Note: Or the data language is not sufficiently expressive to allow it], it will certainly produce seemingly anomalous or "surprising" results when faced with these problems.
- No logical independence mechanism can be permitted to hide relation semantics from users (the contrary would be absurd), although it seems that many writers have assumed that to be part of its intent. The derived semantics of a view is inherently distinct from that of a base relation. For example, the relation predicate for a join view necessarily involves multiple irreducible predicates, whereas that for a base relation should not [Ed. Note: Which is another way of saying the design of base relations must adhere to the POOD).
- Logical independence can hide base data and base data representation. For example, a projection view can hide the specific data found in the excluded attributes even though the predicate clearly tells us that those attributes, and data in them, must exist. Additionally, the view might not then contain any evidence of dependencies among those attributes and attributes in other relations.
- For logical independence to be possible, the semantics of relational operators must be consistent, whether the operands are derived or base relations. Although we have traditionally thought of this as requiring that we find some algorithm for view updating, this is a poor method of attack, which seems obvious when we consider that derived relations necessarily have the more complex semantics. Thus, we must solve the general case of update semantics for which updating base relations is the special case (and not the reverse).
- All relational updates should be understood as set transformations: Therefore, they are atomic, and the final "state" is not dependent on the order of any intermediate steps. Just as is the case when evaluating a logical expression (e.g., a tautology, or a relation predicate), there are no implementation dependencies (as, for example, in treating the left vs. right "MINUS" first)--only implementation errors. An optimizer incorporating such implementation dependencies is faulty.
- We must avoid the temptation to think of a relational update as anything other than a declarative specification of a resulting relation (a set) incorporating the relation predicate of source relation and its current value. Nothing else is pertinent to computing the value that is to be "assigned" to the relation. If a reference to one or more tuples occurs within the update expression, we should not think even of these as anything other than a particular way of expressing a constraint (i.e., as part of an "update predicate").
- Correct database design is crucial for understanding view updatability (Chris Date will possibly demur on this point) and goes well beyond full normalization or even our Principle of (semantically) Orthogonal Design. [Ed. Note: There are three design principles, see ]. True, the results will always be predictable and deterministic, but are likely to produce confusing side effects from time-to-time.
I disagree with Chris' apparent conclusion that view updating need not be based solely on logical inference; I insist that it can and should be. Many of the problems that have been raised regarding our proposal have to do with how predicates are to be interpreted, including the predicates that the user conveys to the DBMS in terms of insert, update, and delete. Put another way, our "updating language" is not sufficiently expressive to convey the user's intent accurately. As discussed in (although the reader is cautioned that I did not write or edit the proposal that is attributed to me and so it is not completely accurate), this deficiency can and should be remedied.
With these principles in mind, reconsider the "problem" with the example of delete through a join view. The "betrayal" of the user is an illusion and ceases to be a problem. In fact, it is a problem which we accept all the time for base relations: We can easily construct a database of base relations for which there are "side effects" (remember cascade delete?), or which prevent the user from obtaining update results that behave as if the target relation exists in isolation.
The "problem" of finding a unique inverse transformation for a join view in a given context without specifying any context goes away. Clearly, the delete must satisfy 'NOT PA OR NOT PB'. (Aside: We must be very careful what we mean by NOT, avoiding confusion with simple complement. Relations have a relative complement--tuples not asserted to be 'true'--with a scope that is distinct from that of the relation predicate--tuples that cannot belong to the relation [Ed. Note: Representing false instantiations of the predicate]). This requirement may be further constrained in at least one of two ways.
- If they know, users should specify their intent as to why the conjunction is 'false'.
- Some of the integrity constraints and dependencies the designer imposes may further constrain the general requirement.
 Pascal, F., Denormalization for Performance: Don't Blame the Relational Model
 Pascal, F., Relation Predicates and Identical Relations
 Pascal, F., The Principle of Orthogonal Database Design Part I
 Pascal, F., Principle of Orthogonal Database Design Part II
 Pascal, F., The Principle of Orthogonal Database Design Part III
 Comments on "Database Design Relation Predicates and Identical Relations"
 Comments on "Denormalization for Performance: Don't Blame the Relational Model"
 Date, C. J., VIEW UPDATING AND RELATIONAL THEORY (O'Reilly Media, 2013)
 McGoveran, D., LOGIC FOR SERIOUS DATABASE FOLKS, forthcoming.
Do you like this post? Please link back to this article by copying one of the codes below.URL: HTML link code: BB (forum) link code: