Re-write

Monday, December 5, 2016


See the rewrite

Class, Type, Relation and Domain in Database Management

9 comments:

Unknown, December 5, 2016 at 4:26 PM
Codd avoided relvars by using the term "time-varying relations" instead.

Fabian: bunkum! McG is entitled to object to relvars because programming language variables have no place in set theory. (A set does not have persistent identity.) Then equally (as he says) there can be no place for a database variable.

Then there can be no "relation" with a persistent identity that could be "time-varying". The best we could say is: look at these two database (value)s; at two different times; they both have a relation with such-and-such a predicate (RP); then we might mentally construct a persistent entity which is time-varying. But that's (convenient) mythology, not licensed by set theory.

Equally, we could describe that situation by naming a relation; and adopting a convenient mythology that the name identifies a programming language variable.

The problem with identifying a relation by RP is with schema evolution: we might have two database values with two slightly different RPs. We cannot say that is one time-varying relation. (The whole idea of RPs differing "slightly" is more mental fairy-tale.)

So I do not see why McG is so critical of relvars. (He as good as admits he's being over-precious.) We can regard a database value as a set of (name, relation) pairs, where the name fills the role of a programming variable.
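
An editorial sketch of the "(name, relation) pairs" reading above, in Python. It assumes a relation value is an immutable set of tuples and a database value is an immutable set of (name, relation) pairs; the names `lookup`, `replace`, and the sample `EMP` data are hypothetical and not taken from any commenter or DBMS. The point it illustrates is that "updating" yields a new database value while the old one is unchanged.

```python
# Assumption: a relation value is a frozenset of tuples (no attribute names).
# Assumption: a database value is a frozenset of (name, relation) pairs.

def lookup(db, name):
    """Return the relation paired with `name` in this database value."""
    matches = [rel for (n, rel) in db if n == name]
    if len(matches) != 1:
        raise KeyError(name)
    return matches[0]

def replace(db, name, new_rel):
    """Build a NEW database value in which `name` is paired with `new_rel`.

    The input `db` is not modified; sets have no persistent identity here,
    only the name recurs across the two database values.
    """
    others = frozenset((n, rel) for (n, rel) in db if n != name)
    return others | {(name, new_rel)}

# Two database values at two different times (hypothetical data).
db_t1 = frozenset({("EMP", frozenset({(1, "Ann"), (2, "Bob")}))})
db_t2 = replace(db_t1, "EMP", frozenset({(1, "Ann"), (3, "Cy")}))
```

Under these assumptions the name `EMP` plays the role of the variable, and the "time-varying relation" is just the sequence of relations paired with that name in successive database values.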

Fabian Pascal, December 5, 2016 at 7:39 PM
At this time David McGoveran offers only this reaction and defers any further discussion until his formal exposition of the RDM is published in the book he's currently working on.

I appreciate the fact that Clayden approves of my objection to relvars on the basis of set theory. On the other hand, he seems to completely ignore the problem relvars introduce into the language vis-à-vis computational completeness when he says he doesn't understand why I am so critical. BTW, I'm not sure what word was intended where he uses "precious", but it made me smile.

I also don't understand his comment regarding schema evolution, especially inasmuch as his example seems only to reinforce my position that relation predicates (RP) do accurately identify relations (which is not only consistent with set theory, BTW, but with EFC as early as 1969--see "set specification"). That said, I've always said that relational theory has not addressed so-called "schema evolution" from any theoretical basis. I've also said it needs to be done properly.

As to RPs differing only slightly being "mental fairy tale", I suggest that you can only make such judgments if you both understand how to write RPs in formal detail and define what it means for them to differ or be related, slightly or otherwise. I've defined such differences in terms of the deductive apparatus of FOPL and you can't get much more mechanical than that.

toledobythesea, December 9, 2016 at 8:12 AM
For years the fuss about 'time-varying relations' has seemed incomprehensible to me. Surely Codd meant nothing more than references to different extensions at different times, without dictating any particular implementation language.

Also, the comment that "a set does not have persistent identity" is nonsense. The situations that relational sets represent might come and go, but sets of sentences about the situations are as everlasting as anything can be. Language devices that replace the sets under consideration don't change that. And the set ops MINUS and UNION used for updating make it perfectly clear which tuples/rows 'persist'.

toledobythesea, December 9, 2016 at 8:38 AM
(I hope my comments won't disagree with anything David McGoveran has written, his explanations being usually much deeper/more fundamental than I could manage.)

It should be obvious that language devices such as relvars/tables often represent more than one relation, for example a relation that is a join will often have proper subsets that are also joins. Those subsets would have distinct predicate extensions and therefore different intensions from those of the value of a relvar or table. The most obvious exception would be in systems where every relvar reference specifies a key value, so that every referenced relation is a singleton and so has no subsets. (I'd guess anybody who met Codd would know what a stickler he was for keys.)

Beyond that, tuples of some relations such as joins can be projections of others.

When a relvar has only one predicate (supposedly) but can represent multiple relations, each with a different predicate, it looks to be very tricky, if it's possible at all, to use relvars or tables to explain relational theory, that is, to explain database behaviour, that is, the meaning of schemas. In other words, it means using an implementation to explain everything else rather than using the theory to explain an implementation. (This hasn't stopped SQL systems from forcing users to use base tables for all updates, even when the reference manuals call such updates 'view updates'! The SQL updating situation is exactly the same as it was/is in file systems: every file update must be individually specified by users before the system can be 'correct'. So SQL systems should more accurately be called advanced file systems. It also didn't stop one of the System R developers from claiming in 1981 that data independence had been achieved!) The result is not only not relational but also ignores the overall database meaning and consistency.

toledobythesea, December 9, 2016 at 9:00 AM
Regarding there being no 'license' for mythology/fairy tales to vary relations, this could be a typical coder's view or even a physical view. Set operations allow differences between relations to be expressed, therefore they allow the expression of output relations that vary from the input relations. Most language implementations discard the inputs and differences after the expressions have been evaluated. McGoveran's view updating chapter gives a concise update definition/vary definition expressed as equations using set operators on inputs and differences/'transforms' and doesn't dictate anything about inputs or differences being discarded. (Any discarding would be an implementation choice, not a definition choice.)
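
An editorial sketch of an update written as set equations, in Python. This is not McGoveran's definition, which is not reproduced in this thread; it only assumes the common form new = (old MINUS deletions) UNION insertions, with hypothetical names and sample tuples. It shows that the inputs and the differences all remain available after the expression is evaluated, and that the persisting tuples fall out of the set operations themselves.

```python
def update(old, deletions, insertions):
    """Return the output relation; none of the inputs are modified.

    Assumed form: new = (old MINUS deletions) UNION insertions.
    """
    return (old - deletions) | insertions

# Hypothetical input relation and differences, as immutable sets of tuples.
old = frozenset({(1, "Ann"), (2, "Bob")})
deletions = frozenset({(2, "Bob")})
insertions = frozenset({(3, "Cy")})

new = update(old, deletions, insertions)

# Tuples present both before and after the update 'persist'.
persisted = old & new
```

Whether `old`, `deletions`, and `insertions` are kept or discarded afterwards is a choice made by the surrounding program, not by the equation.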