From: PK
To: Editor
Date:19 Jul 2004
According to literature Relational Algebra has five primitive
operators: SELECT, PROJECT, UNION, MINUS, and TIMES. They are called primitive
since other relational operators can be defined "in terms of" these primitive operators, whilst they themselves cannot.
I was wondering what "in terms of" in this context
exactly means. Take for example the operator INTERSECT which may be defined as:
A INTERSECT B = A MINUS ( ( A MINUS B ) UNION ( B MINUS A ) )
INTERSECT is defined in relational algebra, using only the
primitive operators MINUS and UNION. I agree to call this a definition "in
terms of" the primitive operators.
JOIN is not a primitive relational operator. It must
therefore be possible to express it "in terms of" primitive
relational operators. According to literature JOIN is calculated by a TIMES,
then a SELECT and then a PROJECT operation. If you try to express JOIN in relational algebra,you see that is not possible. The relational expression for a JOIN contains
many unfilled question marks:
A JOIN B = ( ( A TIMES B ) WHERE ? = ? ) { ? }
So on the logical level, the JOIN operator can not be
primitive. On the implementation level it's of course quite a different story.
There you can calculate the common attributes, pass them to the SELECT
operators an so forth. But I think that's not the issue.
I would kindly hear about your opinion.
C. J. Date Responds: As you say, an operator is
primitive if it can't be defined in terms of others. But which operators we choose as primitive in a given context is
often arbitrary; for example, we can define TIMES in terms of JOIN or we can
define JOIN in terms of TIMES. More
generally, let S be a set of operators.
Then we can say that the set P is a set of primitive operators
for S if and only if:
·
P is a subset (not necessarily a proper subset)
of S.
·
Every operator in S can be defined in terms of
operators in P (only).
For example, let S be the set of operators {+,-},
where the operands are (let us assume) signed integers; then we could take
either of the sets {+} and {-} as corresponding P's.
Turning now to the relational algebra specifically: It's true that much of the literature
identifies the set consisting of restrict,*
project, union, difference, and product as a primitive set, but that primitive
set is not unique. In fact, in our book
FOUNDATION FOR FUTURE
DATABASE SYSTEMS: THE THIRD MANIFESTO, 2nd Ed., Hugh Darwen and I show
that every relational algebra operator can be expressed in terms of just two
primitive operators. (And "every
relational algebra operator" here includes several operators that aren't
even discussed in much of the literature, including GROUP, UNGROUP, PACK,
UNPACK, and many others.)
----------
* I GREATLY prefer restrict
over select. Note in particular
that SELECT in SQL is closer to project than it is to restrict.
----------
I hope the foregoing is clear. I'll close with a few comments on specific points in your
original message:
·
Your definition of INTERSECT is correct but
unnecessarily complex (since A and B MINUS A are disjoint). In fact, we have:
A INTERSECT B = A MINUS ( A MINUS B )
= B MINUS (
B MINUS A )
·
Of course, INTERSECT is just a special case of JOIN
anyway:
A INTERSECT B = A JOIN B
(so long as A and B are of the
same type, which they must be for INTERSECT).
·
Your "definition" of JOIN--
A JOIN B = ( ( A TIMES B ) WHERE ? = ? ) { ? }
--if accepted would appear to show that JOIN is a
primitive operator, contrary to what you state (because if I understand you
aright you're claiming it can't be fully defined in terms of other algebraic
operators). Note in particular that the
same argument applies to both restrict and project (both of which are usually
regarded as primitive); in both cases, we need additional operands to the
operation, over and above those operands that are relations as such. Note too that, even with UNION, INTERSECT,
and MINUS, we have to impose the rule that the operands must be of the same
type, so even here we're not talking about the "pure" set-theoretic
operators of the same name.
Posted
10/22/04