From: Jonathan Leffler
Date: 28 Oct 2005
I've been documenting an overwhelming number of cases where
people approach database work without a clue about data management, who should
not be in the database business. Consequently, they post mindless requests for
help online, which Chris Date categorizes as "I don't know how to do my
job and am looking for somebody to do it for me. The following example was
relayed to me by Jonathan Leffler, to whom I responded:
Any particular reason you replied? Do you really believe
he's gonna want to bother educating himself? Wanna bet?
I deem replying to such requests with anything other than
"Go educate yourself on database fundamentals" not just useless, but
actually guaranteeing that practitioners will continue to work in a state of
ignorance.
andris_sh@yahoo.com: Here's my situation.I have about
10 tables, which contain order numbers. Those numbers are not unique inside
tables. My task is to populate and maintain separate table, let's call it
<commontable>, which would contain UNIQUE order numbers from all tables.
Here I've stopped, and I cannot get further. To get unique numbers from all
tables just once is very simple, I write something like this:
INSERT INTO commontable (ordnum)
SELECT order_number
FROM table1
UNION
SELECT order_number
FROM table2
UNION
...etc.
All works nice. But my task is also to make incremental
inserts in <COMMONTABLE>, and here problems begin. My best try was
something like this:
INSERT INTO commontable (ordnum)
SELECT order_number
FROM table1
WHERE NOT EXISTS
(SELECT ordnum
FROM commontable,table1
WHERE ordnum=order_number);
Does not work. Subquery works only once and compares only
first row of <COMMONTABLE>. After that query just tries to insert all
ORDER_NUMBER values from TABLE1, violates uniqueness, and that's all. I believe
it is quite common task, but I haven't been able to find any example or idea,
how to proceed with this.
Jonathan Leffler: I'm afraid the design of your
database is back to front. You should
ensure the uniqueness of things that need to be unique up front (which probably
means maintaining COMMONTABLE first and then having the other tables refer to
it). What constitutes the primary key
of each of the 10 tables? Why didn't
you use a consistent attribute name for the order number?
How often does a single order number appear
in several different tables - TABLE2 and TABLE7, say?
Or is the duplication limited to repetitions of the value in
TABLE1, but any number that appears in TABLE1 will never appear in TABLE2 ..
TABLE10? Are you sure about that? Why are the 10 tables separate?
What's common about them, and what's different about them?
If COMMONTABLE contains just the one attribute (column), what
are you using it for really?
From the limited information we have, it sounds as if you
should have a single table representing generic orders in some shape or form,
and then have 10 different order-type sub-tables - each of which could have a
simple referential constraint to the central order table (you're calling it
COMMONTABLE). When an order is added,
you create an entry in the central table, and also populate the relevant other
table or tables with the other specialized information.
There are still design issues (disjointness,
duplication in the sub-tables), but at least you'd have a cohesive starting
point.
I guess you're going to resist a redesign of the database.
Can you afford to use a UNION view?
How often are you going to refer to COMMONTABLE (which would,
perhaps, be better named COMMONVIEW if you adopt this idea)?
Have you considered INSERT triggers on the ten tables, each
of which does the relevant insert into COMMONTABLE if the order number is not
already listed there?
The INSERT statement should read:
INSERT INTO CommonTable(OrdNum)
SELECT Order_Number
FROM Table1
WHERE Order_Number
NOT IN (SELECT OrdNum FROM CommonTable);
This says "insert order numbers from TABLE1 into common
table where the order number does not already appear in COMMONTABLE".
The only residual issue is can you select
from COMMONTABLE as well as insert into it.
If not, select into a temp table and then insert from the temp table
into COMMONTABLE.
Given the likely realities of your situation, I think a one
time load followed by insert triggers is likely your best solution.
However, you should also develop a check
script to ensure that you haven't accidentally deleted something in common
table that is actually still in one of the other tables, or somehow evaded the
insert trigger (unlikely).
What happens when a row is updated in TABLE1 and the order
number changes? Can that happen? Do your permissions prevent it happening?
What happens when a row is deleted from TABLE1 and it was the last row that
referenced a particular value in COMMONTABLE? Do
you have to check whether the value still exists in any of the other
tables? You need to consider UPDATE and
DELETE triggers, therefore, on TABLE1 .. TABLE10.
What triggers do you need on COMMONTABLE?
What could be the possible referential
constraint on that? I don't think
there's a feasible referential constraint that can be created declaratively -
it would have to work on the UNION of the other tables, and the performance
doesn't bear thinking about. So, you
probably end up coding some sort of check manually - and it won't be fast.
That is, I'm afraid, the penalty for the
faulty database design.
Note: It takes
monumental ignorance and a certain amount of stupidity to believe, like the Monashes of the world do,
that the crisis in data management is “complexity of relational schema”. To the
extent that there is such complexity that is not inherent in the real world
which the schemas represent, it is produced by the utter ignorance of data
fundamentals within which such schemas are designed, and the prohibitive, but
completely unnecessary, design and maintenance work that they induce.
And if anybody still believes that any attempt to educate the
ignorami who insist on remaining so via online or email responses, well, good
luck to them. “Go and educate yourself” is not said because I believe they will
do so, but because anything more is a lost cause.
Posted 1/13/06
© Fabian Pascal 2006 All Rights Reserved