"DON'T BOTHER ME WITH ALL THAT THEORETICAL STUFF!"
with Fabian Pascal

 

 

 

From: BC
To: Editor

 

I just have to tell someone who understands. I just had the following conversation with the data architect of the OLTP system from which I get data for our data warehouse:

 

Self: Joe, I'm looking at the schema and I'm trying to determine what column we use for the number of individual sellable units that are in a container. There is no "container" table. How is this known?

 

Joe: Oh, you can use any. Where are you accessing the data from?

 

Self: If you mean what am I joining into the query, it shouldn't matter. What defines this fact?

 

Joe: What are you trying to do?

 

Self: Stop myself from killing you... (okay, I only thought this line...) Inevitably, I found out that if push comes to shove we would indeed trust a column in one table over the value in any of the myriad of other tables. I go back and select the distinct values from the column.

 

Self: Joe. What does a zero mean in the Dsf3345 column? (Nice names, eh? But relax, I've been assured that: "That's okay. We have all the definitions in our application tool")

 

Joe: It means there is only one sellable item in the container.

 

Self: So, what does the value one mean?

 

Joe: It means there is one sellable item in the container.

 

Self: Do you realize people write reports against this data? Some of them actually try to do math with it.

 

Joe: Why are you making such a big deal about this?

 

Self: We could of course not allow zero values in the column. That way we don't have to transform zeros into ones in each report. One constraint would do the trick.

 

Joe: We put all our integrity constraints in the application.

 

Self: Could you check all your code for all the places you update or insert into this column and ensure we never allow zeroes?

 

Joe: We're too busy for that, think of all the tables we would have to change. Just change your reports. Besides, we have to allow zeroes.

 

Self: Why's that?

 

Joe: Well, what else are they going to use for "not applicable"?

 

Our data warehouse is far more normalized than the OLTP system. (And for some strange reason it performs better.) BTW, Joe (not his real name) is an author of a textbook on OO Design Measurement.
 
 

Ed. Comment: None. See my reply to the question on "read-only" warehouses in my March Against the Grain column.

 

Posted: 03/22/02

 

 

 

[ABOUT] [QUOTES] [LINKS]