Thursday, October 29, 2020

OBG: Database Design and Guaranteed Correctness Part 1

Note: To demonstrate the correctness and stability of database designs provided by a sound foundation relative to the industry's fad-driven "cookbook" practices, I am re-publishing as "Oldies But Goodies" post from the old (2000-06), so that you can judge for yourself how well my arguments hold up and whether the industry has progressed beyond the misconceptions they were intended to dispel (I may break long pieces into multiple posts, and add comments and references).


DBDebunk was maintained and kept free with the proceeds from my @AllAnalitics column. The site was discontinued in 2018. The content here is not available anywhere else, so if you deem it useful, particularly if you are a regular reader, please help upkeep it by purchasing publications, or donating. On-site seminars and consulting are available.Thank you.

07/22/20: LINKS update: Added “An Argument for Controlled Natural Languages in Mathematics”, “Let’s Make Set Theory Great Again”.
- 07/21/20 LINKS update: Added “How Gödel’s Proof Works”.

- 08/19 Logical Symmetric Access, Data Sub-language, Kinds of Relations, Database Redundancy and Consistency, paper #2 in the new UNDERSTANDING THE REAL RDM series.
- 02/18 The Key to Relational Keys: A New Understanding, a new edition of paper #4 in the PRACTICAL DATABASE FOUNDATIONS series.
- 04/17 Interpretation and Representation of Database Relations, paper #1 in the new UNDERSTANDING THE REAL RDM series.
- 10/16 THE DBDEBUNK GUIDE TO MISCONCEPTIONS ABOUT DATA FUNDAMENTALS, my latest book (reviewed by Craig Mullins, Todd Everett, Toon Koppelaars, Davide Mauri).

- To work around Blogger limitations, the labels are mostly abbreviations or acronyms of the terms listed on the FUNDAMENTALS page. For detailed instructions on how to understand and use the labels in conjunction with the that page, see the ABOUT page. The 2017 and 2016 posts, including earlier posts rewritten in 2017 were relabeled accordingly. As other older posts are rewritten, they will also be relabeled. For all other older posts use Blogger search.
- The links to my columns there no longer work. I moved only the 2017 columns to dbdebunk, within which only links to sources external to AllAnalytics may work or not.

I deleted my Facebook account. You can follow me:
- @DBDdebunk on Twitter: will link to new posts to this site, as well as To Laugh or Cry? and What's Wrong with This Picture? posts, and my exchanges on LinkedIn.
- @The PostWest blog: Evidence for Antisemitism/AntiZionism – the only universally acceptable hatred – as the (traditional) response to the existential crisis of decadence and decline of Western (including the US)
- @ThePostWest Twitter page where I comment on global #Antisemitism/#AntiZionism and the Arab-Israeli conflict.


On Thinking: The Horror! 

(originally posted March 4, 2005)

"I am a long time follower of your website and enjoy it to no end. Not that I enjoy seeing people make fools of themselves, but the reading is enjoyable. I watch a number of groups and I found a thread that just is full of garbage. I know you probably get tons of these, but you really need to read this. Its amazing the things people come up with. If I wanted to think of crazy things to say, I don’t know if I could come up with these. Below is the link to the thread. I think you would find some great quotes of the week from this one."

database design question
"Keep up the fight. I have taken on our modeling tasks at our company some time ago. I fight all kinds of crazy ideas that people think will make their life easier, but really don’t."

Fabian Pascal:
I don't enjoy stupidity and ignorance either. Debunking is a dirty job, but somebody's gotta do it. [I may debunk that thread if I ever get to it].

"You're right, I should not enjoy reading threads like this. I guess I probably should have tried to respond with a correct answer to his question. I just read stuff like this and wonder what is going on in people's heads to come up with this stuff. Do they just stop thinking completely?"

Fabian Pascal:
"It's not entirely their fault, they were never taught to think [independently and critically], never rewarded [but] actually punished for such."

"I guess. It’s just so amazing to read what they say, and when you do think about it, how crazy it sounds. Over time as I get older and continue to learn, I understand first that there is always more to learn, but before accepting something, I want to know the foundation behind the concept. That generally reveals the truth. I look forward to additional updates on your site. Some of us do listen and enjoy hearing you speak on any number of topics."

Fabian Pascal:
May have something to do with your education. Currently there is none to be had really, and none is required. In fact, it is discouraged.

Incidentally, one of the comments in the thread is by Daniel Morgan, an Oracle guy who teaches at the [evening program @the] University of Washington, with whom I had an exchange out of which I posted as a weekly quote [one of his] absurd comments. Here's [his reaction] to the question that initiated the thread you referred me to:

"I have a database for a school that has been in use now for a couple of years and it is working well. There are numerous tables obviously but consider these:
  • CONTACT - holds all contacts - students, faculty, or any other type of contact - (probably should have called it Entity);
  • FACULTY - holds info about specific faculty member - foreign key to contac;
  • STUDENT - holds info about specific student - foreign key to contact.
Considering that the contact table has fields such as: LAST, FIRST, MI, ADDR, CITY, STATE, ZIP, EMAIL ... Do you think it is a good idea to have a single table hold such info. Or, would you have had the tables FACULTY and STUDENT store the LAST, FIRST, ADDR and other fields? At what point do you de-normalize for the sake of being more practical?"

"One table name PERSON. Anything else is a denormalized nightmare."

"If the teachers are ignorant and can't think, what can we expect of their students?"

"This is very true, that is why I try to focus on people being able to solve problems. I figure if someone gets a degree from a good school and the degree is in some type of applied math or some other program like this, then basic problem solving should be there. I do push to hear more about that. I don’t care what tools they have learned I care what type of foundation they have. This is not very exact but it’s all you can do these days.

I was lucky. In my education a large focus was on mathematics so I feel that I did learn how to solve problems. Also in the CS classes I did take, one was on relational databases and we spend 80–90 percent of the time studying theory, where we used Date’s INTRODUCTION TO DATABASE SYSTEMS as our text. We spent a very small amount of time in the end looking at current tools -- at the time Ingres. Looking back that was the most valuable class I ever took. I still keep Date’s book next to me and use it very often in discussions with other people."

Fabian Pascal:
Yup, that used to be the case. No more. It's getting worse and worse and it shows.

Comments on republication

If you do not understand what's absurd about the comment, education on data fundamentals is warranted.

As is almost always the case, a set of tables is given that visualizes, supposedly, a set of database relations, without the conceptual model -- business rules that specify the entity groups with their properties/relationships -- which the relations are supposed to represent. The rules formalize as constraints on the relations necessary for faithfulness of the database to the model. Without all that it is impossible to assess correctness of design, of which full normalization is one component.

The point is that the top advantage of relational database management -- namely (1) semantic consistency and (2) logical validity -- is guaranteed, aside from a true RDBMS, by database design, the correctness of which can only be determined with reference to a well-defined and complete conceptual model. Whoever does not understand this fundamental is a practitioner, not professional.

An excellent case in point is the author of the comment, who has it exactly  upside down and backwards: a "denormalized nightmare" is what the single PERSON table visualizes (why)? The model likely involves a supertype-subtype relationship for which even the three-table design is incorrect (why?).

Continued in Part 2


Note: I will not publish or respond to anonymous comments. If you have something to say, stand behind it. Otherwise don't bother, it'll be ignored.



What Meaning Means: Business Rules, Predicates, Integrity Constraints and Database Consistency

Data and Meaning series

Relational Database Design, Logical Validity and Semantic Correctness


No comments:

Post a Comment

View My Stats