Tables, Full Normalization and Business Rules

Follow @DBDebunk Follow @ThePostWest

REVISED: 10/16/16

Often somebody produces a table and asks if it is fully normalized (in 5NF) and, if not, in what normal form it is. This is an indication of poor grasp of data fundamentals.

Consider ASSIGNMENTS:

EMP# ENAME    PROJECT       DEPT#
===================================
100 Spenser Sys Support   E21
100 Spenser Comp Svcs     E21
100 Spenser Supp Svcs     E21
160 Pianka   Info Center   D11
310 Setright Documentation D11
310 Setright Mfg Systems   D11
150 Adamson Info Center   D11
-----------------------------------

First, a normal form is a property of a relation, not a table (a R-table is only a "visual shorthand" for a relation -- a special kind of table that visualizes a relation on some physical medium (e.g., paper) -- and the two should not be confused.

Second, the normal form of a relation is determined from attribute dependencies. Formally, a relation is fully normalized (in 5NF) if and only if the only dependencies that hold in it are functional dependencies (FD) of the non-key attributes on the key (i.e., there is exactly one value of each non-key attribute for every key value, but not vice-versa). Since a key represents an entity identifier, this condition exists only when, informally, a relation represents entities of a single type (why?) Is this true for the relation pictured by ASSIGNMENTS?

The fact is that whether a relation represents a single type of entity -- and, therefore, is fully normalized -- cannot be ascertained from sheer visual inspection of the table picturing it. It requires knowledge of what the underlying relation means, namely the type(s) of entity specified by the business rules in the corresponding conceptual model that the relation represents.

For example, if the rules:

R1: Every employee is identified by an employee number.
R2: Every employee has an employee name.
R3: Every employee works in a department.
R4: Every project assignment is identified by an employee number and a project name.

model two types of entity:

Employees: {emp. number} --> {employee name, department number}
Project assignments: {employee number, project}

then the relation represents both and, consequently, is not in 5NF. In fact, without a well defined and complete conceptual specification of entity types, you can't even tell whether relations have keys (if they do not, they are not relation) and, if they do, what the key is.

Database design adhering to the Principle of Full Normalization (POFN) do not "bundle" entity types and produce 5NF relations, obviating the need for further normalization. For the advantages of full normalization and the drawbacks of "denormalization for performance" illusion, see paper #2 and the recently published DBDEBUNK GUIDE TO MISCONCEPTIONS ABOUT DATA FUNDAMENTALS.

Explicit further normalization is necessary only to "repair" poorly designed non-5NF relations by replacing them 5NF projections. For example, ASSIGNMENTS with EMPLOYEES and PROJ_ASSIGNS pictured by R-tables:

EMP# ENAME    DEPT#
====================
100 Spenser E21
160 Pianka   D11
310 Setright D11
150 Adamson D11
--------------------

EMP# PROJECT
=================
100 Sys Support
100 Comp Svcs
100 Supp Svcs
160 Info Center
310 Mfg Systems
310 Comp Svcs
150 Info Center
------------------

each representing a single entity type. The repair is possible because the following holds:

ASSIGNMENTS{EMP#,ENAME, DEPT#} JOIN ASSIGNMENTS{EMP#,PROJECT} = PROJ_ASSIGNMENTS

where the left-hand side is a join of two projections of ASSIGNMENTS (i.e., no information is lost).

POSTS

Sunday, April 28, 2013

Tables, Full Normalization and Business Rules

No comments:

Post a Comment