Normalizing the Data [ ALLBASE/SQL Database Administration Guide ] MPE/iX 5.5 Documentation
ALLBASE/SQL Database Administration Guide
Normalizing the Data
Normalization is a process that helps you arrange data into tables. The
goal of normalization is to reduce data redundancy and facilitate
efficient updates. In theory, a perfectly normalized database uses data
at maximum efficiency and minimum redundancy. In practice, however,
normalization does not take into consideration the specific applications
of a database. Consequently, normalization is recommended only as a
guideline for database design. Use normalization to get your data into
relational format (two-dimensional tables) and then alter the table
format to fit the specific needs of users and applications.
The process of normalization entails examining the data and the table
format of a database against each of the criteria for each of the normal
forms. Although there are at least five normal forms, most database
tables need to be processed only to the second or third normal form.
Therefore, only the first three normal forms are discussed here. When
data is normalized, each lower normal form is a prerequisite to the next
higher normal form. In other words, data must be in first normal form
before it can be put into second normal form.
First Normal Form
In first normal form, data contains rows that have the same number of
columns. Unnormalized data is not so orderly. For example, some
purchase orders contain one order item, others contain two or more.
Thus, if each order number from the purchase order in Figure 2-1 were
to occupy a single row, each row would have a different number of
columns. By separating the data into three groups as in Figure 2-3 ,
repeated categories are removed, leaving each row with the same number of
columns.
Figure 2-3. First Normal Form: Removing Repeating Groups
Second Normal Form
In second normal form, shown in Figure 2-4 , all attributes are
functionally dependent on the primary key.
Figure 2-4. Second Normal Form: Establishing Dependency
Attributes are functionally dependent when their values depend directly
upon the value of the primary key. The primary key in the Order Item
information is the order and item number combination. Each such
combination represents a unique row.
Since the vendor part number and the order quantity relate to a
particular order and item number, they are functionally dependent on the
primary key. However, the unit price depends on the order quantity
(which, in turn, depends on the order and item number), but it does not
depend directly on order or item number. This is called a transitive
dependency.
Third Normal Form
In third normal form, data does not have transitive dependencies. In
Figure 2-5 , unit price is removed from the Order Items information
and placed in a new set of information called Supply Price, which is used
to calculate the price of an item dependent on the quantity ordered and
date delivered.
Figure 2-5. Third Normal Form: Removing Transitive Dependencies
Third normal form relieves update problems. For example, in second
normal form, when an item's price changes, all orders containing that
item would have to be altered. In third normal form, only the rows for
that item in the Supply Price table need to be changed.
Normalized data provides a starting point for table definition. Note
that the data in Figure 2-3 , Figure 2-4 , and Figure 2-5 is in
tabular form. The next step in table design is to take the normalized
data and fit it into ALLBASE/SQL columns, tables, views, and indexes.
MPE/iX 5.5 Documentation