Third normal form
Third normal form is a level of database normalization defined by English computer scientist Edgar F. Codd. A relation is in third normal form if it is in second normal form and also lacks non-key dependencies, meaning that no non-prime attribute is functionally dependent on any other non-prime attribute. In other words, each non-prime attribute must depend solely and non-transitively on each candidate key. William Kent summarised 3NF with the dictum that "a non-key field must provide a fact about the key, the whole key, and nothing but the key".
An example of a violation of 3NF would be a Patient relation with the attributes PatientID, DoctorID and DoctorName, in which DoctorName would depend first and foremost on DoctorID and only transitively on the key, PatientID. Such a design would cause a doctor's name to be redundantly duplicated across each of their patients. A database compliant with 3NF would store doctors' names in a separate Doctor relation which Patient could reference via a foreign key.
3NF was defined, along with 2NF, in Codd's paper "Further Normalization of the Data Base Relational Model" in 1971, which came after 1NF's definition in "A Relational Model of Data for Large Shared Data Banks" in 1970. 3NF was itself followed by the definition of Boyce–Codd normal form in 1974, which seeks to prevent anomalies possible in relations with several overlapping composite keys.
Definition of third normal form
Codd's definition states that a relation R is in 3NF if and only if it is in second normal form and every non-prime attribute of R is non-transitively dependent on each candidate key. A non-prime attribute of R is an attribute that does not belong to any candidate key of R.Codd defines a transitive dependency of an attribute set Z on an attribute set X as a functional dependency chain X → Y → Z that must be satisfied for some attribute set Y, where it is not the case that Y → X, and all three sets must be disjoint.
A 3NF definition that is equivalent to Codd's, but expressed differently, was given by Carlo Zaniolo in 1982. This definition states that a table is in 3NF if and only if for each of its functional dependencies X → Y, at least one of the following conditions holds:X contains Y,X is a superkey,
- every element of Y \ X, the set difference between Y and X, is a prime attribute.
The definition offered by Zaniolo can be shown to be equivalent to the Codd definition in the following way: let X → A be a nontrivial functional dependency and let A be a non-prime attribute. Also let Y be a candidate key of R. Then Y → X. Further since A is a non-prime attribute, therefore A cannot determine X because in that case AY would form the super key. Therefore, A is not transitively dependent on Y if and only if there is a functional dependency X → Y, i.e., if and only if X is a superkey of R. It is to be noted that either or each of A, X and Y can be single attributes or a combination thereof but are necessarily disjoint. One can write X → Y equivalently as X → XY and one may thus observe the Zaniolo equivalence for Codd by performing the set difference between the dependent and the determinant.
Example
Design which violates 3NF
The following relation, with the composite key, fails to meet the requirements of 3NF. The non-prime attributes WinnerName and WinnerBirthdate are only transitively dependent on the composite key via their dependence on the non-prime attribute WinnerID. This creates redundancy and the potential for inconsistency in the case that a winner of multiple tournaments is accidentally given different dates of birth in different tuples.| Name | Year | WinnerID | WinnerName | WinnerBirthdate |
| Indiana Invitational | 1998 | 1 | Al Fredrickson | 1975-07-21 |
| Cleveland Open | 1999 | 2 | Bob Albertson | 1968-09-28 |
| Des Moines Masters | 1999 | 1 | Al Fredrickson | 1975-07-21 |
| Indiana Invitational | 1999 | 3 | Chip Masterson | 1977-03-14 |
Design which complies with 3NF
To bring the relation into compliance with 3NF, WinnerID, WinnerName and WinnerBirthdate can be transferred to a separate table.| Name | Year | WinnerID |
| Indiana Invitational | 1998 | 1 |
| Cleveland Open | 1999 | 2 |
| Des Moines Masters | 1999 | 1 |
| Indiana Invitational | 1999 | 3 |
| WinnerID | Name | Birthdate |
| 1 | Al Fredrickson | 1975-07-21 |
| 2 | Bob Albertson | 1968-09-28 |
| 3 | Chip Masterson | 1977-03-14 |
Tournament's WinnerID attribute now acts as a foreign key referencing the primary key of Winner. Unlike before, it is not possible for a winner to be associated with multiple dates of birth.
"Nothing but the key"
A paraphrase of Codd's definition of 3NF parodying the traditional oath to tell the truth in a court of law was given by William Kent: "a non-key field must provide a fact about the key, the whole key, and nothing but the key". Requiring that non-key attributes be dependent on "the whole key" ensures compliance with 2NF, and further requiring their dependency on "nothing but the key" ensures compliance with 3NF. A common variation supplements the paraphrase with the addendum "so help me Codd".While the phrase is a useful mnemonic, the mention of only a single key makes fulfilling it necessary but not sufficient to satisfy 2NF and 3NF, both of which are concerned with all candidate keys of a relation and not just any one.
Christopher J. Date notes that, adapted to refer to all fields rather than just non-key fields, the summary can also encompass the slightly stronger Boyce–Codd normal form, in which prime attributes must not be functionally dependent at all. Prime attributes are considered to provide a fact about the key in the sense of providing part or all of the key itself.
Computation
A relation can always be decomposed in third normal form, that is, the relation R is rewritten to projections R1,..., Rn whose join is equal to the original relation. Further, this decomposition does not lose any functional dependency, in the sense that every functional dependency on R can be derived from the functional dependencies that hold on the projections R1,..., Rn. What is more, such a decomposition can be computed in polynomial time.To decompose a relation into 3NF from 2NF, break the table into the canonical cover functional dependencies, then create a relation for every candidate key of the original relation which was not already a subset of a relation in the decomposition.