Normalization

Database normalization is the process of organizing data in a database to reduce redundancy and improve data integrity. The primary levels of normalization are known as Normal Forms (NFs). Here are the main Normal Forms along with their definitions and examples:

1. First Normal Form (1NF)

Definition: A table is in 1NF if:

  • It only contains atomic (indivisible) values.
  • Each column contains values of a single type.
  • Each column has a unique name.
  • The order in which data is stored does not matter.

Example: Consider a table of students with subjects they are enrolled in.

Not in 1NF:

StudentIDStudentNameSubjects
1AliceMath, English
2BobScience, History

In 1NF:

StudentIDStudentNameSubject
1AliceMath
1AliceEnglish
2BobScience
2BobHistory

2. Second Normal Form (2NF)

Definition: A table is in 2NF if:

  • It is in 1NF.
  • All non-key attributes are fully functional dependent on the primary key (no partial dependency).

Example: Consider a table of student enrollments with a composite primary key.

Not in 2NF:

StudentIDCourseIDStudentNameCourseName
1101AliceMath
1102AliceEnglish
2101BobMath
2103BobHistory

In 2NF:

Separate into two tables: Students

StudentIDStudentName
1Alice
2Bob

Courses

CourseIDCourseName
101Math
102English
103History

Enrollments

StudentIDCourseID
1101
1102
2101
2103

3. Third Normal Form (3NF)

Definition: A table is in 3NF if:

  • It is in 2NF.
  • All the attributes are functionally dependent only on the primary key (no transitive dependency).

Example: Consider a table with student information and department details.

Not in 3NF:

StudentIDStudentNameDeptIDDeptName
1Alice10Computer Sci
2Bob20Physics

In 3NF:

Separate into two tables: Students

StudentIDStudentNameDeptID
1Alice10
2Bob20

Departments

DeptIDDeptName
10Computer Sci
20Physics

4. Boyce-Codd Normal Form (BCNF)

Definition: A table is in BCNF if:

  • It is in 3NF.
  • For every non-trivial functional dependency X Y, X is a super key.

Example: Consider a table where an instructor can only teach one subject.

Not in BCNF:

InstructorIDSubjectIDInstructorNameSubjectName
1101JohnMath
2102AliceEnglish
1103JohnHistory

In BCNF:

Separate into two tables: Instructors

InstructorIDInstructorName
1John
2Alice

Subjects

SubjectIDSubjectNameInstructorID
101Math1
102English2
103History1

5. Fourth Normal Form (4NF)

Definition: A table is in 4NF if:

  • It is in BCNF.
  • It has no multi-valued dependencies.

Example: Consider a table where a student can have multiple hobbies and multiple phone numbers.

Not in 4NF:

StudentIDHobbyPhoneNumber
1Painting1234567890
1Singing1234567890
1Painting0987654321

In 4NF:

Separate into two tables: StudentHobbies

StudentIDHobby
1Painting
1Singing

StudentPhones

StudentIDPhoneNumber
11234567890
10987654321

6. Fifth Normal Form (5NF)

Definition: A table is in 5NF if:

  • It is in 4NF.
  • It cannot be decomposed into smaller tables without losing data (no join dependency).

Example: Consider a table where projects have multiple tasks and each task can be handled by multiple teams.

Not in 5NF:

ProjectIDTaskIDTeamID
1T1A
1T1B
1T2A

In 5NF:

Separate into three tables: ProjectsTasks

ProjectIDTaskID
1T1
1T2

TasksTeams

TaskIDTeamID
T1A
T1B
T2A

ProjectsTeams

ProjectIDTeamID
1A
1B