Normalization, Redundancy, and Data Integrity in Databases

Normalization

Normalization is the process of organizing data in a database to reduce redundancy and improve data integrity.

Redundancy

Redundant data refers to duplicate data that is stored in multiple places within a database. For instance, if you have two tables:

Users
Orders

and the following structure:

Users Table

UserID	Name	Username
1	Alice Smith	asmith
2	Bob Johnson	bjohnson

Orders Table (with redundancy)

OrderID	UserID	Name	Product	Quantity
101	1	Alice Smith	Laptop	1
102	2	Bob Johnson	Smartphone	2

In this example, the Orders table includes the user's name, even though the same information is already present in the Users table. This is redundant data.

Despite the redundancy, this schema can still function correctly. Given a user ID, we can retrieve all orders associated with that user. Similarly, we can find the user's details (username, name) using an order ID.

Non-Full Proof Data

Users Table

UserID	Name	Username
1	Alice Smith	asmith
2	Bob Johnson	bjohnson