Normalization is the process of organizing data in a database to reduce redundancy and improve data integrity.
Redundant data refers to duplicate data that is stored in multiple places within a database. For instance, if you have two tables:
and the following structure:
Users Table
UserID | Name | Username |
---|---|---|
1 | Alice Smith | asmith |
2 | Bob Johnson | bjohnson |
Orders Table (with redundancy)
OrderID | UserID | Name | Product | Quantity |
---|---|---|---|---|
101 | 1 | Alice Smith | Laptop | 1 |
102 | 2 | Bob Johnson | Smartphone | 2 |
In this example, the Orders
table includes the user's name, even though the same information is already present in the Users
table. This is redundant data.
Despite the redundancy, this schema can still function correctly. Given a user ID, we can retrieve all orders associated with that user. Similarly, we can find the user's details (username, name) using an order ID.
Users Table
UserID | Name | Username |
---|---|---|
1 | Alice Smith | asmith |
2 | Bob Johnson | bjohnson |