Database refactoring
A database refactoring is a simple change to a database schema that improves its design while retaining both its behavioral and informational semantics. Database refactoring does not change the way data is interpreted or used and does not fix bugs or add new functionality. Every database refactoring leaves the system in a working state, so it does not cause maintenance lags, provided that meaningful data exists in the production environment.
A database refactoring is conceptually more difficult than a code refactoring: a code refactoring only needs to maintain behavioral semantics, while a database refactoring must also maintain informational semantics.
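To make the two kinds of semantics concrete, the following is a minimal sketch of one possible "rename column" refactoring carried out with a transition period, using Python's built-in sqlite3 module. The table `customer`, the columns `fname` and `first_name`, and the function names are hypothetical and chosen only for illustration. The old column is kept so that existing queries behave as before (behavioral semantics), and the stored values are copied unchanged into the new column (informational semantics).

```python
import sqlite3


def rename_column_with_transition(conn):
    """Illustrative refactoring: expose customer.fname as customer.first_name.

    All table, column, and trigger names here are assumptions for the example.
    """
    # Step 1: add the new column next to the old one instead of replacing it.
    conn.execute("ALTER TABLE customer ADD COLUMN first_name TEXT")
    # Step 2: copy the existing values so no information is lost.
    conn.execute("UPDATE customer SET first_name = fname")
    # Step 3: keep the new column in sync while old code still writes fname.
    conn.execute("""
        CREATE TRIGGER customer_sync_insert AFTER INSERT ON customer
        WHEN NEW.first_name IS NULL
        BEGIN
            UPDATE customer SET first_name = NEW.fname WHERE id = NEW.id;
        END
    """)
    conn.execute("""
        CREATE TRIGGER customer_sync_update AFTER UPDATE OF fname ON customer
        BEGIN
            UPDATE customer SET first_name = NEW.fname WHERE id = NEW.id;
        END
    """)
    conn.commit()


def demo():
    """Apply the refactoring to a small in-memory database and read both columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, fname TEXT)")
    conn.executemany("INSERT INTO customer (fname) VALUES (?)",
                     [("Ada",), ("Grace",)])
    rename_column_with_transition(conn)
    # An application that has not been updated yet still writes the old column.
    conn.execute("INSERT INTO customer (fname) VALUES ('Edsger')")
    old = conn.execute("SELECT fname FROM customer ORDER BY id").fetchall()
    new = conn.execute("SELECT first_name FROM customer ORDER BY id").fetchall()
    return old, new
```

In this sketch, calling `demo()` returns the contents of the old and new columns, which should match row for row. Dropping the old column and the triggers would be a separate, later step once every client reads and writes `first_name`.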
A database schema is typically refactored for one of several reasons:
- To develop the schema in an evolutionary manner in parallel with the evolutionary design of the rest of the system.
- To fix design problems with an existing legacy database schema. Database refactorings are often motivated by the desire for database normalization of an existing production database, typically to "clean up" the design of the database.
- To implement what would be a large change as a series of small, low-risk changes.
Categories of database refactoring
In 2006, Scott Ambler and Pramod Sadalage described the following categories of database refactoring:
- Architecture Refactoring
- Structural Refactoring
- Data Quality Refactoring
- Referential Integrity Refactoring
- Transformation
- Method Refactoring
In 2019 Vladislav Struzik supplemented the categories of database refactoring with a new one:
- Access Refactoring
Process of database refactoring
The process of database refactoring is the act of applying database refactorings to evolve an existing database schema. There are three considerations that need to be taken into account:
- How a single refactoring is implemented
- How database refactorings are tracked and shared within organizations
- How a series of database refactorings is applied
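The three considerations above can be sketched together in a few lines of Python with the built-in sqlite3 module. This is one common approach under stated assumptions, not a prescribed method: each refactoring is a small numbered script, a log table inside the database records which ones have been applied, and pending ones are run in order. The names `REFACTORINGS`, `schema_change_log`, `applied_versions`, and `apply_pending` are hypothetical.

```python
import sqlite3

# Hypothetical ordered list of refactorings: (version, description, SQL).
# In a team setting this list would typically live in version control.
REFACTORINGS = [
    (1, "create customer table",
     "CREATE TABLE customer (id INTEGER PRIMARY KEY, fname TEXT)"),
    (2, "introduce first_name column",
     "ALTER TABLE customer ADD COLUMN first_name TEXT"),
    (3, "copy fname into first_name",
     "UPDATE customer SET first_name = fname"),
]


def applied_versions(conn):
    """Return the set of refactoring versions already recorded in the log table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_change_log ("
        "version INTEGER PRIMARY KEY, description TEXT NOT NULL)"
    )
    return {row[0] for row in conn.execute("SELECT version FROM schema_change_log")}


def apply_pending(conn, refactorings=REFACTORINGS):
    """Apply every refactoring not yet logged, in version order.

    Returns the list of versions applied by this call.
    """
    done = applied_versions(conn)
    newly_applied = []
    for version, description, sql in sorted(refactorings):
        if version in done:
            continue  # already applied to this database
        conn.execute(sql)
        conn.execute(
            "INSERT INTO schema_change_log (version, description) VALUES (?, ?)",
            (version, description),
        )
        # Commit after each step so earlier refactorings stay recorded
        # even if a later one fails.
        conn.commit()
        newly_applied.append(version)
    return newly_applied
```

Running `apply_pending` a second time against the same database applies nothing, because every version is already in the log. That property is what lets the same list of refactorings be shared across development, test, and production databases that are at different stages.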