Conventional Commits Specification
The Conventional Commits Specification (CCS) is a convention that formalizes the categorization of commits in version control systems. It distinguishes code changes by purpose, such as features, bug fixes, or documentation updates, to support automated processes like changelog generation and semantic versioning.
Background
Modern distributed software development relies heavily on commit messages to track changes. While early classification frameworks, such as the one proposed by Swanson in 1976, categorized maintenance into three types, modern development has shifted toward finer-grained taxonomies. The Conventional Commits Specification is a widely adopted standard that requires the first line of a commit message to follow a specific format:
<type>[optional scope]: <description>
The mandatory type field categorizes the commit into one of ten distinct classes, making the history machine-readable. Additionally, the footer section is frequently used to mark breaking changes explicitly with the BREAKING CHANGE token, which developers and automated tools rely on to identify backwards-incompatible updates.
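The header format and the breaking-change footer described above can be sketched as a small parser. This is a minimal illustration, not a reference implementation: the names HEADER_RE, ParsedCommit, and parse_commit are hypothetical, the regular expression is deliberately simplified, and the "!" marker before the colon is an alternative breaking-change notation from the published specification rather than something discussed in this article.

```python
import re
from typing import NamedTuple, Optional

# Simplified header pattern: type, optional "(scope)", optional "!",
# then ": " and a non-empty description.
HEADER_RE = re.compile(
    r"^(?P<type>[a-z]+)(?:\((?P<scope>[^()\s]+)\))?(?P<bang>!)?: (?P<description>.+)$"
)


class ParsedCommit(NamedTuple):
    type: str
    scope: Optional[str]
    description: str
    breaking: bool


def parse_commit(message: str) -> Optional[ParsedCommit]:
    """Parse the first line of a commit message; return None if it does not match."""
    lines = message.strip().splitlines()
    if not lines:
        return None
    match = HEADER_RE.match(lines[0])
    if match is None:
        return None
    # A later line starting with "BREAKING CHANGE:" marks a breaking change.
    has_footer = any(line.startswith("BREAKING CHANGE:") for line in lines[1:])
    return ParsedCommit(
        type=match["type"],
        scope=match["scope"],
        description=match["description"],
        breaking=bool(match["bang"]) or has_footer,
    )
```

Under these assumptions, a message such as "feat(parser): add array support" yields the type "feat" and the scope "parser", while a message whose first line lacks a type prefix yields None.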
Classification types
Research into CCS usage has identified ten primary categories used to classify commits. To address ambiguity found in earlier definitions, the following definitions have been proposed to minimize overlap:
- Feature (feat): Changes that introduce new functionality to the codebase. This includes both user-oriented and developer-oriented features.
- Fix (fix): Changes that resolve bugs or faults.
- Performance (perf): Modifications aimed specifically at improving performance without changing behavior.
- Style (style): Changes that improve code readability without altering meaning.
- Refactor (refactor): Restructuring code to improve maintainability without changing external behavior. This category explicitly excludes changes that strictly fall under "style" or "perf".
- Documentation (docs): Modifications to documentation or text files.
- Test (test): Adding or updating test files.
- Continuous Integration (ci): Changes to CI configuration files and scripts.
- Build (build): Modifications affecting the build system or external dependencies.
- Chore (chore): Miscellaneous tasks that do not fit into the other categories.
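As a rough sketch of how these categories feed semantic versioning, the ten types can be modeled as an enumeration and mapped to version bumps. The short tokens (feat, fix, perf, and so on) follow the Angular-style names commonly used with CCS. The names CommitType, version_bump, and release_bump are hypothetical, and the rule that types other than feat and fix trigger no bump is an assumption of this sketch; real release tools differ on that point.

```python
from enum import Enum


class CommitType(Enum):
    FEAT = "feat"
    FIX = "fix"
    PERF = "perf"
    STYLE = "style"
    REFACTOR = "refactor"
    DOCS = "docs"
    TEST = "test"
    CI = "ci"
    BUILD = "build"
    CHORE = "chore"


# Bump levels ordered from weakest to strongest.
BUMP_ORDER = ["none", "patch", "minor", "major"]


def version_bump(commit_type: CommitType, breaking: bool = False) -> str:
    """Return the bump implied by a single commit."""
    if breaking:
        return "major"  # a breaking change of any type forces a major bump
    if commit_type is CommitType.FEAT:
        return "minor"
    if commit_type is CommitType.FIX:
        return "patch"
    return "none"  # assumption: other types do not trigger a release


def release_bump(commits) -> str:
    """Return the strongest bump over an iterable of (type, breaking) pairs."""
    return max(
        (version_bump(commit_type, breaking) for commit_type, breaking in commits),
        key=BUMP_ORDER.index,
        default="none",
    )
```

For example, a release containing only a docs commit and a fix commit would resolve to a patch bump, while any commit flagged as breaking raises the result to a major bump.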
Adoption and usage
Adoption of CCS has increased steadily in the open-source community, though rates vary by ecosystem and methodology. A 2025 study analyzing over 3,000 top GitHub projects found that 116 projects had explicitly declared their adoption of CCS in documentation. Projects typically adopt CCS in one of two modes:
- Document Declaration: Explicitly stating the convention in contributing guidelines.
- Integrated Automation: Using commit-linting tools or GitHub Actions to enforce the format.
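The integrated-automation mode can be illustrated with a Git commit-msg hook, which Git invokes with the path of the file holding the proposed message and which aborts the commit when it exits with a non-zero status. The following is a minimal sketch and does not reproduce any particular linting tool: ALLOWED_TYPES, check_header, and run_hook are hypothetical names, and the header pattern is simplified.

```python
import re
import sys

# The ten type tokens; a project could substitute its own list.
ALLOWED_TYPES = {
    "feat", "fix", "perf", "style", "refactor",
    "docs", "test", "ci", "build", "chore",
}

# Simplified header: type, optional "(scope)", optional "!", ": ", description.
HEADER_RE = re.compile(r"^(?P<type>\w+)(?:\([^()]+\))?!?: \S.*$")


def check_header(header: str) -> list[str]:
    """Return the problems found in the first line of a commit message."""
    match = HEADER_RE.match(header)
    if match is None:
        return ["header does not match '<type>[optional scope]: <description>'"]
    if match["type"] not in ALLOWED_TYPES:
        return [f"unknown type '{match['type']}'"]
    return []


def run_hook(message_path: str) -> int:
    """Check the message file and return an exit status (0 means accept)."""
    with open(message_path, encoding="utf-8") as handle:
        first_line = handle.readline().rstrip("\n")
    problems = check_header(first_line)
    for problem in problems:
        print(f"commit rejected: {problem}", file=sys.stderr)
    return 1 if problems else 0


# In an actual .git/hooks/commit-msg script, the last line would be:
#     sys.exit(run_hook(sys.argv[1]))
```

Keeping the check in a pure function such as check_header means the same rule could be reused in a CI job that inspects every commit in a pull request.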
Challenges
Developers face several challenges when manually classifying commits according to CCS. A qualitative analysis of developer discussions on GitHub and Stack Overflow identified four main issues:
- Type Confusion: The most prevalent challenge, where developers are unsure which type applies to a change. Confusion is common between specific pairs of types and among types whose definitions overlap.
- Type Aliases: Requests to use synonyms, such as "patch" instead of "fix".
- Changing Types: Requests to add new types or remove existing ones.
- Lack of Definitions: Calls for a comprehensive, standardized list of definitions, as the official specification often defers to Angular's guidelines, which some developers find ambiguous.
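One way a project might accommodate alias requests without widening its set of types is to normalize aliases to a canonical type before validation. The sketch below is illustrative only: TYPE_ALIASES and normalize_header are hypothetical names, the "patch" to "fix" mapping is the single alias taken from the text above, and lowercasing the type is a design choice of this sketch.

```python
import re

# Hypothetical alias table; only "patch" -> "fix" comes from the text above.
TYPE_ALIASES = {"patch": "fix"}

# Split the header into its leading type and everything after it.
LEADING_TYPE_RE = re.compile(
    r"^(?P<type>[A-Za-z]+)(?P<rest>(?:\([^()]+\))?!?: .+)$"
)


def normalize_header(header: str) -> str:
    """Rewrite an aliased or mixed-case type to its canonical lowercase form."""
    match = LEADING_TYPE_RE.match(header)
    if match is None:
        return header  # not a conventional header; leave it untouched
    raw_type = match["type"].lower()
    return TYPE_ALIASES.get(raw_type, raw_type) + match["rest"]
```

With this table, "patch(ui): correct typo" becomes "fix(ui): correct typo", and a header that does not follow the format is returned unchanged.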
Automated classification
Recent approaches use large language models (LLMs) to classify commits automatically. A fine-tuned CodeLlama model achieved a macro F1 score of roughly 76%, outperforming both BERT and GPT-4 in classifying commits into the ten CCS types. Categories with broad or residual definitions remain the hardest to classify automatically.
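Macro F1, the metric quoted above, is the unweighted mean of the per-class F1 scores, so rare categories count as much as frequent ones. The following sketch computes it from scratch. The function name macro_f1 and the example labels are hypothetical and are not data from the study, and averaging over only the labels that actually occur is an assumption of this sketch.

```python
from collections import Counter


def macro_f1(true_labels, predicted_labels):
    """Unweighted mean of per-class F1 over all labels that occur."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    labels = sorted(set(true_labels) | set(predicted_labels))
    if not labels:
        raise ValueError("at least one label is required")

    true_pos, false_pos, false_neg = Counter(), Counter(), Counter()
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == guess:
            true_pos[truth] += 1
        else:
            false_pos[guess] += 1  # predicted class gains a false positive
            false_neg[truth] += 1  # actual class gains a false negative

    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN)
        denominator = 2 * true_pos[label] + false_pos[label] + false_neg[label]
        scores.append(2 * true_pos[label] / denominator if denominator else 0.0)
    return sum(scores) / len(scores)


# Toy example with made-up labels.
truth = ["feat", "feat", "fix", "fix", "chore", "chore"]
guess = ["feat", "feat", "fix", "chore", "chore", "refactor"]
score = macro_f1(truth, guess)  # per-class F1: feat 1, fix 2/3, chore 1/2, refactor 0
```

In the toy example the per-class scores average to 13/24, or about 0.54, even though four of the six predictions are correct, which shows how weak classes pull the macro average down.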