Codebase

A codebase is a collection of source code that is maintained as a unit. Typically, it can be used to build one or more software components including applications and libraries.
A codebase is typically stored in a source control repository of a version control system. A repository can contain build-generated files, but such files are typically excluded from the repository, and therefore from the codebase. A repository may also contain data files that are required for building or running the resulting software. Version control is not, however, a required aspect of a codebase. Even the Linux kernel was maintained without version control for many years.
When developing multiple components, a choice is made either to maintain a separate, distinct codebase for each, or to combine codebases, possibly in a single, monolithic codebase. With a monolithic codebase, changes that span multiple components can often be easier and more robust. But this requires a larger repository, and makes it easier to introduce wide-ranging technical debt. With separate codebases, each repository is smaller and more manageable. The structure enforces logical separation between components, but can require more build and runtime integration between codebases, and complicates changes that span multiple components.

Examples

Some notably large codebases include:

Google: monolithic, 1 billion files, 9 million source code files, 2 billion lines of source code, 35 million commits in total, 86 TB total size
Facebook: monolithic, 8 GB, hundreds of thousands of files
Linux kernel: distributed, over 15 million lines of code
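
Figures such as file counts and lines of source code can be approximated for any codebase stored on disk. The following is a minimal sketch, not a standard tool: the function name measure_codebase, the set of source file extensions, and the list of excluded directories are all assumptions made for illustration. It skips directories that commonly hold build-generated or tool-managed files, in line with the convention that such files are not part of the codebase.

```python
import os

# Hypothetical choices for this sketch: which extensions count as source
# code, and which directories hold build-generated or tool-managed files.
SOURCE_EXTENSIONS = {".c", ".h", ".py", ".java", ".go"}
EXCLUDED_DIRS = {".git", "build", "dist", "node_modules"}


def measure_codebase(root):
    """Return (source_file_count, line_count) for the tree under root."""
    file_count = 0
    line_count = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk skips them.
        dirnames[:] = [d for d in dirnames if d not in EXCLUDED_DIRS]
        for name in filenames:
            if os.path.splitext(name)[1] not in SOURCE_EXTENSIONS:
                continue
            file_count += 1
            # Read as bytes so files with unusual encodings still count.
            with open(os.path.join(dirpath, name), "rb") as f:
                line_count += sum(1 for _ in f)
    return file_count, line_count
```

Calling measure_codebase(".") from the root of a checkout returns a pair of counts. Results depend heavily on the chosen extensions and exclusions, so numbers produced this way are only roughly comparable to published figures.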