C preprocessor
The C preprocessor is a text file processor that is used with C, C++ and other programming tools. The preprocessor provides for file inclusion, macro expansion, conditional compilation, and line control. Although named in association with C and used with C, the preprocessor capabilities are not inherently tied to the C language. It can be and is used to process other kinds of files.
C, C++, and Objective-C compilers provide a preprocessor capability, as it is required by the definition of each language. Some compilers provide extensions and deviations from the target language standard. Some provide options to control standards compliance. For instance, the GNU C preprocessor can be made more standards compliant by supplying certain command-line flags.
The C# programming language also allows for directives, though they are not read by a preprocessor and they cannot be used for creating macros, and are generally more intended for features such as conditional compilation. C# seldom requires the use of the directives, for example code inclusion does not require a preprocessor at all. Similarly, F# and Visual J# are able to call these C# preprocessor directives.
The Haskell programming language also allows the usage of the C preprocessor.
Features of the preprocessor are encoded in source code as directives that start with
#.Although C++ source files are often named with a
.cpp extension, that is an abbreviation for "C plus plus"; not C preprocessor.Preprocessor directives
The following languages have the following accepted directives.C/C++
The following tokens are recognised by the preprocessor in the context of preprocessor directives.-
#if -
#elif -
#else -
#endif -
#ifdef -
#ifndef -
#elifdef -
#elifndef -
#define -
#undef -
#include -
#embed -
#line -
#error -
#warning -
#pragma -
defined -
__has_include -
__has_cpp_attribute -
__has_c_attribute -
__has_embed
import, export, and module were partially handled by the preprocessor as well.The Haskell programming language also accepts C preprocessor directives, which is invoked by writing
C#
Although C#, F# and Visual J# do not have a separate preprocessor, these directives are processed as if there were one.-
#nullable -
#if -
#elif -
#else -
#endif -
#define -
#undef -
#region -
#endregion -
#error -
#warning -
#line -
#pragma
Objective-C
The following tokens are recognised by the preprocessor in the context of preprocessor directives.-
#if -
#elif -
#else -
#endif -
#ifdef -
#ifndef -
#define -
#undef -
#include -
#import -
#error -
#pragma -
definedHistory
#include and parameterless string replacement macros via #define. It was extended shortly after, firstly by Mike Lesk and then by John Reiser, to add arguments to macros and to support conditional compilation.The C preprocessor was part of a long macro-language tradition at Bell Labs, which was started by Douglas Eastwood and Douglas McIlroy in 1959.
Phases
Preprocessing is defined by the first four phases of translation specified in the C Standard.- Trigraph replacement: The preprocessor replaces trigraph sequences with the characters they represent. This phase was removed in C23 following the steps of C++17.
- Line splicing: Physical source lines that are continued with escaped newline sequences are spliced to form logical lines.
- Tokenization: The preprocessor breaks the result into preprocessing tokens and whitespace. It replaces comments with whitespace.
- Macro expansion and directive handling: Preprocessing directive lines, including file inclusion and conditional compilation, are executed. The preprocessor simultaneously expands macros and, since the 1999 version of the C standard, handles
_Pragmaoperators.Features
File inclusion
There are two directives in the C preprocessor for including contents of files:-
#include, used for directly including the contents of a file in-place -
#embed, used for directly including or embedding the contents of a binary resource in-placeCode inclusion
#include with the content of the file specified after the directive. The inclusion may be logical in the sense that the resulting content may not be stored on disk and certainly is not overwritten to the source file. The file being included need not contain any sort of code, as this directive will copy the contents of whatever file is included in-place, but the most typical use of #include is to include a header file.In the following example code, the preprocessor replaces the line
#include <stdio.h> with the content of the standard library header file named '' in which the function printf and other symbols are declared.- include
In this case, the file name is enclosed in angle brackets to denote that it is a system file. For a file in the codebase being built, double-quotes are used instead. The preprocessor may use a different search algorithm to find the file based on this distinction.
For C, a header file is usually named with a
.h extension. In C++, the convention for file extension varies with common extensions .h and .hpp. But the preprocessor includes a file regardless of the extension. In fact, sometimes code includes .c or .cpp files.To prevent including the same file multiple times, which often leads to a compiler error, a header file typically contains an guard or if supported by the preprocessor pragma once| to prevent multiple inclusion.
Binary resource inclusion
and C++26 introduce the#embed directive for binary resource inclusion, which allows including the content of a binary file into a source even if it is not valid C code.This allows binary resources to be included into a program without requiring processing by external tools like
xxd -i and without the use of string literals, which have a length limit on MSVC. Similarly to xxd -i, the directive is replaced by a comma separated list of integers corresponding to the data of the specified resource. More precisely, if an array of type is initialized using an #embed directive, the result is the same as-if the resource was written to the array using fread. Apart from the convenience, #embed is also easier for compilers to handle, since they are allowed to skip expanding the directive to its full form due to the as-if rule.The file to embed is specified the same as for
#include either with brackets or double quotes. The directive also allows certain parameters to be passed to it to customize its behavior. The C standard defines some parameters and implementations may define additional. The limit parameter is used to limit the width of the included data. It is mostly intended to be used with "infinite" files like urandom. The prefix and suffix parameters allow for specifying a prefix and suffix to the embedded data. Finally, the if_empty parameter replaces the entire directive if the resource is empty. All standard parameters can be surrounded by double underscores, just like standard attributes on C23, for example __prefix__ is interchangeable with prefix. Implementation-defined parameters use a form similar to attribute syntax but without the square brackets. While all standard parameters require an argument to be passed to them, this is generally optional and even the set of parentheses can be omitted if an argument is not required, which might be the case for some implementation-defined parameters.const unsigned char iconDisplayData = ;
// specify any type which can be initialized form integer constant expressions will do
const char resetBlob = ;
// attributes work just as well
alignas const signed char alignedDataString = ;
int main
Conditional compilation
is supported via the if-else core directives#if, #else, #elif, and #endif and with contraction directives #ifdef and #ifndef, which stand for and, respectively. In the following example code, the printf call is only included for compilation if VERBOSE is defined.- ifdef VERBOSE
- endif
The following demonstrates more complex logic:
- if ! || defined _WIN32 && !defined _WIN64
- else
- endif