YAML
YAML is a human-readable data serialization language. It is commonly used for configuration files and in applications where data is stored or transmitted. YAML targets many of the same communications applications as Extensible Markup Language (XML) but has a minimal syntax that intentionally differs from Standard Generalized Markup Language (SGML). It uses Python-style indentation to indicate nesting and does not require quotes around most string values.
Custom data types are allowed, but YAML natively encodes scalars, lists, and associative arrays. These data types are based on the Perl programming language, though all commonly used high-level programming languages share very similar concepts. The colon-centered syntax, used for expressing key-value pairs, is inspired by electronic mail headers, and the document separator (---) is borrowed from MIME. Escape sequences are reused from C, and whitespace wrapping for multi-line strings is inspired by HTML. Lists and hashes can contain nested lists and hashes, forming a tree structure; arbitrary graphs can be represented using YAML aliases. YAML is intended to be read and written in streams, a feature inspired by SAX.
Support for reading and writing YAML is available for many programming languages. Some source-code editors such as Vim, Emacs, and various integrated development environments have features that make editing YAML easier, such as folding up nested structures or automatically highlighting syntax errors.
The official recommended filename extension for YAML files has been .yaml since 2006. The MIME type application/yaml was finalized in 2024.
History and name
YAML was first proposed in 2001 by Clark Evans, who designed it together with Ingy döt Net and Oren Ben-Kiki. YAML originally stood for Yet Another Markup Language, because it was released in an era that saw a proliferation of markup languages for presentation and connectivity. The name was a tongue-in-cheek reference to that technology landscape, describing a markup language with the "yet another" construct. Between December 2001 and April 2002 it was repurposed as YAML Ain't Markup Language, a recursive acronym, to emphasize that YAML is data-oriented rather than document markup.
Versions
Design
Syntax
A cheat sheet and full specification are available at the official site. The following is a synopsis of the basic elements.
YAML accepts the entire Unicode character set, except for some control characters, and may be encoded in any one of UTF-8, UTF-16 or UTF-32.
- Whitespace indentation is used for denoting structure; however, tab characters are not allowed as part of that indentation.
- Comments begin with the number sign (#), can start anywhere on a line, and continue until the end of the line. Comments must be separated from other tokens by whitespace characters. A # character that appears inside a string is a literal number sign.
- List members are denoted by a leading hyphen (-) with one member per line.
- * A list can also be specified by enclosing text in square brackets ([...]) with each entry separated by a comma.
- An associative array entry is represented using a colon and space in the form key: value with one entry per line. YAML requires the colon to be followed by a space so that URL-style strings, which contain a colon with no following space, can be represented without needing to be enclosed in quotes.
- * A question mark followed by a space can be placed in front of a key, in the form "? key" with ": value" on the following line, so that the key can be a complex node such as a sequence or a mapping.
- * An associative array can also be specified by text enclosed in curly braces ({...}), with keys separated from values by a colon and entries separated by commas.
- Strings are ordinarily unquoted, but may be enclosed in double quotes (") or single quotes (').
- * Within double quotes, special characters may be represented with C-style escape sequences starting with a backslash (\). According to the documentation, the only octal escape supported is \0.
- * Within single quotes, the only supported escape sequence is a doubled single quote ('') denoting the single quote itself, as in 'don''t'.
- Block scalars are delimited by indentation, with optional modifiers to preserve (|) or fold (>) newlines.
- Multiple documents within a single stream are separated by three hyphens (---).
- * Three periods (...) optionally end a document within a stream.
- Repeated nodes are initially denoted by an ampersand (&) and thereafter referenced with an asterisk (*).
- Nodes may be labeled with a type or tag using a double exclamation mark (!!) followed by a string, which can be expanded into a URI.
- YAML documents in a stream may be preceded by "directives" composed of a percent sign (%) followed by a name and space-delimited parameters. Two directives are defined in YAML 1.1:
- * The %YAML directive is used for identifying the version of YAML in a given document.
- * The %TAG directive is used as a shortcut for URI prefixes. These shortcuts may then be used in node type tags.
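The elements listed above can be combined in one short document. The following is a minimal sketch, not an excerpt from the specification: the key names, the "!e!" handle, and the "tag:example.com,2024:" prefix are made up for illustration, and a loader that does not recognize the resulting tag may reject the "point" node.

```yaml
%YAML 1.2
%TAG !e! tag:example.com,2024:
---
# A comment runs from the number sign to the end of the line.
block_list:
  - first
  - second
flow_list: [first, second]
block_map:
  key: value
flow_map: {key: value, other: thing}
url: http://example.com/path    # unquoted: the colon is not followed by a space
? [complex, key]                # explicit key introduced by "? "
: value for the complex key
double: "tab:\t null:\0"        # C-style escapes inside double quotes
single: 'it''s'                 # doubled single quote inside single quotes
literal: '# not a comment'      # number sign inside a string
point: !e!point {x: 1, y: 2}    # hypothetical tag shorthand expanded through %TAG
...
```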
Basic components
--- # Favorite movies
- Casablanca
- North by Northwest
- The Man Who Wasn't There
The optional inline format is delimited by a comma and space and enclosed in square brackets.
--- # Shopping list
[milk, pumpkin pie, eggs, juice]
Keys are separated from values by a colon+space. Indented blocks, common in YAML data files, use indentation and new lines to separate the key/value pairs. Inline blocks, common in YAML data streams, use comma+space to separate the key/value pairs between braces.
--- # Indented Block
name: John Smith
age: 33
--- # Inline Block
{name: John Smith, age: 33}
Strings do not require quotation marks. There are two ways to write multi-line strings, one preserving newlines (using the | character) and one that folds the newlines (using the > character), both followed by a newline character.
data: |
  There once was a tall man from Ealing
  Who got on a bus to Darjeeling
  It said on the door
  "Please don't sit on the floor"
  So he carefully sat on the ceiling
By default, the leading indentation and trailing whitespace are stripped, though other behavior can be explicitly specified.
data: >
  Wrapped text
  will be folded
  into a single
  paragraph

  Blank lines denote
  paragraph breaks
Folded text converts newlines to spaces and removes leading whitespace.
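The handling of trailing newlines mentioned above can be chosen explicitly with a chomping indicator placed directly after the block scalar indicator. The following sketch assumes the indicators defined in the YAML specification ("-" to strip, "+" to keep); the key names are illustrative only.

```yaml
clip: |         # default: a single trailing newline is kept
  text

strip: |-       # the trailing newline is removed
  text

keep: |+        # all trailing newlines are kept, including the blank line below
  text

folded: >-      # folded into "two lines become one" without a trailing newline
  two lines
  become one
```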
--- # The Smiths
- {name: John Smith, age: 33}
- name: Mary Smith
  age: 27
- [name, age]: [Rae Smith, 4]   # sequences as keys are supported
--- # People, by gender
men: [John Smith, Bill Jones]
women:
  - Mary Smith
  - Susan Williams
Objects and lists are important components in YAML and can be mixed. The first example is a list of key-value objects, all people from the Smith family. The second lists them by gender; it is a key-value object containing two lists.
Advanced components
Features that distinguish YAML from the capabilities of other data-serialization languages are structures, data typing, and composite keys.
YAML structures enable storage of multiple documents within a single file, usage of references for repeated nodes, and usage of arbitrary nodes as keys.
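As a minimal sketch of several documents stored in one file, the stream below holds two documents. The keys and values are invented for illustration; a loader typically returns each document separately.

```yaml
# "---" starts each document; "..." optionally ends one.
---
service: web
replicas: 2
...
---
service: worker
replicas: 1
```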
For clarity, compactness, and avoiding data entry errors, YAML provides node anchors and references. References to the anchor work for all data types.
Below is an example of a queue in an instrument sequencer in which two steps are referenced without being fully described.
--- # Sequencer protocols for Laser eye surgery
- step: &id001 # defines anchor label &id001
    instrument: Lasik 2000
    pulseEnergy: 5.4
    pulseDuration: 12
    repetition: 1000
    spotSize: 1mm
- step: &id002
    instrument: Lasik 2000
    pulseEnergy: 5.0
    pulseDuration: 10
    repetition: 500
    spotSize: 2mm
- Instrument1: *id001 # refers to the first step
- Instrument2: *id002 # refers to the second step
Explicit data typing is seldom seen in YAML documents, since YAML autodetects simple types. Data types can be divided into three categories: core, defined, and user-defined. Core types are those expected to exist in any parser. Many more advanced data types, such as binary data, are defined in the YAML specification but not supported in all implementations. Finally, YAML defines a way to extend the data type definitions locally to accommodate user-defined classes, structures or primitives.
YAML autodetects the datatype of the entity, but sometimes one wants to cast the datatype explicitly. The most common situation is where a single-word string that looks like a number, Boolean or tag requires disambiguation by surrounding it with quotes or using an explicit datatype tag.
---
a: 123 # an integer
b: "123" # a string, disambiguated by quotes
c: 123.0 # a float
d: !!float 123 # also a float via explicit data type prefixed by (!!)
e: !!str 123 # a string, disambiguated by explicit type
f: !!str Yes # a string via explicit type
g: Yes # a Boolean True (YAML 1.1), string "Yes" (YAML 1.2)
h: Yes we have No bananas # a string, "Yes" and "No" disambiguated by context.
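Quoting matters most for values that merely look like another type. The sketch below uses made-up keys; the comments describe how an unquoted value would commonly be resolved, which can differ between YAML 1.1 and YAML 1.2 loaders.

```yaml
---
country: "NO"      # unquoted, a YAML 1.1 loader would read NO as Boolean false
version: "1.10"    # unquoted, this would resolve to the float 1.1
zip: "01234"       # unquoted, this would resolve to an integer, not a five-character string
enabled: true      # an actual Boolean, left unquoted on purpose
```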
Not every implementation of YAML has every specification-defined data type. These built-in types use a double-exclamation sigil prefix. Particularly interesting ones not shown here are sets, ordered maps, timestamps, and hexadecimal. Here is an example of base64-encoded binary data.
---
picture: !!binary |
  R0lGODdhDQAIAIAAAAAAANn
  Z2SwAAAAADQAIAAACF4SDGQ
  ar3xxbJ9p0qa7R0YxwzaFME
  1IAADs=
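The sets, ordered maps, timestamps, and hexadecimal integers mentioned above might be written as follows. This is a sketch with made-up keys and values; as noted, not every implementation accepts all of these types.

```yaml
---
primes: !!set            # a set is written as a mapping whose keys have no values
  ? 2
  ? 3
  ? 5
podium: !!omap           # an ordered map is a sequence of single-pair mappings
  - gold: Ann
  - silver: Bo
created: 2001-12-14T21:59:43.10-05:00   # timestamp
mask: 0xFF                               # hexadecimal integer (255)
```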
Many implementations of YAML can support user-defined data types for object serialization. Local data types are not universal data types but are defined in the application using the YAML parser library. Local data types use a single exclamation mark.
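A local data type might be applied as in the sketch below. The tag name "!myClass" and the field names are hypothetical; the application using the parser library would have to register a handler for that tag.

```yaml
---
myObject: !myClass {name: Joe, age: 15}   # single "!" marks an application-defined type
```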
YAML supports composite keys, which consist of multiple values. Such keys are useful for coordinate transformations, multi-field identifiers, test cases with compound conditions, and the like.
--- # Transform between two systems of coordinates
transform:
  [0.0, 0.0]: [10.0, 20.0]   # illustrative values
  [1.0, 0.0]: [15.0, 20.0]   # illustrative values
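Composite keys can also be written with the explicit question-mark notation, which helps when a key does not fit on one line. The following is a sketch with invented keys and values, not an example taken from the specification.

```yaml
---
? [0, 0]              # a flow sequence as the key
: origin
? - 1                 # the same kind of key written in block style
  - 0
: unit point on the x axis
? {row: 2, col: 3}    # a mapping as the key
: highlighted cell
```

Loaders for languages whose native dictionaries require hashable keys may reject such documents or need custom handling.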