Tab-separated values


Tab-separated values (TSV) is a plain text data format for storing tabular data in which the values of a record are separated by a tab character and each record occupies one line. The TSV format is a form of delimiter-separated values and is similar to the commonly used comma-separated values (CSV) format.
TSV is a relatively simple format and is widely supported for data exchange by software that generally deals with tabular data. For example, a TSV file might be used to transfer information from a database to a spreadsheet.

Example

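The following are records of the Iris flower data set in TSV format. Since a tab is not a printable character, an arrow (→) is used here to denote a tab character.
Sepal length→Sepal width→Petal length→Petal width→Species
5.1→3.5→1.4→0.2→I. setosa
4.9→3.0→1.4→0.2→I. setosa
4.7→3.2→1.3→0.2→I. setosa
4.6→3.1→1.5→0.2→I. setosa
5.0→3.6→1.4→0.2→I. setosa
The following is the same data rendered as a table.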
Sepal length  Sepal width  Petal length  Petal width  Species
5.1           3.5          1.4           0.2          I. setosa
4.9           3.0          1.4           0.2          I. setosa
4.7           3.2          1.3           0.2          I. setosa
4.6           3.1          1.5           0.2          I. setosa
5.0           3.6          1.4           0.2          I. setosa
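
As a minimal sketch of how such records might be read, the following Python splits each line on the tab character. The function name `parse_tsv` and the shortened sample string are illustrative assumptions. The sketch also assumes LF line terminators and that no field contains a tab or line terminator.

```python
def parse_tsv(text):
    """Split TSV text into a header list and a list of record lists."""
    # Drop trailing line feeds so they do not yield an empty record.
    lines = text.rstrip("\n").split("\n")
    rows = [line.split("\t") for line in lines]
    return rows[0], rows[1:]


# Shortened sample: the header and the first two records of the example.
sample = (
    "Sepal length\tSepal width\tPetal length\tPetal width\tSpecies\n"
    "5.1\t3.5\t1.4\t0.2\tI. setosa\n"
    "4.9\t3.0\t1.4\t0.2\tI. setosa\n"
)

header, records = parse_tsv(sample)
print(header)
print(records[0])
```

Every value is returned as a string. Converting the measurements to numbers is left to the caller.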

If a TSV file is viewed in a text editor that supports dynamic tab stops, the layout resembles the table rendering, only without cell borders and header row formatting.

Delimiter collision

If a field contains a tab character, the record can no longer be parsed unambiguously, because tabs no longer appear only between fields. This is a form of delimiter collision. To prevent this situation, the IANA media type standard for TSV simply disallows a tab within a field. Similarly, a value cannot contain a line terminator. To represent a value with an embedded tab or line terminator character, a commonly used mechanism is to replace the character with the corresponding escape sequence, as shown in the following table.
sequence  represents
\t        tab
\n        line feed
\r        carriage return
\\        backslash
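
The escape mechanism can be sketched in Python as follows. The function names `escape_field` and `unescape_field` are hypothetical, and the sketch assumes that only the four sequences listed in the table are recognized. Any other backslash sequence is left unchanged.

```python
import re

# Mapping between special characters and their two-character escape sequences.
_ESCAPES = {"\\": "\\\\", "\t": "\\t", "\n": "\\n", "\r": "\\r"}
_UNESCAPES = {"\\\\": "\\", "\\t": "\t", "\\n": "\n", "\\r": "\r"}


def escape_field(value):
    """Replace backslash, tab, line feed and carriage return with escapes."""
    return re.sub(r"[\\\t\n\r]", lambda m: _ESCAPES[m.group(0)], value)


def unescape_field(value):
    """Reverse escape_field.

    The input is scanned left to right in a single pass so that an escaped
    backslash is not misread as the start of another sequence.
    """
    return re.sub(r"\\[\\tnr]", lambda m: _UNESCAPES[m.group(0)], value)


original = "a\tb\\n\nc"
encoded = escape_field(original)
print(encoded)
print(unescape_field(encoded) == original)
```

The backslash itself has to be escaped. Otherwise a literal backslash followed by the letter `t` in the data would be indistinguishable from an escaped tab.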

Another commonly used convention, borrowed from CSV, is to enclose a value that contains a tab or line terminator character in quotation marks.
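
As an illustration of the quoting convention, Python's standard `csv` module can be given a tab as its delimiter. With its default minimal quoting, it encloses fields that contain the delimiter or a line terminator character in double quotes. The sample rows below are made up for the sketch. Files written this way may not be readable by tools that expect tabs to appear only between fields.

```python
import csv
import io

# Made-up rows: one value contains a tab, another contains a line feed.
rows = [["id", "note"], ["1", "first\tsecond"], ["2", "line one\nline two"]]

# newline="" stops the text layer from translating line terminators.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer, delimiter="\t")  # default quoting is QUOTE_MINIMAL
writer.writerows(rows)
encoded = buffer.getvalue()
print(repr(encoded))

# Reading the text back with the same delimiter restores the original values.
decoded = list(csv.reader(io.StringIO(encoded, newline=""), delimiter="\t"))
print(decoded == rows)
```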

Line terminator

As with any text file, the character sequence used as the line terminator varies by platform. On Microsoft Windows, it is normally a carriage return followed by a line feed (CRLF). On a Unix-based system, it is a line feed (LF) alone. The de facto specification uses the term "EOL", which is as ambiguous as "line terminator" and "newline". Software is often designed either to handle the line terminator of the platform on which it runs or to accept either terminator.
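
A reader that accepts either terminator might look like the following sketch. The function name `split_records` is hypothetical, and only CRLF and LF are treated as terminators. The built-in `str.splitlines` is avoided here because it also splits on other characters, such as form feed, that may legitimately appear inside a value.

```python
import re


def split_records(text):
    """Split text into lines, accepting either CRLF or LF as the terminator."""
    lines = re.split(r"\r?\n", text)
    # A terminator after the last record leaves one trailing empty string.
    if lines and lines[-1] == "":
        lines.pop()
    return lines


windows_text = "a\tb\r\nc\td\r\n"
unix_text = "a\tb\nc\td\n"

print(split_records(windows_text))
print(split_records(windows_text) == split_records(unix_text))
```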