Comparison of data-serialization formats


This is a comparison of data-serialization formats: the various ways of converting complex objects into sequences of bits. It does not include markup languages used exclusively as document file formats.
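As a minimal illustration of what "converting complex objects to sequences of bits" means in practice, the sketch below round-trips a small nested object through JSON using only the Python standard library. The object and its field names are made up for the example; JSON is just one of the formats compared here, chosen because it needs no extra dependencies.

```python
import json

# An illustrative object (the field names are hypothetical).
record = {"name": "Bob Peterson", "height": 1.85, "married": True, "children": None}

# Serialize: object -> JSON text -> sequence of bytes.
encoded = json.dumps(record, sort_keys=True).encode("utf-8")

# Deserialize: bytes -> JSON text -> object.
decoded = json.loads(encoded.decode("utf-8"))

print(encoded)            # the serialized byte sequence
print(decoded == record)  # True: the round trip preserves the object
```

Other formats differ in how the byte sequence is laid out and in which object features survive the round trip, which is what the tables below compare.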

Overview

Name | Creator-maintainer | Based on | Standardized? | Specification | Binary? | Human-readable? | Supports references? | Schema-IDL? | Standard APIs | Supports zero-copy operations
Apache Arrow | Apache Software Foundation | | | | | | | | C, C++, C#, Go, Java, JavaScript, Julia, Matlab, Python, R, Ruby, Rust, Swift |
Apache Avro | Apache Software Foundation | | | | | | | | C, C#, C++, Java, PHP, Python, Ruby |
Apache Parquet | Apache Software Foundation | | | | | | | | Java, Python, C++ |
Apache Thrift | Facebook, Apache | | | | | | | | C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, OCaml, Delphi and other languages |
ASN.1 | ISO, IEC, ITU-T | | | ISO/IEC 8824 / ITU-T X.680 and ISO/IEC 8825 / ITU-T X.690 series. X.680, X.681, and X.683 define syntax and semantics. | | | | | |
Bencode | Bram Cohen, BitTorrent, Inc. | | | Part of the BitTorrent protocol specification | | | | | |
BSON | MongoDB | JSON | | | | | | | |
Cap'n Proto | Kenton Varda | | | | | | | | |
CBOR | Carsten Bormann, P. Hoffman | MessagePack | | RFC 8949 | | | through tagging | | |
Comma-separated values | RFC author: Yakov Shafranovich | | | RFC 4180 | | | | | |
Common Data Representation | Object Management Group | | | General Inter-ORB Protocol | | | | | Ada, C, C++, Java, Cobol, Lisp, Python, Ruby, Smalltalk |
D-Bus Message Protocol | freedesktop.org | | | | | | | | |
Efficient XML Interchange | W3C | XML, Efficient XML | | | | | | | |
Extensible Data Notation | Rich Hickey / Clojure community | Clojure | | | | | | | Clojure, Ruby, Go, C++, JavaScript, Java, CLR, ObjC, Python |
FlatBuffers | Google | | | | | | | | C++, Java, C#, Go, Python, Rust, JavaScript, PHP, C, Dart, Lua, TypeScript |
Fast Infoset | ISO, IEC, ITU-T | XML | | ITU-T X.891 and ISO/IEC 24824-1:2007 | | | | | |
FHIR | Health Level 7 | REST basics | | Fast Healthcare Interoperability Resources | | | | | Hapi for FHIR JSON, XML, Turtle |
Ion | Amazon | JSON | | | | | | | C, C#, Go, Java, JavaScript, Python, Rust |
Java serialization | Oracle Corporation | | | | | | | | |
JSON | Douglas Crockford | JavaScript syntax | | RFC 8259 | but see BSON, Smile, UBJSON | | JSON-LD | | |
MessagePack | Sadayuki Furuhashi | JSON | | | | | | | |
Netstrings | Dan Bernstein | | | | | | | | |
OGDL | Rolf Veen | | | | | | | | |
OPC-UA Binary | OPC Foundation | | | | | | | | |
OpenDDL | Eric Lengyel | C, PHP | | | | | | | |
PHP serialization format | PHP Group | | | | | | | | |
Pickle (Python) | Guido van Rossum | Python | | | | | | | |
Property list | NeXT, Apple | | | | | | | | |
Protocol Buffers | Google | | | | | | | | C++, Java, C#, Python, Go, Ruby, Objective-C, C, Dart, Perl, PHP, R, Rust, Scala, Swift, Julia, D, ActionScript, Delphi, Elixir, Elm, Erlang, GopherJS, Haskell, Haxe, JavaScript, Kotlin, Lua, Matlab, Mercury, OCaml, Prolog, Solidity, TypeScript, Vala, Visual Basic |
S-expressions | John McCarthy, Ron Rivest | Lisp, Netstrings | | Internet Draft | canonical representation | advanced transport representation | | | |
Smile | Tatu Saloranta | JSON | | | | | | | |
SOAP | W3C | XML | | | | | | | |
Structured Data eXchange Formats | Max Wildgrube | | | RFC 3072 | | | | | |
UBJSON | The Buzz Media, LLC | JSON, BSON | | | | | | | |
eXternal Data Representation | Sun Microsystems, IETF | | | RFC 4506 | | | | | |
XML | W3C | SGML | | | | | | | |
XML-RPC | Dave Winer | XML | | | | | | | |
YAML | Clark Evans, Ingy döt Net, and Oren Ben-Kiki | C, Java, Perl, Python, Ruby, Email, HTML, MIME, URI, XML, SAX, SOAP, JSON | | | | | | | |

Syntax comparison of human-readable formats

Format | Null | Boolean true | Boolean false | Integer | Floating-point | String | Array | Associative array/Object
ASN.1
truefalse6852306.8523015e+5
true

-42.1e7
A to Z
We said, "no".
An object :

true

1.85
Bob Peterson

A data mapping :


John
3.14


Jane
2.718


CSVnull
1
true
0
false
685230
-685230
6.8523015e+5
true,,-42.1e7,"A to Z"
42,1
A to Z,1,2,3
edn | nil | true | false | 685230 / -685230 | 6.8523015e+5 | "A to Z", "A \"up to\" Z" | |
Ion
null

null.null

null.bool

null.int

null.float

null.decimal

null.timestamp

null.string

null.symbol

null.blob

null.clob

null.struct

null.list

null.sexp
truefalse685230
-685230
0xA74AE
0b111010010101110
6.8523015e5"A to Z"

A
to
Z


Netstrings | 0:, / 4:null, | 1:1, / 4:true, | 1:0, / 5:false, | 6:685230, | 9:6.8523e+5, | | 29:4:true,0:,7:-42.1e7,6:A to Z,, |
JSON | null | true | false | 685230 / -685230 | 6.8523015e+5 | | |

OGDLnulltruefalse6852306.8523015e+5"A to Z"
'A to Z'
NoSpaces
true
null
-42.1e7
"A to Z"

42
true
"A to Z"
1
2
3

42
true
"A to Z",
OpenDDLref bool bool int32
int32
int32
float string
Homogeneous array:
int32 

Heterogeneous array:
array
dict
PHP serialization format | N; | b:1; | b:0; | i:685230; / i:-685230; | d:685230.15; / d:INF; / d:-INF; / d:NAN; | s:6:"A to Z"; | a:4: | Associative array: a:2: / Object: O:8:"stdClass":2:
Pickle (Python) | N. | I01\n. | I00\n. | I685230\n. | F685230.15\n. | S'A to Z'\n. | |
Property list (plain text format) | | <*BY> | <*BN> | <*I685230> | <*R6.8523015e+5> | "A to Z" | |
Property list
6852306.8523015e+5

-42.1e7
A to Z

42

A to Z

1
2
3

Protocol Bufferstruefalse685230
-685230
20.0855369
field1: "value1"
field1: "value2"
field1: "value3"

anotherfield 
anotherfield

thing1: "blahblah"
thing2: 18923743
thing3: -44
thing4
enumeratedThing: SomeEnumeratedValue
thing5: 123.456
: "etc"
: EnumValue
S-expressionsNIL
nil
T
#t
true
NIL
#f
false
6852306.8523015e+5abc
"abc"
#616263#
3:abc

|YWJj|
TOML | | true | false | 685230 / +685_230 / -685230 / 0x_0A_74_AE / 0b1010_0111_0100_1010_1110 | 6.8523015e+5 / 685.230_15e+03 / 685_230.15 / inf / -inf / nan | "A to Z" / 'A to Z' | | 42 = y / "A to Z" =
YAML | ~ / null / Null / NULL | y / Y / yes / Yes / YES / on / On / ON / true / True / TRUE | n / N / no / No / NO / off / Off / OFF / false / False / FALSE | 685230 / +685_230 / -685230 / 02472256 / 0x_0A_74_AE / 0b1010_0111_0100_1010_1110 / 190:20:30 | 6.8523015e+5 / 685.230_15e+03 / 685_230.15 / 190:20:30.15 / .inf / -.inf / .Inf / .INF / .NaN / .nan / .NAN | A to Z / "A to Z" / 'A to Z' | - y / - -42.1e7 / - A to Z | 42: y / A to Z:
XML and SOAPtruefalse6852306.8523015e+5
true

-42.1e7
A to Z

true





XML-RPC106852306.8523015e+5A to Z

1
-42.1e7
A to Z



42
1


A to Z



1
2
3





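Some of the literal forms listed above can be reproduced programmatically. The following is a minimal sketch, not a reference implementation: it prints the JSON spellings of a few scalar values using Python's standard `json` module, and builds the nested netstring array shown in the Netstrings row with a small helper. The helper name `netstring` is hypothetical and introduced only for this example.

```python
import json

def netstring(data: bytes) -> bytes:
    """Frame data as <ASCII decimal length>:<data>, (hypothetical helper)."""
    return str(len(data)).encode("ascii") + b":" + data + b","

# JSON spellings of null, booleans, integers and a string.
for value in [None, True, False, 685230, -685230, "A to Z"]:
    print(json.dumps(value))

# An array in netstring form is a netstring whose payload is the
# concatenation of the element netstrings.
items = [b"true", b"", b"-42.1e7", b"A to Z"]
array = netstring(b"".join(netstring(item) for item in items))
print(array.decode("ascii"))  # 29:4:true,0:,7:-42.1e7,6:A to Z,,
```

The outer length, 29, counts only the octets of the four framed elements, which is why it matches the value in the table.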
Comparison of binary formats

Format | Null | Booleans | Integer | Floating-point | String | Array | Associative array/object
ASN.1 | | | | | Multiple valid types | Data specifications | User definable type
BSON | \x0A | True: \x08\x01; False: \x08\x00 | int32: 32-bit little-endian 2's complement, or int64: 64-bit little-endian 2's complement | Double: little-endian binary64 | UTF-8-encoded, preceded by int32-encoded string length in bytes | BSON embedded document with numeric keys | BSON embedded document
Concise Binary Object Representation | \xf6 | | | | | |
Efficient XML Interchange (EXI) | xsi:nil is not allowed in binary context. | 1–2 bit integer interpreted as boolean. | Boolean sign, plus arbitrary length 7-bit octets, parsed until most-significant bit is 0, in little-endian. The schema can set the zero-point to any arbitrary number. Unsigned skips the boolean flag. | | Length prefixed integer-encoded Unicode. Integers may represent enumerations or string table entries instead. | Length prefixed set of items. |
FlatBuffers | Encoded as absence of field in parent object | | Little-endian 2's complement signed and unsigned 8/16/32/64 bits | | UTF-8-encoded, preceded by 32-bit integer length of string in bytes | Vectors of any other type, preceded by 32-bit integer length of number of elements | Tables or Vectors sorted by key
Ion | \x0f | | | | | \xbx Arbitrary length and overhead. Length in octets. |
MessagePack | \xc0 | | | Typecode + IEEE single/double | encoding is unspecified | |
Netstrings | | | | | Length-encoded as an ASCII string + ':' + data + ','. Length counts only octets between ':' and ',' | |
OGDL Binary | | | | | | |
Property list | | | | | | |
Protocol Buffers | | | | | UTF-8-encoded, preceded by varint-encoded integer length of string in bytes | Repeated value with the same tag or, for varint-encoded integers only, values packed contiguously and prefixed by tag and total byte length |
Smile | \x21 | | | IEEE single/double, BigDecimal | Length-prefixed "short" Strings, marker-terminated "long" Strings and back-references | Arbitrary-length heterogeneous arrays with end-marker | Arbitrary-length key/value pairs with end-marker
Structured Data eXchange Formats | | | Big-endian signed 24-bit or 32-bit integer | Big-endian IEEE double | Either UTF-8 or ISO 8859-1 encoded | List of elements with identical ID and size, preceded by array header with int16 length | Chunks can contain other chunks to arbitrary depth.
Thrift | | | | | | |
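To make a few of the binary layouts above concrete, the sketch below encodes values by hand with Python's standard `struct` module. It is an illustration under stated assumptions rather than a full encoder: the function names are hypothetical, element type bytes and field tags are omitted, and the trailing NUL byte in BSON strings (counted in the length prefix) follows the BSON specification rather than anything stated in the table.

```python
import struct

def bson_int32(n: int) -> bytes:
    # 32-bit little-endian two's complement.
    return struct.pack("<i", n)

def bson_double(x: float) -> bytes:
    # Little-endian IEEE 754 binary64.
    return struct.pack("<d", x)

def bson_string(s: str) -> bytes:
    # int32 byte length, then UTF-8 bytes, then a terminating NUL.
    # Assumption from the BSON spec: the length includes the NUL.
    raw = s.encode("utf-8") + b"\x00"
    return struct.pack("<i", len(raw)) + raw

def varint(n: int) -> bytes:
    # Base-128 varint as used by Protocol Buffers: 7 payload bits per
    # byte, least-significant group first, high bit set on all but the last.
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        low, n = n & 0x7F, n >> 7
        if n:
            out.append(low | 0x80)
        else:
            out.append(low)
            return bytes(out)

def protobuf_string_payload(s: str) -> bytes:
    # Varint length in bytes followed by the UTF-8 data (field tag omitted).
    raw = s.encode("utf-8")
    return varint(len(raw)) + raw

print(bson_int32(685230).hex())                   # ae740a00
print(bson_string("A to Z").hex())                # 07000000 followed by the text and 00
print(protobuf_string_payload("A to Z"))          # b'\x06A to Z'
```

The same integer, 685230, therefore occupies four bytes in BSON regardless of magnitude, while a Protocol Buffers varint grows with the value, which is the kind of trade-off the table summarizes.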