Comparison of data-serialization formats
This is a comparison of data-serialization formats, the various ways of converting complex objects into sequences of bits. It does not include markup languages used exclusively as document file formats.
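As a minimal illustration of what "converting complex objects to sequences of bits" means in practice, the following Python sketch serializes one object with two of the formats compared here (JSON and Pickle) and restores it. The `record` dictionary and its field names are invented for this example.

```python
import json
import pickle

# Hypothetical sample object; the keys are illustrative only.
record = {
    "name": "A to Z",
    "count": 685230,
    "ratio": 685230.15,
    "flags": [True, False, None],
}

# Text-based format: JSON text encoded as UTF-8 bytes.
json_bytes = json.dumps(record).encode("utf-8")

# Binary, Python-specific format: pickle.
pickle_bytes = pickle.dumps(record)

# Deserializing either byte sequence yields an equal object.
restored_from_json = json.loads(json_bytes.decode("utf-8"))
restored_from_pickle = pickle.loads(pickle_bytes)
```

Both byte sequences describe the same object, but only the JSON one is human-readable and portable across languages; pickle data should only be loaded from trusted sources.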
Overview
| Name | Creator-maintainer | Based on | Standardized? | Specification | Binary? | Human-readable? | Supports references? | Schema-IDL? | Standard APIs | Supports zero-copy operations |
| Apache Arrow | Apache Software Foundation | C, C++, C#, Go, Java, JavaScript, Julia, Matlab, Python, R, Ruby, Rust, Swift | ||||||||
| Apache Avro | Apache Software Foundation | C, C#, C++, Java, PHP, Python, Ruby | ||||||||
| Apache Parquet | Apache Software Foundation | Java, Python, C++ | ||||||||
| Apache Thrift | Facebook (creator), Apache (maintainer) | C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, OCaml, Delphi and other languages | ||||||||
| ASN.1 | ISO, IEC, ITU-T | ISO/IEC 8824 / ITU-T X.680 and ISO/IEC 8825 / ITU-T X.690 series. X.680, X.681, and X.683 define syntax and semantics. | ||||||||
| Bencode | Bram Cohen (creator), BitTorrent, Inc. (maintainer) | Part of | ||||||||
| BSON | MongoDB | JSON | ||||||||
| Cap'n Proto | Kenton Varda | |||||||||
| CBOR | Carsten Bormann, P. Hoffman | MessagePack | RFC 8949 | , through tagging | ||||||
| Comma-separated values | RFC author: Yakov Shafranovich | RFC 4180 | ||||||||
| Common Data Representation | Object Management Group | General Inter-ORB Protocol | Ada, C, C++, Java, Cobol, Lisp, Python, Ruby, Smalltalk | |||||||
| D-Bus Message Protocol | freedesktop.org | |||||||||
| Efficient XML Interchange | W3C | XML, Efficient XML | ||||||||
| Extensible Data Notation | Rich Hickey / Clojure community | Clojure | Clojure, Ruby, Go, C++, JavaScript, Java, CLR, ObjC, Python | |||||||
| FlatBuffers | C++, Java, C#, Go, Python, Rust, JavaScript, PHP, C, Dart, Lua, TypeScript | |||||||||
| Fast Infoset | ISO, IEC, ITU-T | XML | ITU-T X.891 and ISO/IEC 24824-1:2007 | |||||||
| FHIR | Health Level 7 | REST basics | Fast Healthcare Interoperability Resources | Hapi for FHIR JSON, XML, Turtle | ||||||
| Ion | Amazon | JSON | C, C#, Go, Java, JavaScript, Python, Rust | |||||||
| Java serialization | Oracle Corporation | |||||||||
| JSON | Douglas Crockford | JavaScript syntax | /RFC 8259 ,, | , but see BSON, Smile, UBJSON | , JSON-LD | |||||
| MessagePack | Sadayuki Furuhashi | JSON | ||||||||
| Netstrings | Dan Bernstein | |||||||||
| OGDL | Rolf Veen | |||||||||
| OPC-UA Binary | OPC Foundation | |||||||||
| OpenDDL | Eric Lengyel | C, PHP | ||||||||
| PHP serialization format | PHP Group | |||||||||
| Pickle (Python) | Guido van Rossum | Python | ||||||||
| Property list | NeXT (creator), Apple (maintainer) | ,,, | ||||||||
| Protocol Buffers | ,, and | C++, Java, C#, Python, Go, Ruby, Objective-C, C, Dart, Perl, PHP, R, Rust, Scala, Swift, Julia, D, ActionScript, Delphi, Elixir, Elm, Erlang, GopherJS, Haskell, Haxe, JavaScript, Kotlin, Lua, Matlab, Mercury, OCaml, Prolog, Solidity, TypeScript, Vala, Visual Basic | ||||||||
| S-expressions | John McCarthy, Ron Rivest | Lisp, Netstrings | Internet Draft | , canonical representation | , advanced transport representation | ||||||
| Smile | Tatu Saloranta | JSON | ||||||||
| SOAP | W3C | XML | ||||||||
| Structured Data eXchange Formats | Max Wildgrube | RFC 3072 | |||||||||
| UBJSON | The Buzz Media, LLC | JSON, BSON | ||||||||
| eXternal Data Representation | Sun Microsystems (creator), IETF (maintainer) | /RFC 4506 | ||||||||
| XML | W3C | SGML | ||||||||
| XML-RPC | Dave Winer | XML | ||||||||
| YAML | Clark Evans, Ingy döt Net, and Oren Ben-Kiki | C, Java, Perl, Python, Ruby, Email, HTML, MIME, URI, XML, SAX, SOAP, JSON | ||||||||
Syntax comparison of human-readable formats
| Format | Null | Boolean true | Boolean false | Integer | Floating-point | String | Array | Associative array/Object | |||||||
| ASN.1 | | | | | An object : A data mapping : | ||||||||||
| CSV | null | 1, true | 0, false | 685230, -685230 | 6.8523015e+5 | | true,,-42.1e7,"A to Z" | 42,1 | |||||||
| edn | nil | true | false | 685230, -685230 | 6.8523015e+5 | "A to Z", "A \"up to\" Z" | | | |||||||
| Ion | null, null.null, null.bool, null.int, null.float, null.decimal, null.timestamp, null.string, null.symbol, null.blob, null.clob, null.struct, null.list, null.sexp | true | false | 685230, -685230, 0xA74AE, 0b111010010101110 | 6.8523015e5 | "A to Z" | |||||||||
| Netstrings | 0:, 4:null, | 1:1, 4:true, | 1:0, 5:false, | 6:685230, | 9:6.8523e+5, | 29:4:true,0:,7:-42.1e7,6:A to Z,, | |||||||||
| JSON | null | true | false | 685230, -685230 | 6.8523015e+5 | ||||||||||
| OGDL | null | true | false | 685230 | 6.8523015e+5 | "A to Z", 'A to Z', NoSpaces | true | 42 42 | |||||||
| OpenDDL | ref | bool | bool | int32 int32 int32 | float | string | Homogeneous array: int32 Heterogeneous array: array | dict | |||||||
| PHP serialization format | N; | b:1; | b:0; | i:685230; i:-685230; | d:685230.15; d:INF; d:-INF; d:NAN; | s:6:"A to Z"; | a:4: | Associative array: a:2: Object: O:8:"stdClass":2: | |||||||
| Pickle (Python) | N. | I01\n. | I00\n. | I685230\n. | F685230.15\n. | S'A to Z'\n. | | | |||||||
| Property list (plain text format) | | <*BY> | <*BN> | <*I685230> | <*R6.8523015e+5> | "A to Z" | | | |||||||
| Property list (XML format) | | | | | |||||||||||
| Protocol Buffers | | true | false | 685230, -685230 | 20.0855369 | | field1: "value1" anotherfield | thing1: "blahblah" thing2: 18923743 thing3: -44 thing4 enumeratedThing: SomeEnumeratedValue thing5: 123.456 : "etc" : EnumValue | |||||||
| S-expressions | NIL, nil | T, #t, true | NIL, #f, false | 685230 | 6.8523015e+5 | abc, "abc", #616263#, 3:abc | | | |||||||
| TOML | | true | false | 685230, +685_230, -685230, 0x_0A_74_AE, 0b1010_0111_0100_1010_1110 | 6.8523015e+5, 685.230_15e+03, 685_230.15, inf, -inf, nan | "A to Z", 'A to Z' | | 42 = y | |||||||
| YAML | ~, null, Null, NULL | y, Y, yes, Yes, YES, on, On, ON, true, True, TRUE | n, N, no, No, NO, off, Off, OFF, false, False, FALSE | 685230, +685_230, -685230, 02472256, 0x_0A_74_AE, 0b1010_0111_0100_1010_1110, 190:20:30 | 6.8523015e+5, 685.230_15e+03, 685_230.15, 190:20:30.15, .inf, -.inf, .Inf, .INF, .NaN, .nan, .NAN | A to Z, "A to Z", 'A to Z' | - y | 42: y | |||||||
| XML and SOAP | | true | false | 685230 | 6.8523015e+5 | ||||||||||
| XML-RPC | | | | | |
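The Netstrings row above can be reproduced with a few lines of Python. This is a minimal sketch, assuming the usual netstring layout of an ASCII decimal length, a colon, the payload, and a trailing comma; the function names `encode_netstring` and `decode_netstring` are illustrative and not part of any standard library.

```python
from typing import Tuple


def encode_netstring(payload: bytes) -> bytes:
    """Wrap payload as <length>:<payload>, where length counts payload octets."""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def decode_netstring(data: bytes) -> Tuple[bytes, bytes]:
    """Decode one netstring from the front of data.

    Returns (payload, remaining_bytes) so that concatenated netstrings
    can be consumed one after another.
    """
    length_text, sep, rest = data.partition(b":")
    if not sep or not length_text.isdigit():
        raise ValueError("missing or malformed length prefix")
    length = int(length_text)
    if len(rest) < length + 1 or rest[length:length + 1] != b",":
        raise ValueError("payload truncated or missing trailing comma")
    return rest[:length], rest[length + 1:]


# The integer sample from the table.
integer_sample = encode_netstring(b"685230")

# The nested sample: four netstrings concatenated, then wrapped in an outer one.
items = [b"true", b"", b"-42.1e7", b"A to Z"]
array_sample = encode_netstring(b"".join(encode_netstring(item) for item in items))
```

Because the length prefix counts only the octets between the colon and the comma, the outer netstring in `array_sample` has length 29, matching the table entry.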
Comparison of binary formats
| Format | Null | Booleans | Integer | Floating-point | String | Array | Associative array/object |
| ASN.1 | type | : | : | : | Multiple valid types | Data specifications and | User definable type |
| BSON | \x0A | True: \x08\x01False: \x08\x00 | int32: 32-bit little-endian 2's complement or int64: 64-bit little-endian 2's complement | Double: little-endian binary64 | UTF-8-encoded, preceded by int32-encoded string length in bytes | BSON embedded document with numeric keys | BSON embedded document |
| Concise Binary Object Representation | \xf6 | ||||||
| Efficient XML Interchange (EXI) | xsi:nil is not allowed in binary context. | 1–2-bit integer interpreted as boolean. | Boolean sign, plus arbitrary-length 7-bit octets, parsed until most-significant bit is 0, in little-endian. The schema can set the zero-point to any arbitrary number. Unsigned skips the boolean flag. | Length-prefixed integer-encoded Unicode. Integers may represent enumerations or string table entries instead. | Length-prefixed set of items. | ||
| FlatBuffers | Encoded as absence of field in parent object | Little-endian 2's complement signed and unsigned 8/16/32/64 bits | UTF-8-encoded, preceded by 32-bit integer length of string in bytes | Vectors of any other type, preceded by 32-bit integer length of number of elements | Tables or Vectors sorted by key | ||
| Ion | \x0f | \xbx Arbitrary length and overhead. Length in octets. | |||||
| MessagePack | \xc0 | Typecode + IEEE single/double | encoding is unspecified | ||||
| Netstrings | Length-encoded as an ASCII string + ':' + data + ','. Length counts only octets between ':' and ','. | ||||||
| OGDL Binary | |||||||
| Property list | |||||||
| Protocol Buffers | UTF-8-encoded, preceded by varint-encoded integer length of string in bytes | Repeated value with the same tag or, for varint-encoded integers only, values packed contiguously and prefixed by tag and total byte length | |||||
| Smile | \x21 | IEEE single/double, BigDecimal | Length-prefixed "short" Strings, marker-terminated "long" Strings and back-references | Arbitrary-length heterogeneous arrays with end-marker | Arbitrary-length key/value pairs with end-marker | ||
| Structured Data eXchange Formats | Big-endian signed 24-bit or 32-bit integer | Big-endian IEEE double | Either UTF-8 or ISO 8859-1 encoded | List of elements with identical ID and size, preceded by array header with int16 length | Chunks can contain other chunks to arbitrary depth. | ||
| Thrift |