WebSocket


WebSocket is a computer communications protocol, providing a bidirectional communication channel over a single Transmission Control Protocol (TCP) connection. The WebSocket protocol was standardized by the IETF as RFC 6455 in 2011. The current specification allowing web applications to use this protocol is known as WebSockets. It is a living standard maintained by the WHATWG and a successor to The WebSocket API from the W3C.
WebSocket is distinct from HTTP, which is used to serve most webpages. Although the two protocols are different, RFC 6455 states that WebSocket "is designed to work over HTTP ports 443 and 80 as well as to support HTTP proxies and intermediaries", making the WebSocket protocol compatible with HTTP. To achieve compatibility, the WebSocket handshake uses the HTTP Upgrade header to change from the HTTP protocol to the WebSocket protocol.
The WebSocket protocol enables full-duplex interaction between a web browser and a web server with lower overhead than half-duplex alternatives such as HTTP polling, facilitating real-time data transfer from and to the server. This is achieved by providing a standardized way for the server to send content to the client without being first requested by the client, and allowing messages to be exchanged while keeping the connection open. In this way, a two-way ongoing conversation can take place between the client and the server. The communications are usually done over TCP port number 443, which is beneficial for environments that block non-web Internet connections using a firewall. Additionally, WebSocket enables streams of messages on top of TCP. TCP alone deals with streams of bytes with no inherent concept of a message. Similar two-way browser–server communications have been achieved in non-standardized ways using stopgap technologies such as Comet or Adobe Flash Player.
Most browsers support the protocol, including Google Chrome, Firefox, Microsoft Edge, Internet Explorer, Safari and Opera.
The WebSocket protocol specification defines ws and wss as two new uniform resource identifier schemes that are used for unencrypted and encrypted connections respectively. Apart from the scheme name and fragment, the rest of the URI components are defined to use URI generic syntax.
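As a rough illustration of how the two schemes map onto a connection, the following Python sketch splits a WebSocket URI with the standard urllib.parse module. The function name parse_websocket_uri is hypothetical; the default ports (80 for ws, 443 for wss) and the rule that fragments must not be used are taken from RFC 6455.

```python
from typing import Tuple
from urllib.parse import urlsplit

# Default ports per RFC 6455: ws uses 80, wss uses 443.
DEFAULT_PORTS = {"ws": 80, "wss": 443}


def parse_websocket_uri(uri: str) -> Tuple[bool, str, int, str]:
    """Return (secure, host, port, resource_name) for a ws:// or wss:// URI."""
    if "#" in uri:
        raise ValueError("WebSocket URIs must not contain a fragment")
    parts = urlsplit(uri)
    if parts.scheme not in DEFAULT_PORTS:
        raise ValueError(f"not a WebSocket URI scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("missing host")
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    # The resource name is the path (or "/" when empty) plus any query string.
    resource = parts.path or "/"
    if parts.query:
        resource += "?" + parts.query
    return parts.scheme == "wss", parts.hostname, port, resource
```

For example, parse_websocket_uri("wss://server.example.com/chat?room=1") yields an encrypted connection to server.example.com on port 443 with the resource name /chat?room=1.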

History

WebSocket was first referenced as TCPConnection in the HTML5 specification, as a placeholder for a TCP-based socket API. In June 2008, a series of discussions were led by Michael Carter that resulted in the first version of the protocol known as WebSocket.
Before WebSocket, port 80 full-duplex communication was attainable using Comet channels; however, Comet implementation is nontrivial, and due to the TCP handshake and HTTP header overhead, it is inefficient for small messages. The WebSocket protocol aims to solve these problems without compromising the security assumptions of the web.
The name "WebSocket" was coined by Ian Hickson and Michael Carter shortly thereafter through collaboration on the #whatwg IRC chat room, and subsequently authored for inclusion in the HTML5 specification by Ian Hickson. In December 2009, Google Chrome 4 was the first browser to ship full support for the standard, with WebSocket enabled by default. Development of the WebSocket protocol was subsequently moved from the W3C and WHATWG group to the IETF in February 2010, and authored for two revisions under Ian Hickson.
After the protocol was shipped and enabled by default in multiple browsers, RFC 6455 was finalized under Ian Fette in December 2011.
RFC 7692 introduced a compression extension to WebSocket using the DEFLATE algorithm on a per-message basis.

Web API

A web application may use the WebSocket interface to maintain bidirectional communications with a WebSocket server.

Client example

In TypeScript.

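// Connect to server (the URL is a placeholder)
const ws: WebSocket = new WebSocket("wss://server.example.com/chat");
// Receive ArrayBuffer instead of Blob
ws.binaryType = "arraybuffer";
// Set event listeners
ws.onopen = (): void => {
    console.log("Connection opened");
    ws.send("Hello");
};
ws.onmessage = (event: MessageEvent): void => {
    console.log("Message received", event.data);
};
ws.onclose = (event: CloseEvent): void => {
    console.log("Connection closed", event.code, event.reason, event.wasClean);
};
ws.onerror = (): void => {
    console.log("Connection closed due to error");
};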

WebSocket interface

Protocol

Steps:
  1. Opening handshake: HTTP request and HTTP response.
  2. Frame-based message exchange: data, ping and pong messages.
  3. Closing handshake: close message.

    Opening handshake

The client sends an HTTP request and the server returns an HTTP response with status code 101 on success. HTTP and WebSocket clients can connect to a server using the same port because the opening handshake uses HTTP. Sending additional HTTP headers is allowed. HTTP headers may be sent in any order. After the Switching Protocols HTTP response, the opening handshake is complete, the HTTP protocol stops being used, and communication switches to a binary frame-based protocol.
Sent in | Header | Value | Mandatory
--- | --- | --- | ---
Request | Origin | Varies | Yes (for browser clients)
Request | Host | Varies | Yes
Request | Sec-WebSocket-Version | 13 | Yes
Request | Sec-WebSocket-Key | base64-encode(16-byte random nonce) | Yes
Response | Sec-WebSocket-Accept | base64-encode(sha1(Sec-WebSocket-Key + "258EAFA5-E914-47DA-95CA-C5AB0DC85B11")) | Yes
Both | Connection | Upgrade | Yes
Both | Upgrade | websocket | Yes
Both | Sec-WebSocket-Protocol | The request may contain a comma-separated list of strings indicating application-level protocols the client wishes to use. If the client sends this header, the server response must be one of the values from the list. | No
Both | Sec-WebSocket-Extensions | Used to negotiate protocol-level extensions. The client may request extensions to the WebSocket protocol by including a comma-separated list of extensions. Each extension may have a parameter. The server may accept some or all extensions requested by the client. This field may appear multiple times in the request and must not appear more than once in the response. | No
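
The Sec-WebSocket-Protocol negotiation described above could be handled on the server roughly as follows. This is a minimal sketch, not a prescribed algorithm: the function name select_subprotocol is hypothetical, and picking the first client-listed protocol that the server supports is an assumed policy.

```python
from typing import Optional, Sequence


def select_subprotocol(header_value: Optional[str],
                       supported: Sequence[str]) -> Optional[str]:
    """Pick one subprotocol from the client's comma-separated list.

    Returns None when the client sent no header or nothing matches, in
    which case the response carries no Sec-WebSocket-Protocol header.
    """
    if header_value is None:
        return None
    # Split the comma-separated list and drop surrounding whitespace.
    requested = [token.strip() for token in header_value.split(",")]
    for name in requested:
        if name and name in supported:
            return name  # assumed policy: first client preference wins
    return None
```

With the example request in this section, select_subprotocol("chat, superchat", ["superchat", "chat"]) returns "chat", matching the example response.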

Example request:

GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Origin: http://example.com
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 13

Example response:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
Sec-WebSocket-Protocol: chat

The following Python code generates a random Sec-WebSocket-Key.

import base64
import os
# The key is the Base64 encoding of a 16-byte random nonce
print(base64.b64encode(os.urandom(16)))

The following Python code calculates Sec-WebSocket-Accept using Sec-WebSocket-Key from the example request above.

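import base64
import hashlib
KEY: bytes = b"dGhlIHNhbXBsZSBub25jZQ=="
MAGIC: bytes = b"258EAFA5-E914-47DA-95CA-C5AB0DC85B11"
# Base64 of the SHA-1 digest of the key concatenated with the fixed GUID
print(base64.b64encode(hashlib.sha1(KEY + MAGIC).digest()))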

Sec-WebSocket-Key and Sec-WebSocket-Accept are intended to prevent a caching proxy from re-sending a previous WebSocket conversation, and do not provide any authentication, privacy, or integrity.
Though some servers accept a short Sec-WebSocket-Key, many modern servers will reject the request with error "invalid Sec-WebSocket-Key header".
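A server-side check of this kind might look like the following Python sketch. It assumes that "valid" means the header value is well-formed Base64 that decodes to exactly 16 bytes, as RFC 6455 requires of the nonce; the function name is hypothetical and real servers may apply different or additional checks.

```python
import base64
import binascii


def is_valid_sec_websocket_key(value: str) -> bool:
    """Return True if value is strict Base64 that decodes to 16 bytes."""
    try:
        # validate=True rejects characters outside the Base64 alphabet;
        # wrong padding raises binascii.Error as well.
        nonce = base64.b64decode(value.strip(), validate=True)
    except (binascii.Error, ValueError):
        return False
    return len(nonce) == 16
```

The key from the example request passes this check, while a key that decodes to fewer than 16 bytes, or one with its padding removed, does not.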

Frame-based message

After the opening handshake, the client and server can, at any time, send data messages and control messages to each other. A message is composed of one frame if unfragmented or at least two frames if fragmented.
Fragmentation splits a message into two or more frames. It enables sending messages with initial data available but complete length unknown. Without fragmentation, the whole message must be sent in one frame, so the complete length is needed before the first byte can be sent, which requires a buffer. It was proposed to extend this feature to enable multiplexing several streams simultaneously, but the protocol extension was never accepted.
  • An unfragmented message consists of one frame with FIN = 1 and opcode ≠ 0.
  • A fragmented message consists of one frame with FIN = 0 and opcode ≠ 0, followed by zero or more frames with FIN = 0 and opcode = 0, and terminated by one frame with FIN = 1 and opcode = 0.
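
The two rules above can be sketched in Python as follows. Frames are modelled as plain (FIN, opcode, payload) tuples rather than wire bytes, and the helper names fragment and reassemble are illustrative, not part of any library.

```python
from typing import List, Tuple

Frame = Tuple[int, int, bytes]  # (FIN bit, opcode, payload chunk)


def fragment(opcode: int, payload: bytes, max_chunk: int) -> List[Frame]:
    """Split one data message into frames of at most max_chunk payload bytes."""
    if max_chunk < 1:
        raise ValueError("max_chunk must be positive")
    chunks = [payload[i:i + max_chunk] for i in range(0, len(payload), max_chunk)]
    if not chunks:
        chunks = [b""]  # an empty message is still sent as one frame
    last = len(chunks) - 1
    # First frame carries the real opcode, later frames use opcode 0;
    # only the final frame has FIN = 1.
    return [
        (1 if index == last else 0, opcode if index == 0 else 0, chunk)
        for index, chunk in enumerate(chunks)
    ]


def reassemble(frames: List[Frame]) -> Tuple[int, bytes]:
    """Check the FIN/opcode pattern and return (opcode, full payload)."""
    if not frames or frames[0][1] == 0:
        raise ValueError("a message must start with a non-continuation frame")
    if any(opcode != 0 for _, opcode, _ in frames[1:]):
        raise ValueError("later frames must use the continuation opcode 0")
    if any(fin for fin, _, _ in frames[:-1]) or not frames[-1][0]:
        raise ValueError("only the last frame may have FIN = 1")
    return frames[0][1], b"".join(chunk for _, _, chunk in frames)
```

For instance, fragment(1, b"hello world", 4) produces three frames with (FIN, opcode) pairs (0, 1), (0, 0) and (1, 0), and reassemble restores the original message.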

    Frame structure
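
As defined in RFC 6455, a frame starts with one byte holding the FIN bit, three reserved bits and a 4-bit opcode, followed by one byte holding the MASK bit and a 7-bit payload length. The length values 126 and 127 signal that the actual length follows as a 16-bit or 64-bit big-endian integer. A 4-byte masking key comes next if MASK is set, and then the payload. The following Python sketch encodes an unmasked (server-to-client) frame under those rules; encode_server_frame is an illustrative name, and the reserved bits are assumed to be zero (no extensions negotiated).

```python
import struct


def encode_server_frame(opcode: int, payload: bytes, fin: bool = True) -> bytes:
    """Encode one unmasked frame: header followed by the raw payload."""
    if not 0 <= opcode <= 0xF:
        raise ValueError("opcode must fit in 4 bits")
    # Byte 0: FIN in the top bit, RSV1-3 left at zero, opcode in the low 4 bits.
    byte0 = (0x80 if fin else 0x00) | opcode
    size = len(payload)
    if size <= 125:
        # Length fits directly in the 7-bit field (MASK bit is 0).
        header = struct.pack(">BB", byte0, size)
    elif size <= 0xFFFF:
        # 126 means a 16-bit big-endian length follows.
        header = struct.pack(">BBH", byte0, 126, size)
    else:
        # 127 means a 64-bit big-endian length follows.
        header = struct.pack(">BBQ", byte0, 127, size)
    return header + payload
```

Encoding the text message "Hello" with encode_server_frame(0x1, b"Hello") gives the bytes 81 05 48 65 6c 6c 6f, which matches the single-frame unmasked text example in RFC 6455.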

Opcodes
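
The opcodes defined in RFC 6455 can be written down as a small Python enumeration. The class and function names below are illustrative; the numeric values and the reserved ranges (0x3 to 0x7 for future data frames, 0xB to 0xF for future control frames) come from the RFC.

```python
from enum import IntEnum


class Opcode(IntEnum):
    CONTINUATION = 0x0  # continues a fragmented message
    TEXT = 0x1          # UTF-8 text data
    BINARY = 0x2        # arbitrary binary data
    CLOSE = 0x8         # closing handshake
    PING = 0x9
    PONG = 0xA


def is_control(opcode: int) -> bool:
    """Control opcodes have the most significant of the 4 opcode bits set."""
    return 0x8 <= opcode <= 0xF


def classify(opcode: int) -> str:
    """Return the opcode name, or 'reserved' for values not yet assigned."""
    try:
        return Opcode(opcode).name
    except ValueError:
        return "reserved"
```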

Client-to-server masking

A client must mask all frames sent to the server. A server must not mask any frames sent to the client. Frame masking applies XOR between the payload and the masking key. The following pseudocode describes the algorithm used to both mask and unmask a frame.
for i from 0 to payload_length − 1
    payload[i] := payload[i] xor masking_key[i modulo 4]
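
The same algorithm can be expressed as a short Python function. Because XOR is its own inverse, one function serves for both masking and unmasking. The name apply_mask is illustrative.

```python
def apply_mask(payload: bytes, masking_key: bytes) -> bytes:
    """XOR each payload byte with the masking key, cycling through its 4 bytes."""
    if len(masking_key) != 4:
        raise ValueError("the masking key must be exactly 4 bytes")
    return bytes(byte ^ masking_key[i % 4] for i, byte in enumerate(payload))
```

Using the masking key 37 fa 21 3d from the masked "Hello" example in RFC 6455, apply_mask(b"Hello", bytes.fromhex("37fa213d")) yields 7f 9f 4d 51 58, and applying the function again with the same key returns b"Hello".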

Status codes
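
A close frame may carry a 2-byte big-endian status code followed by a UTF-8 reason. The following Python sketch lists the status codes defined in RFC 6455 and shows one way to build and parse that payload; the names CloseCode, build_close_payload and parse_close_payload are illustrative. The codes 1005, 1006 and 1015 are reserved for local reporting and must not be sent in a close frame.

```python
import struct
from enum import IntEnum
from typing import Tuple


class CloseCode(IntEnum):
    NORMAL_CLOSURE = 1000
    GOING_AWAY = 1001
    PROTOCOL_ERROR = 1002
    UNSUPPORTED_DATA = 1003
    NO_STATUS_RECEIVED = 1005    # reserved, never sent on the wire
    ABNORMAL_CLOSURE = 1006      # reserved, never sent on the wire
    INVALID_PAYLOAD_DATA = 1007
    POLICY_VIOLATION = 1008
    MESSAGE_TOO_BIG = 1009
    MANDATORY_EXTENSION = 1010
    INTERNAL_ERROR = 1011
    TLS_HANDSHAKE = 1015         # reserved, never sent on the wire


NOT_SENDABLE = {1005, 1006, 1015}


def build_close_payload(code: int, reason: str = "") -> bytes:
    """Return the close frame payload: 2-byte code plus UTF-8 reason."""
    if code in NOT_SENDABLE:
        raise ValueError(f"status code {code} must not be sent in a close frame")
    data = struct.pack(">H", code) + reason.encode("utf-8")
    if len(data) > 125:
        raise ValueError("control frame payloads are limited to 125 bytes")
    return data


def parse_close_payload(payload: bytes) -> Tuple[int, str]:
    """Return (code, reason); an empty payload is reported as 1005."""
    if not payload:
        return CloseCode.NO_STATUS_RECEIVED, ""
    if len(payload) < 2:
        raise ValueError("a close payload must be empty or at least 2 bytes")
    (code,) = struct.unpack(">H", payload[:2])
    return code, payload[2:].decode("utf-8")
```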

Server implementation example

In Python.
Note: recv returns at most the number of bytes requested, so it may return fewer. For readability, the code ignores that, and thus it may fail in non-ideal network conditions.

import base64
import hashlib
import struct
from typing import Optional
from socket import socket as Socket


def handle_websocket_connection(ws: Socket) -> None:
    # Accept connection
    conn, addr = ws.accept()

    # Receive and parse HTTP request
    key: Optional[bytes] = None
    for line in conn.recv(4096).split(b"\r\n"):
        if line.startswith(b"Sec-WebSocket-Key"):
            key = line.split(b" ")[-1]
    if key is None:
        raise ValueError("Sec-WebSocket-Key not found")

    # Send HTTP response
    sec_accept = base64.b64encode(
        hashlib.sha1(key + b"258EAFA5-E914-47DA-95CA-C5AB0DC85B11").digest()
    )
    conn.sendall(
        b"HTTP/1.1 101 Switching Protocols\r\n"
        b"Upgrade: websocket\r\n"
        b"Connection: Upgrade\r\n"
        b"Sec-WebSocket-Accept: " + sec_accept + b"\r\n"
        b"\r\n"
    )

    # Decode and print frames
    while True:
        byte0, byte1 = conn.recv(2)
        fin: int = byte0 >> 7
        opcode: int = byte0 & 0b1111
        masked: int = byte1 >> 7
        assert masked, "The client must mask all frames"
        if opcode >= 8:
            assert fin, "Control frames are unfragmentable"

        # Payload size
        payload_size: int = byte1 & 0b111_1111
        if payload_size == 126:
            # 16-bit big-endian extended length
            payload_size, = struct.unpack(">H", conn.recv(2))
            assert payload_size > 125, "The minimum number of bits must be used"
        elif payload_size == 127:
            # 64-bit big-endian extended length
            payload_size, = struct.unpack(">Q", conn.recv(8))
            assert payload_size > 2**16-1, "The minimum number of bits must be used"
            assert payload_size <= 2**63-1, "The most significant bit must be zero"
        if opcode >= 8:
            assert payload_size <= 125, "Control frames must have up to 125 bytes"

        # Unmask
        masking_key: bytes = conn.recv(4)
        payload: bytearray = bytearray(conn.recv(payload_size))
        for i in range(payload_size):
            payload[i] = payload[i] ^ masking_key[i % 4]
        print("Received frame", fin, opcode, payload)


if __name__ == "__main__":
    # Accept TCP connection on any interface at port 80
    ws: Socket = Socket()
    ws.bind(("", 80))
    ws.listen()

    handle_websocket_connection(ws)
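
The limitation mentioned in the note (recv may return fewer bytes than requested) is commonly addressed with a small loop that keeps reading until the requested amount has arrived. The following is a minimal sketch of such a helper; the name recv_exact is hypothetical and it is not part of the socket module.

```python
import socket


def recv_exact(conn: socket.socket, size: int) -> bytes:
    """Read exactly size bytes from conn, or raise if the peer closes early."""
    buffer = bytearray()
    while len(buffer) < size:
        # recv may return fewer bytes than requested, so ask for the remainder.
        chunk = conn.recv(size - len(buffer))
        if not chunk:
            # An empty result means the connection was closed by the peer.
            raise ConnectionError("connection closed before all bytes arrived")
        buffer += chunk
    return bytes(buffer)
```

In the example server, each call such as conn.recv(2) or conn.recv(payload_size) could be replaced by recv_exact(conn, 2) or recv_exact(conn, payload_size) to make frame parsing robust against partial reads.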