X Window System core protocol

The X Window System core protocol is the base protocol of the X Window System, which is a networked windowing system for bitmap displays used to build graphical user interfaces on Unix, Unix-like, and other operating systems. The X Window System is based on a client–server model: a single server controls the input/output hardware, such as the screen, the keyboard, and the mouse; all application programs act as clients, interacting with the user and with the other clients via the server. This interaction is regulated by the X Window System core protocol. Other protocols related to the X Window System exist, both built at the top of the X Window System core protocol or as separate protocols.
In the X Window System core protocol, only four kinds of packets are sent, asynchronously, over the network: requests, replies, events, and errors. Requests are sent by a client to the server to ask it to perform some operation and to send back data it holds. Replies are sent by the server to provide such data. Events are sent by the server to notify clients of user activity or other occurrences they are interested in. Errors are packets sent by the server to notify a client of errors occurred during processing of its requests. Requests may generate replies, events, and errors; other than this, the protocol does not mandate a specific order in which packets are sent over the network. Some extensions to the core protocol exist, each one having its own requests, replies, events, and errors.
X originated at MIT in 1984. Its designers Bob Scheifler and Jim Gettys set as an early principle that its core protocol was to "create mechanism, not policy". As a result, the core protocol does not specify the interaction between clients and between a client and the user. These interactions are the subject of separate specifications, such as the ICCCM and the freedesktop.org specifications, and are typically enforced automatically by using a given widget set.

Overview

Communication between server and clients is done by exchanging packets over a channel. The connection is established by the client. The client also sends the first packet, containing the byte order to be used and information about the version of the protocol and the kind of authentication the client expects the server to use. The server answers by sending back a packet stating the acceptance or refusal of the connection, or with a request for a further authentication. If the connection is accepted, the acceptance packet contains data for the client to use in the subsequent interaction with the server.
Image:Xcore-overview.svg|thumb|200px|An example interaction between a client and a server.
After connection is established, four types of packets are exchanged between client and server over the channel:

Request: The client requests information from the server or requests it to perform an action.
Reply: The server responds to a request. Not all requests generate replies.
Event: The server informs the client of an event, such as keyboard or mouse input, a window being moved, resized or exposed, etc.
Error: The server sends an error packet if a request is invalid. Since requests are queued, error packets generated by a request may not be sent immediately.

Request and reply packets have varying length, while event and error packets have a fixed length of 32 bytes.
Request packets are numbered sequentially by the server as soon as it receives them: the first request from a client is numbered 1, the second 2, etc. The least significant 16 bits of the sequential number of a request is included in the reply and error packets generated by the request, if any. They are also included in event packets to indicate the sequential number of the request that the server is currently processing or has just finished processing.

Windows

What is usually called a window in most graphical user interfaces is called a top-level window in the X Window System. The term window is also used to denote windows that lie within another window, that is, the subwindows of a parent window. Graphical elements such as buttons, menus, icons, etc. can be realized using subwindows.
Image:Some X windows.svg|frame|A possible placement of some windows: 1 is the root window, which covers the whole screen; 2 and 3 are top-level windows; 4 and 5 are subwindows of 2. The parts of a window that are outside its parent are not visible.
A client can request the creation of a window. More precisely, it can request the creation of a subwindow of an existing window. As a result, the windows created by clients are arranged in a tree. The root of this tree is the root window, which is a special window created automatically by the server at startup. All other windows are directly or indirectly subwindows of the root window. The top-level windows are the direct subwindows of the root window. Visibly, the root window is as large as the virtual desktop, and lies behind all other windows.
The content of a window is not always guaranteed to be preserved over time. In particular, the window content may be destroyed when the window is moved, resized, covered by other windows, and in general made totally or partly non-visible. In particular, content is lost if the X server is not maintaining a backing store of the window content. The client can request backing store for a window to be maintained, but there is no obligation for the server to do so. Therefore, clients cannot assume that backing store is maintained. If a visible part of a window has an unspecified content, an event is sent to notify the client that the window content has to be drawn again.
Every window has an associated set of attributes, such as the geometry of the window, the background image, whether backing store has been requested for it, etc. The protocol includes requests for a client to inspect and change the attributes of a window.
Windows can be InputOutput or InputOnly. InputOutput windows can be shown on the screen and are used for drawing. InputOnly windows are never shown on the screen and are used only to receive input.
Image:Xframe.png|left|thumb|350px|Anatomy of a FVWM window. The white area is the window as created and seen by the client application.
The decorative frame and title bar that is usually seen around windows are created by the window manager, not by the client that creates the window. The window manager also handles input related to these elements, such as resizing the window when the user clicks and drags the window frame. Clients usually operate on the window they created disregarding the changes operated by the window manager. A change it has to take into account is that re-parenting window managers, which almost all modern window managers are, change the parent of top-level windows to a window that is not the root. From the point of view of the core protocol, the window manager is a client, not different from the other applications.
Data about a window can be obtained by running the xwininfo program. Passing it the -tree command-line argument, this program shows the tree of subwindows of a window, along with their identifiers and geometry data.

Pixmaps and drawables

A pixmap is a region of memory that can be used for drawing. Unlike windows, pixmaps are not automatically shown on the screen. However, the content of a pixmap can be transferred to a window and vice versa. This allows for techniques such as double buffering. Most of the graphical operations that can be done on windows can also be done on pixmaps.
Windows and pixmaps are collectively named drawables, and their content data resides on the server. A client can however request the content of a drawable to be transferred from the server to the client or vice versa.

Graphic contexts and fonts

The client can request a number of graphic operations, such as clearing an area, copying an area into another, drawing points, lines, rectangles, and text. Beside clearing, all operations are possible on all drawables, both windows and pixmaps.
Most requests for graphic operations include a graphic context, which is a structure that contains the parameters of the graphic operations. A graphic context includes the foreground color, the background color, the font of text, and other graphic parameters. When requesting a graphic operation, the client includes a graphic context. Not all parameters of the graphic context affect the operation: for example, the font does not affect drawing a line.
The core protocol specifies the use of server-side fonts. Such fonts are stored as files, and the server accesses them either directly via the local filesystem or via the network from another program called font server. Clients can request the list of fonts available to the server and can request a font to be loaded or unloaded by the server. A client can request general information about a font and the space a specific string takes when drawn with a specific font.
Image:Xfontsel.png|thumb|450px|The xfontsel program allows the user to view the glyphs of a font.
The names of the fonts are arbitrary strings at the level of the X core protocol. The X logical font description conventions specify how fonts should be named according to their attributes. These conventions also specify the values of optional properties that can be attached to fonts.
The xlsfonts program prints the list of fonts stored in the server. The xfontsel program shows the glyphs of fonts, and allows the user to select the name of a font for pasting it in another window.
The use of server-side fonts is currently considered deprecated in favour of client-side fonts. Such fonts are rendered by the client, not by the server, with the support of the Xft or cairo libraries and the XRender extension. No specification on client-side fonts is given in the core protocol.