X Window System protocols and architecture
2007 Schools Wikipedia Selection.
In computing, the X Window System (commonly X11 or X) is a network-transparent windowing system for bitmap displays. This article details the protocols and technical structure of X11.
The X client-server model and network transparency
X is based on a client-server model. An X server program runs on a computer with a graphical display and communicates with various client programs. The server accepts requests for graphical output (windows) and sends back user input (keyboard, mouse).
In X, the server runs on the user's computer, while the clients may run on a different machine. This is the reverse of the common client-server configuration, in which the client runs on the user's computer and the server runs on a remote computer, and the reversal often confuses new X users. X terminology takes the perspective of the program rather than that of the end user or the hardware: the remote programs connect to the X server running on the local machine, and thus act as clients; the local X server accepts incoming connections, and thus acts as a server.
The communication protocol between server and client runs network-transparently: the client and server may run on the same machine or on different ones, possibly with different architectures and operating systems. A client and server can communicate securely over the Internet by tunneling the X connection through an encrypted channel.
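As a rough illustration of network transparency, the sketch below parses a display name of the conventional form host:display.screen and derives the TCP port a client would use. The naming convention and the 6000 + display port rule are not described in this article; they are stated here as assumptions about common X11 practice, and the function names and host name are hypothetical.

```python
def parse_display(name):
    """Split a display name such as 'host:10.0' into (host, display, screen)."""
    host, sep, rest = name.rpartition(":")
    if not sep:
        raise ValueError(f"no display number in {name!r}")
    display, _, screen = rest.partition(".")
    # The screen number is optional and defaults to 0.
    return host, int(display), int(screen) if screen else 0


def tcp_port(display):
    """TCP port conventionally used for a display number (assumed: 6000 + n)."""
    return 6000 + display


# An empty host means a local connection; a named host means a remote server.
local = parse_display(":0")
remote = parse_display("workstation.example.org:1.0")
```

The same client code works in both cases; only the display name changes, which is the practical meaning of network transparency.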
Design principles
Bob Scheifler and Jim Gettys set out the early principles of X as follows (as listed in Scheifler/Gettys 1996):
- Do not add new functionality unless an implementor cannot complete a real application without it.
- It is as important to decide what a system is not as to decide what it is. Do not serve all the world's needs; rather, make the system extensible so that additional needs can be met in an upwardly compatible fashion.
- The only thing worse than generalizing from one example is generalizing from no examples at all.
- If a problem is not completely understood, it is probably best to provide no solution at all.
- If you can get 90 percent of the desired effect for 10 percent of the work, use the simpler solution. (See also Worse is better.)
- Isolate complexity as much as possible.
- Provide mechanism rather than policy. In particular, place user interface policy in the clients' hands.
The first principle was modified during the design of X11 to: "Do not add new functionality unless you know of some real application that will require it." X has largely kept to these principles since. The reference implementation is developed with a view to extension and improvement of the implementation, whilst remaining almost entirely compatible with the original 1987 protocol.
X Window core protocol
Communication between server and clients is done by exchanging packets over a network channel. The connection is established by the client, which sends the first packet. The server answers by sending back a packet stating the acceptance or refusal of the connection, or a request for further authentication. If the connection is accepted, the acceptance packet contains data for the client to use in its subsequent interaction with the server.
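The first packet sent by the client can be sketched as follows. The byte layout (a byte-order marker, protocol version 11.0, and the lengths of an optional authorization name and data, each padded to a multiple of four bytes) follows the X11 protocol specification as best recalled here and is not spelled out in this article; the helper names are hypothetical.

```python
import struct

PROTOCOL_MAJOR, PROTOCOL_MINOR = 11, 0
# First byte of the server's answer to the setup request.
SETUP_STATUS = {0: "refused", 1: "accepted", 2: "further authentication required"}


def pad4(data):
    """Pad a byte string with zeros to a multiple of four bytes."""
    return data + b"\x00" * (-len(data) % 4)


def connection_setup_request(auth_name=b"", auth_data=b"", little_endian=True):
    """Build the initial client packet: 12-byte header plus padded auth fields."""
    order = b"l" if little_endian else b"B"
    layout = ("<" if little_endian else ">") + "cxHHHHxx"
    header = struct.pack(layout, order, PROTOCOL_MAJOR, PROTOCOL_MINOR,
                         len(auth_name), len(auth_data))
    return header + pad4(auth_name) + pad4(auth_data)


def setup_status(reply):
    """Interpret the first byte of the server's answer."""
    return SETUP_STATUS[reply[0]]
```

The sketch only builds and interprets bytes; it does not open a socket or read the data carried in an acceptance packet.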
After the connection is established, four types of packets are exchanged by the client and the server over the channel:
- Request: The client requests information from the server or requests it to perform an action.
- Reply: The server responds to a request. Not all requests generate replies.
- Event: The server sends an event to the client, e.g., keyboard or mouse input, or a window being moved, resized or exposed.
- Error: The server sends an error packet if a request is invalid. Since requests are queued, error packets generated by a request may not be sent immediately.
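The three packet types that travel from server to client can be told apart by their first byte. The sketch below assumes the core protocol encoding (0 for an error, 1 for a reply, any other value for an event, with replies carrying a length field counted in 4-byte units after a fixed 32 bytes) and little-endian byte order; none of these details appear in this article, and the function names are hypothetical.

```python
import struct


def classify(packet):
    """Classify a server-to-client packet by its first byte."""
    code = packet[0]
    if code == 0:
        return "error"
    if code == 1:
        return "reply"
    return "event"


def reply_size(packet):
    """Total size of a reply: 32 fixed bytes plus `length` 4-byte units."""
    (extra_units,) = struct.unpack_from("<I", packet, 4)
    return 32 + 4 * extra_units


# A reply header announcing 2 extra units, an Expose event (code 12), an error.
reply = bytes([1, 0]) + struct.pack("<HI", 7, 2) + bytes(24)
event = bytes([12]) + bytes(31)
error = bytes(32)
```

Requests travel in the opposite direction and are therefore not covered by this classifier.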
The X server provides a set of basic services. Client programs implement more complex functionality by interacting with the server.
Windows
What is usually called a window in other graphical user interfaces is a top-level window in the X Window System. The term window is also used for windows that lie within another window, that is, the subwindows of a parent window. Graphical elements such as buttons, menus, icons, etc. are all realized using windows.
A window can only be created as a subwindow of a parent window. As a result, windows are arranged in a tree, that is, a hierarchy. The root of this hierarchy is called the root window, which is automatically created by the server. The top-level windows are exactly the direct subwindows of the root window. Visibly, the root window is as large as the screen, and lies behind all other windows.
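The window hierarchy can be modelled with a small tree structure. This is only an illustrative sketch: the Window class and its method names are hypothetical and do not correspond to any X library.

```python
class Window:
    """A node in the window tree; every window except the root has a parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def is_top_level(self):
        """Top-level windows are exactly the direct children of the root."""
        return self.parent is not None and self.parent.parent is None

    def depth(self):
        """Number of ancestors between this window and the root."""
        return 0 if self.parent is None else 1 + self.parent.depth()


root = Window("root")                 # created automatically by the server
editor = Window("editor", root)       # a top-level window
button = Window("ok-button", editor)  # a subwindow used as a button
```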
Identifiers
All data about windows, fonts, etc. is stored in the server. The client knows identifiers of these objects—integers it can use as names for them when interacting with the server. For example, if a client wishes a window to be created, it requests the server to create a window with a given identifier. The server creates a window and associates it with the identifier. The identifier can be later used by the client to request, for example, a string to be drawn in the window.
Identifiers are unique to the server, not only to the client; for example, no two windows have the same identifier, even if created by two different clients. A client can access any object given its identifier, even if the object has been created by another client.
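The following sketch models server-wide identifiers. It assumes each client is handed a disjoint identifier base when its connection is accepted, which is how the core protocol keeps client-chosen identifiers from colliding; the class names, base values, and drawing method are hypothetical.

```python
class Server:
    """Holds all resources, keyed by identifiers that are unique server-wide."""

    def __init__(self):
        self.resources = {}

    def create_window(self, window_id):
        if window_id in self.resources:
            raise ValueError("identifier already in use")
        self.resources[window_id] = {"kind": "window", "text": []}

    def draw_string(self, window_id, text):
        # Any client may refer to any object once it knows the identifier.
        self.resources[window_id]["text"].append(text)


class Client:
    """Chooses identifiers from its own range and asks the server to use them."""

    def __init__(self, server, id_base):
        self.server = server
        self.id_base = id_base  # assumed to be assigned at connection setup
        self.count = 0

    def new_id(self):
        self.count += 1
        return self.id_base | self.count


server = Server()
a = Client(server, 0x00200000)
b = Client(server, 0x00400000)
win = a.new_id()
server.create_window(win)
b.server.draw_string(win, "drawn by another client")
```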
Attributes and properties
Every window has a predefined set of attributes and a set of properties, all stored in the server and accessible to the clients via appropriate requests. Attributes are data about the window, such as its size, position, background colour, etc. Properties are pieces of data that are attached to a window. Unlike attributes, properties have no meaning at the level of the X Window core protocol. A client can store arbitrary data in a property of a window.
A property is characterized by a name, a type, and a value. Properties are similar to variables in imperative programming languages, in that the application can create a new property with a given name and of a given type and store a value in it. Properties are associated with windows: two properties with the same name can exist on two different windows while having different types and values.
Properties are mostly used for inter-client communication. For example, the property named WM_NAME is used for storing the name for the window; window managers typically read this property and display the name of the window at the top of it.
The properties of a window can be shown using the xprop program. In particular, xprop -root shows the properties of the root window, which include the X resources (parameters of programs).
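Properties can be pictured as a per-window table of (name, type, value) entries kept by the server. The sketch below is a toy model with hypothetical class and method names; the window identifiers and the second property's type are made up for illustration.

```python
class PropertyStore:
    """Server-side storage: each (window, name) pair holds a type and a value."""

    def __init__(self):
        self._props = {}

    def change(self, window, name, type_, value):
        self._props[(window, name)] = (type_, value)

    def get(self, window, name):
        return self._props.get((window, name))

    def list_names(self, window):
        return sorted(n for (w, n) in self._props if w == window)


store = PropertyStore()
# One client sets the window name; a window manager could later read it.
store.change(0x200001, "WM_NAME", "STRING", "Text editor")
# The same property name on another window may have a different type and value.
store.change(0x400001, "WM_NAME", "COMPOUND_TEXT", "Terminal")
title = store.get(0x200001, "WM_NAME")
```

The server attaches no meaning to these entries; the interpretation of WM_NAME is a convention between clients.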
Events
Events are packets sent by the server to the client to communicate that something the client may be interested in has happened. A client can request the server to send an event to another client; this is used for communication between clients. For example, when a client requests the text that is currently selected, an event is sent to the client that is currently handling the window that holds the selection.
The content of a window may be destroyed under some conditions (for example, if the window is covered). Whenever an area of destroyed content is made visible, the server generates an Expose event to notify the client that a part of the window has to be drawn.
Other events are used to notify clients of keyboard or mouse input, of the creation of new windows, etc.
Some kinds of events are always sent to the client, but most kinds of events are sent only if the client has previously stated an interest in them, because a client may be interested in only some kinds of events. For example, a client may be interested in keyboard-related events but not in mouse-related events.
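Interest in events is expressed as a bit mask. The sketch below uses three mask bits with the values given in the core protocol (as an assumption, since the article does not list them) and treats MappingNotify as an example of an event that is delivered regardless of the mask; the function and table names are hypothetical.

```python
KEY_PRESS_MASK = 1 << 0
BUTTON_PRESS_MASK = 1 << 2
EXPOSURE_MASK = 1 << 15

# Which mask bit a client must select to receive each kind of event.
MASK_FOR_EVENT = {
    "KeyPress": KEY_PRESS_MASK,
    "ButtonPress": BUTTON_PRESS_MASK,
    "Expose": EXPOSURE_MASK,
}

# Events sent whether or not the client asked for them (illustrative subset).
ALWAYS_DELIVERED = {"MappingNotify"}


def should_deliver(event, selected_mask):
    """Decide whether the server sends `event` to a client with this mask."""
    if event in ALWAYS_DELIVERED:
        return True
    return bool(MASK_FOR_EVENT[event] & selected_mask)


# A client interested in keyboard input and redraw notifications only.
mask = KEY_PRESS_MASK | EXPOSURE_MASK
```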
Colour modes
The way colours are handled in the X Window System sometimes confuses users, and historically several different modes have been supported. Most modern applications use TrueColor (24-bit colour, 8 bits for each of red, green and blue), but old or specialist applications may require a different colour mode. Many commercial specialist applications use PseudoColor.
The X11 protocol uses a single 32-bit unsigned integer, called a pixel value, to represent a single colour in most graphic operations. When the intensities of the primary colours are transferred, a 16-bit integer is used for each colour component. The following representations of colours exist; not all of them may be supported on a specific device.
- DirectColor: A pixel value is decomposed into separate red, green, and blue subfields. Each subfield indexes a separate colormap. Entries in all colormaps can be changed.
- TrueColor: Same as DirectColor, except that the colormap entries are predefined by the hardware and cannot be changed. Typically, each of the red, green, and blue colormaps provides a (near) linear ramp of intensity.
- GrayScale: A pixel value indexes a single colormap that contains monochrome intensities. Colormap entries can be changed.
- StaticGray: Same as GrayScale, except that the colormap entries are predefined by the hardware and cannot be changed.
- PseudoColor (Chunky): A pixel value indexes a single colormap that contains colour intensities. Colormap entries can be changed.
- StaticColor: Same as PseudoColor, except that the colormap entries are predefined by the hardware and cannot be changed.
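The difference between decomposed and indexed pixel values can be sketched as follows. The masks assume a common 24-bit TrueColor layout with red in the high byte, which is only one possible arrangement, and the four-entry colormap is invented for illustration; all function names are hypothetical.

```python
def subfield(pixel, mask):
    """Extract the bits selected by `mask`, shifted down to start at bit 0."""
    shift = (mask & -mask).bit_length() - 1  # index of the lowest set bit
    return (pixel & mask) >> shift


def truecolor_components(pixel, masks=(0xFF0000, 0x00FF00, 0x0000FF)):
    """TrueColor/DirectColor: the pixel value splits into R, G, B subfields."""
    return tuple(subfield(pixel, m) for m in masks)


def widen(component8):
    """Scale an 8-bit intensity to the 16-bit range used on the wire."""
    return component8 * 0x101


def pseudocolor_lookup(pixel, colormap):
    """PseudoColor/StaticColor: the whole pixel value indexes one colormap."""
    return colormap[pixel]


rgb = truecolor_components(0x336699)
palette = [(0, 0, 0), (0xFFFF, 0, 0), (0, 0xFFFF, 0), (0xFFFF, 0xFFFF, 0xFFFF)]
```

Multiplying by 0x101 maps 0x00 to 0x0000 and 0xFF to 0xFFFF, so the extremes of the 8-bit range land on the extremes of the 16-bit range.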
Xlib and other client libraries
Most client programs communicate with the server via the Xlib client library. In particular, most clients use libraries such as Xaw, Motif, GTK+, or Qt which in turn use Xlib for interacting with the server.
Inter-client communication
The X Window core protocol provides mechanisms for communication between clients: window properties and events, in particular the client-to-client message events. However, it does not specify any protocol for such interactions. These protocols are instead governed by a separate set of inter-client communication conventions.
The Inter-Client Communication Conventions Manual specifies the protocol for the exchange of data via selections and the interaction of applications with the window manager. This specification has been considered difficult and confusing; consistency of application look and feel and communication is typically addressed by programming to a given desktop environment.
The Inter-Client Exchange protocol (ICE) specifies a framework for building protocols for interaction between clients, so that a specific protocol can be built on top of it. In particular, the X Session Management protocol (XSMP) is a protocol based on ICE that governs the interaction between applications and the session manager, which is the program that takes care of storing the status of the desktop at the end of an interactive session and recovering it when another session with the same user is started again.
Newer conventions are included in the freedesktop specifications. These include the drag-and-drop convention Xdnd, used for transferring data by selecting it and dragging it into another window, and the embedded application convention Xembed, which details how an application can be run in a subwindow of another application.
Selections, cut buffers, and drag-and-drop
Selections, cut buffers, and drag-and-drop are the mechanisms used in the X Window System to allow a user to transfer data from one window to another. Selections and cut buffers are typically used when a user selects text or some other data in a window and pastes it into another one. Drag-and-drop is used when a user selects something in a window, then clicks on the selection and drags it into another window.
Since the two windows may be handled by two different applications, data transfer requires two different clients connected to the same X server to interact. The X Window core protocol includes some types of requests and events that are specific to selection exchange, but the transfer is mainly done using the general client-to-client event sending and window properties, which are not specific to selection transfer.
Data to be transferred between clients can be of different types: it is usually text, but can also be a pixmap, a number, a list of objects, etc.
Selections and drag-and-drop are active mechanisms: after some text has been selected in a window, the client handling the window must actively support a protocol for transferring the data to the application requesting it. By contrast, cut buffers are a passive mechanism: when the user selects some text, its content is transferred to a cut buffer, where it remains even if the application handling the window terminates and the window is destroyed.
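The contrast between active and passive transfer can be made concrete with a toy model. It does not reproduce the real selection requests and events; the classes, methods, and the single cut buffer are simplifying assumptions.

```python
class Server:
    def __init__(self):
        self.cut_buffer = None       # passive: the data itself lives in the server
        self.selection_owner = None  # active: only a reference to the owner


class Editor:
    """A client that owns the selection and also fills the cut buffer."""

    def __init__(self, server):
        self.server = server
        self.selected = ""

    def select(self, text):
        self.selected = text
        self.server.selection_owner = self  # must answer later requests itself
        self.server.cut_buffer = text       # copy handed over immediately

    def convert_selection(self):
        return self.selected

    def quit(self):
        if self.server.selection_owner is self:
            self.server.selection_owner = None


def paste_from_selection(server):
    owner = server.selection_owner
    return owner.convert_selection() if owner is not None else None


def paste_from_cut_buffer(server):
    return server.cut_buffer


server = Server()
editor = Editor(server)
editor.select("hello")
before = paste_from_selection(server)
editor.quit()
after = paste_from_selection(server)
```

Once the owning client exits, the selection can no longer be served, whereas the cut buffer still holds its copy.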
Window manager
A window manager is a program that controls the general appearance of windows and other graphical elements of the graphical user interface. Differences in the look of the X Window System between installations are mainly due to the use of different window managers or different configurations of the window manager.
The window manager takes care of deciding the position of windows, placing the decorative border around them, handling icons, handling mouse clicks outside windows (on the “background”), handling certain keystrokes (for example, many window managers close a window when Alt-F4 is pressed), etc.
From the point of view of the X server, the window manager is no different from any other client. The initial position and the decorative borders around windows are handled by the window manager using the following requests:
- an application can request the server not to satisfy requests of mapping (showing) subwindows of a given window, and to be sent an event instead;
- an application can request changing the parent of a window.
The window manager uses the first request to intercept any request for mapping top-level windows (children of the root window). Whenever another application requests the mapping of a top-level window, the server does not do it but sends an event to the window manager instead. Most window managers reparent the window: they create a larger top-level window (called the frame window) and reparent the original window as a child of it. Graphically, this corresponds to placing the original window inside the frame window. The space of the frame window that is not taken by the original window is used for the decorative frame around the window (the “border” and the “title bar”).
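The interception and reparenting steps described above can be sketched with a toy server. The model is a simplification under stated assumptions: one redirect handler stands in for the event sent to the window manager, and all class, method, and window names are hypothetical.

```python
class Window:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = None
        self.children = []
        self.mapped = False
        if parent is not None:
            self.reparent(parent)

    def reparent(self, new_parent):
        if self.parent is not None:
            self.parent.children.remove(self)
        self.parent = new_parent
        new_parent.children.append(self)


class Server:
    def __init__(self):
        self.root = Window("root")
        self.redirect_handler = None  # installed by the window manager

    def map_window(self, window, from_wm=False):
        redirected = (window.parent is self.root
                      and self.redirect_handler is not None
                      and not from_wm)
        if redirected:
            self.redirect_handler(window)  # the event goes to the window manager
        else:
            window.mapped = True


class WindowManager:
    def __init__(self, server):
        self.server = server
        server.redirect_handler = self.on_map_request

    def on_map_request(self, window):
        frame = Window("frame:" + window.name, self.server.root)
        window.reparent(frame)  # the original window now sits inside the frame
        self.server.map_window(frame, from_wm=True)
        self.server.map_window(window, from_wm=True)


server = Server()
wm = WindowManager(server)
app = Window("editor", server.root)
server.map_window(app)  # intercepted: the window manager frames the window
```

After the call, the application window is no longer a child of the root; the frame created by the window manager has taken its place.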
The window manager manages mouse clicks in the frame window. This allows, for example, the window to be moved or resized when the user clicks and drags the border or the title bar.
The window manager is also responsible for the handling of icons and related visual elements of the graphical user interface. Icons do not exist at the level of the X Window core protocol. They are implemented by the window manager. For example, whenever a window has to be “iconified”, the window manager FVWM unmaps the window, making it not visible, and creates a window for the icon name and possibly another window for the icon image. The meaning and handling of icons are therefore completely decided by the window manager: some window managers such as wm2 do not implement icons at all.
Session manager
Roughly, the state of a session is the “state of the desktop” at a given time: a set of windows with their current content. More precisely, it is the set of applications managing these windows and the information that allows these applications to restore the condition of their managed windows if required. An X session manager is a program that saves and restores the state of sessions.
The most recognizable effect of using a session manager is the possibility of logging out from an interactive session and then finding exactly the same windows in the same state when logging in again. For this to work, the session manager program stores the names of the running applications at logout and starts them again at login. In order for the state of the applications to be restored as well (which is needed to restore the content of windows), the applications must be able to save their state of execution upon request from the session manager and load it back when they start again.
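The division of labour between the session manager and the applications can be sketched as follows. This toy model does not implement XSMP; the class, the JSON storage format, and the application names are assumptions made for illustration.

```python
import json


class SessionManager:
    """Remembers running applications and asks each one to save its state."""

    def __init__(self):
        self.clients = {}  # application name -> callable returning its state

    def register(self, name, save_state):
        self.clients[name] = save_state

    def checkpoint(self):
        """At logout: collect every application's state into one record."""
        return json.dumps({name: save() for name, save in self.clients.items()})


def restore(saved, launchers):
    """At login: restart each application and hand back its saved state."""
    return {name: launchers[name](state)
            for name, state in json.loads(saved).items()}


manager = SessionManager()
manager.register("editor", lambda: {"file": "notes.txt", "cursor": 42})
saved = manager.checkpoint()
restored = restore(saved, {"editor": lambda state: dict(state)})
```

The session manager only knows which applications were running; the content of each window comes back only because the application itself saved and reloaded its state.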
The X Window System includes a default session manager called xsm. Other session managers have been developed for specific desktop systems: for example, ksmserver is the default session manager of KDE.
X display manager
The X display manager is the program that shows the graphical login prompt in the X Window System. More generally, a display manager runs one or more X servers on the local computer and accepts incoming connections from X servers running on remote computers. The local servers are started by the display manager, which then connects to them to present the login screen to the user. The remote servers are started independently of the display manager and connect to it. In this situation, the display manager works like a graphical telnet server: an X server can connect to the display manager, which starts a session; the programs of this session run on the same computer as the display manager but have their input and output on the computer where the X server runs (which is the computer in front of the user).
XDM is the basic display manager supplied with the X Window System. Other display managers include GDM (GNOME), KDM (KDE), WDM (using the WINGs widget set used in Window Maker) and entrance (using the architecture used in Enlightenment v.17).
User interface elements
Early widget toolkits for X included Xaw (the Athena Widget Set), OLIT (OPEN LOOK Intrinsics Toolkit), XView, Motif and Tk. OLIT and XView function as the base toolkits for AT&T and Sun's OPEN LOOK GUI.
Motif provides the base toolkit for the Common Desktop Environment (CDE), which is the standard desktop environment used on commercial Unix systems such as Solaris and HP-UX. (GNOME is offered in Solaris 9 and will be standard in future versions.)
More modern toolkits include Qt (used by KDE), GTK+ (used by GNOME), wxWidgets, FLTK and FOX.
Extensions
The X server was designed to be simple but extensible. As such, much functionality now resides in extensions to the protocol. The following is a partial list of extensions that have been developed, sorted roughly by recency of introduction:
- AIGLX
- Composite
- Damage
- XFixes
- Extended-Visual-Information (EvIE)
- Distributed Multihead X (DMX)
- XvMC, video with motion compensation
- GLX
- XRender
- Resize and Rotate (RANDR)
- Xinerama
- Display Power Management Signaling (DPMS)
- XPRINT
- Low Bandwidth Extension (LBX, obsolete)
- X keyboard extension
- DOUBLE-BUFFER
- RECORD
- XImage Extension (obsolete)
- MIT-SHM
- SYNC
- XTEST
- XInputExtension
- BIG-REQUESTS
- XC-MISC
- X video extension, also called Xv (not to be confused with the xv program)
- PEX (obsolete)
- Shape
- DEC-XTRAP
- MIT-SCREEN-SAVER
- MIT-SUNDRY-NONSTANDARD
- SECURITY
- TOG-CUP
- X-Resource
- XC-APPGROUP
- XFree86-Bigfont
- XFree86-DGA
- XFree86-Misc
- XFree86-VidModeExtension
At the protocol level, every extension can be assigned new request, event, and error packet types. Client libraries give client applications access to the functionality provided by extensions. Coding extensions into the current X server implementations is reportedly difficult due to a lack of modularity in the server design. It is a long-term goal of the XCB project to automate generating both the client and server sides of extensions from XML protocol descriptions.
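How an extension receives its own packet types can be sketched with a small registry. The code ranges (major opcodes 128 to 255, event codes 64 to 127, error codes 128 to 255) are those reserved for extensions by the core protocol, stated here as an assumption since the article does not give them; the sequential allocation policy, the class, the extension name EXAMPLE, and the event and error counts are hypothetical.

```python
class ExtensionRegistry:
    """Hands out opcode, event, and error codes from the extension ranges."""

    FIRST_MAJOR_OPCODE = 128  # extension requests: 128..255
    FIRST_EVENT = 64          # extension events:   64..127
    FIRST_ERROR = 128         # extension errors:   128..255

    def __init__(self):
        self.extensions = {}
        self.next_opcode = self.FIRST_MAJOR_OPCODE
        self.next_event = self.FIRST_EVENT
        self.next_error = self.FIRST_ERROR

    def register(self, name, n_events=0, n_errors=0):
        if (self.next_opcode > 255
                or self.next_event + n_events > 128
                or self.next_error + n_errors > 256):
            raise RuntimeError("no codes left for extension " + name)
        info = {
            "major_opcode": self.next_opcode,
            "first_event": self.next_event if n_events else 0,
            "first_error": self.next_error if n_errors else 0,
        }
        self.next_opcode += 1
        self.next_event += n_events
        self.next_error += n_errors
        self.extensions[name] = info
        return info

    def query(self, name):
        """What a client learns when it asks whether an extension is present."""
        return self.extensions.get(name)


registry = ExtensionRegistry()
first = registry.register("SHAPE", n_events=1)
second = registry.register("EXAMPLE", n_events=2, n_errors=1)
```

Because the codes are assigned at run time, a client has to ask the server for them before using an extension rather than hard-coding the values.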