Protection ring


In computer science, hierarchical protection domains, often called protection rings, are mechanisms to protect data and functionality from faults and malicious behavior.
Computer operating systems provide different levels of access to resources. A protection ring is one of two or more hierarchical levels or layers of privilege within the architecture of a computer system. This is generally hardware-enforced by some CPU architectures that provide different CPU modes at the hardware or microcode level. Rings are arranged in a hierarchy from most privileged to least privileged. On most operating systems, Ring 0 is the level with the most privileges and interacts most directly with the physical hardware such as certain CPU functionality and I/O controllers.
Special mechanisms are provided to allow an outer ring to access an inner ring's resources in a predefined manner, as opposed to allowing arbitrary usage. Correctly gating access between rings can improve security by preventing programs from one ring or privilege level from misusing resources intended for programs in another. For example, spyware running as a user program in Ring 3 should be prevented from turning on a web camera without informing the user, since hardware access should be a Ring 1 function reserved for device drivers. Programs such as web browsers running in higher numbered rings must request access to the network, a resource restricted to a lower numbered ring.
X86S, a canceled Intel architecture published in 2024, has only ring 0 and ring 3. Ring 1 and 2 were to be removed under X86S since modern operating systems never utilize them.

Implementations

Multiple rings of protection were among the most revolutionary concepts introduced by the Multics operating system, a highly secure predecessor of today's Unix family of operating systems. The GE 645 mainframe computer did have some hardware access control, including the same two modes that the other GE-600 series machines had, and segment-level permissions in its memory management unit, but that was not sufficient to provide full support for rings in hardware, so Multics supported them by trapping ring transitions in software; its successor, the Honeywell 6180, implemented them in hardware, with support for eight rings; Protection rings in Multics were separate from CPU modes; code in all rings other than ring 0, and some ring 0 code, ran in slave mode.
However, most general-purpose systems use only two rings, even if the hardware they run on provides more CPU modes than that. For example, Windows 7 and Windows Server 2008 use only two rings, with ring 0 corresponding to kernel mode and ring 3 to user mode, because earlier versions of Windows NT ran on processors that supported only two protection levels.
Many modern CPU architectures include some form of ring protection, although the Windows NT operating system, like Unix, does not fully utilize this feature. OS/2 does, to some extent, use three rings: ring 0 for kernel code and device drivers, ring 2 for privileged code, and ring 3 for unprivileged code. Under DOS, the kernel, drivers and applications typically run on ring 3, whereas 386 memory managers such as EMM386 run at ring 0. In addition to this, DR-DOS' EMM386 3.xx can optionally run some modules on ring 1 instead. OpenVMS uses four modes called Kernel, Executive, Supervisor and User.
A renewed interest in this design structure came with the proliferation of the Xen VMM software, ongoing discussion on monolithic vs. micro-kernels, Microsoft's Ring-1 design structure as part of their NGSCB initiative, and hypervisors based on x86 virtualization such as Intel VT-x.
The original Multics system had eight rings, but many modern systems have fewer. The hardware remains aware of the current ring of the executing instruction thread at all times, with the help of a special machine register. In some systems, areas of virtual memory are instead assigned ring numbers in hardware. One example is the Data General Eclipse MV/8000, in which the top three bits of the program counter served as the ring register. Thus code executing with the virtual PC set to 0xE200000, for example, would automatically be in ring 7, and calling a subroutine in a different section of memory would automatically cause a ring transfer.
The hardware severely restricts the ways in which control can be passed from one ring to another, and also enforces restrictions on the types of memory access that can be performed across rings. Using x86 as an example, there is a special gate structure which is referenced by the call instruction that transfers control in a secure way towards predefined entry points in lower-level rings; this functions as a supervisor call in many operating systems that use the ring architecture. The hardware restrictions are designed to limit opportunities for accidental or malicious breaches of security. In addition, the most privileged ring may be given special capabilities.
ARM version 7 architecture implements three privilege levels: application, operating system, and hypervisor. Unusually, level 0 is the least-privileged level, while level 2 is the most-privileged level. ARM version 8 implements four exception levels: application, operating system, hypervisor, and secure monitor / firmware, for AArch64 and AArch32.
Ring protection can be combined with processor modes in some systems. Operating systems running on hardware supporting both may use both forms of protection or only one.
Effective use of ring architecture requires close cooperation between hardware and the operating system. Operating systems designed to work on multiple hardware platforms may make only limited use of rings if they are not present on every supported platform. Often the security model is simplified to "kernel" and "user" even if hardware provides finer granularity through rings.

Modes

Supervisor mode

In computer terms, supervisor mode is a hardware-mediated flag that can be changed by code running in system-level software. System-level tasks or threads may have this flag set while they are running, whereas user-level applications will not. This flag determines whether it would be possible to execute machine code operations such as modifying registers for various descriptor tables, or performing operations such as disabling interrupts. The idea of having two different modes to operate in comes from "with more power comes more responsibility" a program in supervisor mode is trusted never to fail, since a failure may cause the whole computer system to crash.
Supervisor mode is "an execution mode on some processors which enables execution of all instructions, including privileged instructions. It may also give access to a different address space, to memory management hardware and to other peripherals. This is the mode in which the operating system usually runs."
In a monolithic kernel, the operating system runs in supervisor mode and the applications run in user mode. Other types of operating systems, like those with an exokernel or microkernel, do not necessarily share this behavior.
Some examples from the PC world:
  • Linux, macOS and Windows are three operating systems that use supervisor/user mode. To perform specialized functions, user mode code must perform a system call into supervisor mode or even to the kernel space where trusted code of the operating system will perform the needed task and return the execution back to the userspace. Additional code can be added into kernel space through the use of loadable kernel modules, but only by a user with the requisite permissions, as this code is not subject to the access control and safety limitations of user mode.
  • DOS, as well as other simple operating systems and many embedded devices run in supervisor mode permanently, meaning that drivers can be written directly as user programs.
Most processors have at least two different modes. The x86-processors have four different modes divided into four different rings. Programs that run in Ring 0 can do anything with the system, and code that runs in Ring 3 should be able to fail at any time without impact to the rest of the computer system. Ring 1 and Ring 2 are rarely used, but could be configured with different levels of access.
In most existing systems, switching from user mode to kernel mode has an associated high cost in performance. It has been measured, on the basic request getpid, to cost 1000–1500 cycles on most machines. Of these just around 100 are for the actual switch, the rest is "kernel overhead". In the L3 microkernel, the minimization of this overhead reduced the overall cost to around 150 cycles.
Maurice Wilkes wrote:
... it eventually became clear that the hierarchical protection that rings provided did not closely match the requirements of the system programmer and gave little or no improvement on the simple system of having two modes only. Rings of protection lent themselves to efficient implementation in hardware, but there was little else to be said for them. The attractiveness of fine-grained protection remained, even after it was seen that rings of protection did not provide the answer... This again proved a blind alley...

To gain performance and determinism, some systems place functions that would likely be viewed as application logic, rather than as device drivers, in kernel mode; security applications and operating system monitors are cited as examples. At least one embedded database management system, eX''treme''DB Kernel Mode, has been developed specifically for kernel mode deployment, to provide a local database for kernel-based application functions, and to eliminate the context switches that would otherwise occur when kernel functions interact with a database system running in user mode.
Functions are also sometimes moved across rings in the other direction. The Linux kernel, for instance, injects into processes a vDSO section which contains functions that would normally require a system call, i.e. a ring transition. Instead of doing a syscall these functions use static data provided by the kernel. This avoids the need for a ring transition and so is more lightweight than a syscall. The function gettimeofday can be provided this way.