RAID


RAID is an orchestrated approach to computer data storage in which data is written to more than one secondary storage device. Instead of storing all data in a single hard disk drive or solid-state drive, RAID coordinates two or more such devices into a disk array. When the computer writes data to secondary storage, the RAID system distributes the data across the array. There are several possible ways of doing this, and those various configurations are called RAID levels.
RAID levels are distinguished by the amount of redundancy they afford and the minimum number of drives they require, as well as by their relative complexity, performance, energy efficiency, fault tolerance, and availability. The defining techniques used by RAID were conceived in the 1970s and 1980s: data striping to improve read/write efficiency, and disk mirroring or parity drives for data recovery. With the exception of RAID 1, all of the standard RAID levels use storage virtualization to abstract multiple storage devices into one logical storage volume.
In its original coinage, RAID is an acronym for redundant array of inexpensive disks. The RAID Advisory Board later redefined the acronym to mean redundant array of independent disks. Before RAID, high-capacity, high-availability data storage relied on so-called single large expensive disks (SLEDs) connected to mainframe computers. RAID has been deployed not only in mainframe computers, but in personal computers, supercomputers, file servers, database servers, web servers, and network-attached storage appliances.

History

The term RAID was coined by David Patterson, Garth Gibson, and Randy Katz at the University of California, Berkeley in 1987. In their June 1988 paper "A Case for Redundant Arrays of Inexpensive Disks (RAID)", presented at the SIGMOD Conference, they argued that the disk drives of the top-performing mainframe computers of the time could be outperformed by an array of the disk drives that were manufactured for the growing personal computer market. Although the incidence of hard disk drive failure rises in proportion to the number of drives in use, the reliability of an array could far exceed that of any single, high-capacity drive if redundancy were built into the storage system by configuring it to write data to more than one disk automatically.
Although not yet using that terminology, the technologies of the five levels of RAID named in the June 1988 paper were used in various products prior to the paper's publication, including the following:
  • Mirroring was well established in the 1970s, including, for example, in Tandem NonStop Systems.
  • In 1977, Norman Ken Ouchi at IBM filed a patent disclosing what was subsequently named RAID 4.
  • Around 1983, DEC began shipping subsystem-mirrored RA8X disk drives as part of its HSC50 subsystem.
  • In 1986, Clark et al. at IBM filed a patent disclosing what was subsequently named RAID 5.
  • Around 1988, Thinking Machines' DataVault used error correction codes in an array of disk drives. A similar approach was used in the early 1960s on the IBM 353.
Industry manufacturers later redefined the RAID acronym to stand for "redundant array of independent disks".

Overview

Many RAID levels employ an error protection scheme called "parity", a widely used method in information technology to provide fault tolerance in a given set of data. Most use simple XOR, but RAID 6 uses two separate parities based respectively on addition and multiplication in a particular Galois field or Reed–Solomon error correction.
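The XOR parity scheme can be illustrated with a short sketch (the helper function below is hypothetical, not taken from any real RAID implementation): the parity block is the byte-wise XOR of the data blocks at the same stripe offset, and any single missing block can be rebuilt by XOR-ing the surviving blocks with the parity.

```python
from functools import reduce

def xor_parity(blocks: list[bytes]) -> bytes:
    """XOR corresponding bytes of equal-sized blocks to form a parity block."""
    return bytes(reduce(lambda a, b: a ^ b, byte_tuple) for byte_tuple in zip(*blocks))

# Three data blocks, as they might sit at the same stripe offset on three drives.
d0, d1, d2 = b"\x0f\x10", b"\xf0\x01", b"\xaa\xaa"
p = xor_parity([d0, d1, d2])

# If the drive holding d1 fails, its block is the XOR of the survivors and the parity.
recovered = xor_parity([d0, d2, p])
assert recovered == d1
```

Because XOR is its own inverse, the same function serves both to compute parity and to reconstruct a lost block; this symmetry is why single-parity levels such as RAID 4 and RAID 5 can tolerate exactly one drive failure.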
RAID can also provide data security with solid-state drives without the expense of an all-SSD system. For example, a fast SSD can be mirrored with a mechanical drive. For this configuration to provide a significant speed advantage, an appropriate controller is needed that uses the fast SSD for all read operations. Adaptec calls this "hybrid RAID".

Standard levels

Originally, there were five standard levels of RAID, but many variations have evolved, including several nested levels and many non-standard levels. RAID levels and their associated data formats are standardized by the Storage Networking Industry Association in the Common RAID Disk Drive Format standard:
  • RAID 0 consists of block-level striping, but no mirroring or parity. Assuming n fully used drives of equal capacity, the capacity of a RAID 0 volume matches that of a spanned volume: the total of the n drives' capacities. However, because striping distributes the contents of each file across all drives, the failure of any drive renders the entire RAID 0 volume inaccessible. Typically, all data is lost, and files cannot be recovered without a backup copy.
  • RAID 1 consists of data mirroring, without parity or striping. Data is written identically to two or more drives, thereby producing a "mirrored set" of drives. Thus, any read request can be serviced by any drive in the set. If a request is broadcast to every drive in the set, it can be serviced by the drive that accesses the data first, improving performance. Sustained read throughput, if the controller or software is optimized for it, approaches the sum of throughputs of every drive in the set, just as for RAID 0. Actual read throughput of most RAID 1 implementations is slower than the fastest drive. Write throughput is always slower because every drive must be updated, and the slowest drive limits the write performance. The array continues to operate as long as at least one drive is functioning.
  • RAID 2 consists of bit-level striping with dedicated Hamming-code parity. All disk spindle rotation is synchronized and data is striped such that each sequential bit is on a different drive. Hamming-code parity is calculated across corresponding bits and stored on at least one parity drive. This level is of historical significance only; although it was used on some early machines, as of 2014 it is not used by any commercially available system.
  • RAID 3 consists of byte-level striping with dedicated parity. All disk spindle rotation is synchronized and data is striped such that each sequential byte is on a different drive. Parity is calculated across corresponding bytes and stored on a dedicated parity drive. Although implementations exist, RAID 3 is not commonly used in practice.
  • RAID 4 consists of block-level striping with dedicated parity. This level was previously used by NetApp, but has now been largely replaced by a proprietary implementation of RAID 4 with two parity disks, called RAID-DP. The main advantage of RAID 4 over RAID 2 and 3 is I/O parallelism: in RAID 2 and 3, a single read I/O operation requires reading the whole group of data drives, while in RAID 4 one I/O read operation does not have to spread across all data drives. As a result, more I/O operations can be executed in parallel, improving the performance of small transfers.
  • RAID 5 consists of block-level striping with distributed parity. Unlike RAID 4, parity information is distributed among the drives, requiring all drives but one to be present to operate. Upon failure of a single drive, subsequent reads can be calculated from the distributed parity such that no data is lost. RAID 5 requires at least three disks. Like all single-parity concepts, large RAID 5 implementations are susceptible to system failures because of trends regarding array rebuild time and the chance of drive failure during rebuild. Rebuilding an array requires reading all data from all disks, opening a chance for a second drive failure and the loss of the entire array.
  • RAID 6 consists of block-level striping with double distributed parity. Double parity provides fault tolerance up to two failed drives. This makes larger RAID groups more practical, especially for high-availability systems, as large-capacity drives take longer to restore. RAID 6 requires a minimum of four disks. As with RAID 5, a single drive failure results in reduced performance of the entire array until the failed drive has been replaced. With a RAID 6 array, using drives from multiple sources and manufacturers, it is possible to mitigate most of the problems associated with RAID 5. The larger the drive capacities and the larger the array size, the more important it becomes to choose RAID 6 instead of RAID 5. RAID 10 also minimizes these problems.
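The difference between RAID 4's dedicated parity drive and RAID 5's distributed parity comes down to block placement. The sketch below (a minimal illustration; real controllers use specific layouts such as left-symmetric, and the rotation shown here is just one simple convention) prints which drive holds parity in each stripe:

```python
def raid5_layout(num_drives: int, num_stripes: int) -> list[list[str]]:
    """Return a table of stripes, with data-block labels and a rotating parity block 'P'."""
    layout = []
    block = 0
    for stripe in range(num_stripes):
        # Rotate the parity position backwards by one drive per stripe.
        parity_drive = (num_drives - 1 - stripe) % num_drives
        row = []
        for drive in range(num_drives):
            if drive == parity_drive:
                row.append("P")
            else:
                row.append(f"D{block}")
                block += 1
        layout.append(row)
    return layout

for row in raid5_layout(num_drives=4, num_stripes=4):
    print(row)
# Each column is a drive; no single drive holds all the parity,
# so parity writes are spread across the array instead of
# bottlenecking on one dedicated drive as in RAID 4.
```

With four drives, stripe 0 places parity on drive 3, stripe 1 on drive 2, and so on; pinning `parity_drive` to a constant instead would reproduce the RAID 4 layout.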

Nested (hybrid) RAID

In what was originally termed hybrid RAID, many storage controllers allow RAID levels to be nested. The elements of a RAID may be either individual drives or arrays themselves. Arrays are rarely nested more than one level deep.
The final array is known as the top array. When the top array is RAID 0, most vendors omit the "+".
  • RAID 0+1: creates two stripes and mirrors them. If a single drive fails, then one of the mirrors has failed; at this point the array is effectively running as RAID 0 with no redundancy. A rebuild introduces significantly higher risk than in RAID 1+0, because all the data from all the drives in the remaining stripe has to be read, rather than just from one drive, increasing the chance of an unrecoverable read error and significantly extending the rebuild window.
  • RAID 1+0: creates a striped set from a series of mirrored drives. The array can sustain multiple drive losses so long as no mirror loses all its drives.
  • JBOD RAID N+N: With JBOD, it is possible to concatenate not only disks but also volumes such as RAID sets. With larger drive capacities, write delay and rebuild time increase dramatically. By splitting a larger RAID N set into smaller subsets and concatenating them with linear JBOD, write and rebuild times are reduced. If a disk array controller is not capable of nesting linear JBOD with RAID N, linear JBOD can be achieved with OS-level software RAID in combination with separate RAID N subset volumes created within one or more hardware RAID controllers. Besides a drastic speed increase, this approach offers a substantial advantage: a linear JBOD can be started with a small set of disks and the total set can later be expanded with disks of different sizes. There is also an advantage for disaster recovery: if one RAID N subset fails, the data on the other subsets is not lost, reducing restore time.
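The RAID 1+0 survival rule above, that the array keeps operating as long as no mirror loses all its drives, can be stated as a one-line check (a hypothetical sketch; the drive names and helper are invented for illustration):

```python
def raid10_survives(mirrors: list[set[str]], failed: set[str]) -> bool:
    """A RAID 1+0 array survives as long as every mirror retains a working drive."""
    return all(mirror - failed for mirror in mirrors)

# Three mirrored pairs striped together: a six-drive RAID 1+0 array.
pairs = [{"a", "b"}, {"c", "d"}, {"e", "f"}]

raid10_survives(pairs, {"a", "c", "e"})  # one drive lost from each mirror: survives
raid10_survives(pairs, {"c", "d"})      # one whole mirror lost: array fails
```

This is why a six-drive RAID 1+0 array can in the best case survive three failures (one per mirror) but in the worst case fails after only two, when both hit the same pair.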