Soggy Software - moving mountains

RAID rant.

(That is, when RAID 5 isn't the right answer.)

If I've heard one of the following statements once, I've heard them all a hundred times:

Through reading, research, testing and discussion, I present to you the RAID rant. Well, it's less of a rant and more of a collection of useful information, which should correct anyone who thinks any of the above statements are true.

I feel it is my responsibility to future generations of sysadmins to ensure that a serious attempt at debunking these common and dangerous myths about RAID is documented somewhere on the Internet, for free.

Facts about drives

According to independent studies by Google and Carnegie Mellon University (CMU), annualized failure rates (AFR) and annual replacement rates (ARR) are alarmingly high for drives, even within their first 3-5 years of operation:

The observed range of AFRs varies from 1.7%, for drives that were in their first year of operation, to over 8.6%, observed in the 3-year old population. - E. Pinheiro, W. Weber and L. A. Barroso, p. 4
Most commonly, the observed ARR values are in the 3% range. - B. Schroeder, G. A. Gibson, p. 7

The astute will observe that Google's tests were performed using "consumer grade" SATA drives. Somewhat surprisingly (to me), the research performed by CMU showed no real difference in replacement rates between SATA, FC and SCSI drives:

It is interesting to observe that for these data sets there is no significant discrepancy between replacement rates for SCSI and FC drives, commonly represented as the most reliable types of disk drives, and SATA drives, frequently described as lower quality. - B. Schroeder, G. A. Gibson, p. 7

We have to understand then, from the start, that drives die. More than that, drives die frequently in the first 5 years of their use. (For the sake of this discussion, we'll assume an optimistic annual failure rate of 3%. We'll also assume that drives in production use are expected to work for 5 years before a diligent sysadmin gets a budget from their CIO to replace them.)
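To put the assumed figures into perspective, here is a rough sketch of what a 3% annual failure rate adds up to over a 5-year service life. It assumes a constant failure rate and drives that fail independently of one another, which the studies quoted above suggest is a simplification (observed rates change as drives age). The function names are made up for this illustration.

```python
def cumulative_failure_probability(afr: float, years: int) -> float:
    """Chance that one drive fails at some point within `years`.

    Assumption: the annual failure rate `afr` is constant and each
    year is independent of the others.
    """
    return 1.0 - (1.0 - afr) ** years


def any_drive_fails(afr: float, years: int, drives: int) -> float:
    """Chance that at least one of `drives` drives fails within `years`.

    Assumption: drives fail independently of one another.
    """
    return 1.0 - (1.0 - afr) ** (years * drives)


if __name__ == "__main__":
    # One drive, 3% per year, 5 years of service.
    print(f"{cumulative_failure_probability(0.03, 5):.1%}")  # 14.1%
    # The same figures for a hypothetical set of 4 drives.
    print(f"{any_drive_fails(0.03, 5, 4):.1%}")
```

In other words, under these assumptions roughly one drive in seven will not see its fifth birthday, and the odds of at least one failure grow quickly as more drives are added.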

Striping, mirroring, parity, data-loss and speed

In order to begin understanding RAID, we need to define and understand the following 5 technical terms:

Striping: Dividing a large amount of data into many small chunks and distributing them across multiple locations (devices, e.g. drives). All these chunks (e.g. blocks) put together make only 1 full copy (no more and no less) of whatever originally existed. Striping provides zero redundancy: losing any one chunk renders the overall data incomplete, and therefore we must assume such a loss renders the entire data unusable.

Mirroring: Making multiple copies of the entire data originally provided. Mirroring can apply to any amount of data; we may choose to mirror a chunk (from the striping process described) or the entire original data. Mirroring creates redundancy: at the cost of doubling, trebling or further multiplying the data size, we can afford to lose some copies without necessarily losing the ability to fully recreate the entire original.

Parity: For two or more sets of input data of the same size, a function (e.g. equation) which allows identification of corruption in any of the inputs. In RAID, parity usually extends beyond this, including a way to not only identify corruption but also "recover" the corrupted input. A simple example: if 4 inputs are A=1010, B=1100, C=0000, D=0111, we can exclusive OR (XOR) the inputs to create a parity: (A:1010 XOR B:1100) = 0110, (0110 XOR C:0000) = 0110, (0110 XOR D:0111) = P:0001. Our parity is 0001. Should we lose input B, it can be recalculated from the other 3 inputs and the parity: (A:1010 XOR C:0000) = 1010, (1010 XOR D:0111) = 1101, (1101 XOR P:0001) = B:1100. Parity takes time to calculate, but is never larger than any one input and can be used to rebuild a lost or corrupt input (provided only one input is damaged).

Data-loss: Simply, the irreversible loss of data. (Backups are a form of 'mirror', and can be used to avoid data-loss, although backups often don't include the very latest data changes. Striping alone does not protect against data-loss, because losing any one chunk causes total data-loss.)

Speed: In the context of RAID, there are three speeds of concern: read speed (time taken to retrieve data), write speed (time taken to update data), and rebuild speed (time taken to fully "rebuild" a RAID array, should one or more disks fail). These can be interlinked; e.g. read speed may be affected while a rebuild is in progress.
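The XOR parity example given in the parity definition above can be checked with a few lines of Python. This is only a sketch of the arithmetic using the same four 4-bit inputs; the helper names are made up for this illustration, and a real RAID implementation works on whole blocks rather than single integers.

```python
from functools import reduce
from operator import xor


def parity(inputs):
    """XOR all inputs together to produce the parity value."""
    return reduce(xor, inputs, 0)


def rebuild(surviving_inputs, parity_value):
    """Recover a single lost input from the survivors plus the parity."""
    return reduce(xor, surviving_inputs, parity_value)


# The four inputs from the example above.
A, B, C, D = 0b1010, 0b1100, 0b0000, 0b0111

P = parity([A, B, C, D])
print(f"P = {P:04b}")  # P = 0001

# Pretend B is lost: XOR the remaining inputs with the parity.
recovered_B = rebuild([A, C, D], P)
print(f"B = {recovered_B:04b}")  # B = 1100
```

Note that the rebuild is the same operation as the original parity calculation, just with the parity standing in for the missing input. It also only works when exactly one input is missing: with two inputs gone there is not enough information left to separate them.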