Content of the lesson:
RAID (an abbreviation of Redundant Array of Independent Disks) is a technology, usually provided by a disk controller, that combines two or more drives into one array to improve performance, fault tolerance, or both.
The simplest implementation of a disk array is RAID 0, sometimes called striping. Data is split into smaller blocks that are written alternately to the individual drives.
The advantage is a performance gain (performance here means read and write speed), but there is no protection against faults: if any drive fails, the data of the whole array is lost.
The way the data is stored is illustrated in the following image:
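The alternating placement of blocks can also be sketched in a few lines of Python. This is only an illustration of the idea, not how a real controller works; the function name `stripe` and the block labels `A1` to `A6` are made up for this example.

```python
def stripe(blocks, drive_count):
    """Distribute blocks over the drives in round-robin order (RAID 0 style)."""
    drives = [[] for _ in range(drive_count)]
    for index, block in enumerate(blocks):
        # Block 0 goes to drive 0, block 1 to drive 1, then back to drive 0, ...
        drives[index % drive_count].append(block)
    return drives


layout = stripe(["A1", "A2", "A3", "A4", "A5", "A6"], 2)
print(layout)  # [['A1', 'A3', 'A5'], ['A2', 'A4', 'A6']]
```

Each drive holds only every second block, so neither drive alone contains usable data.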
The RAID 1 disk array, also called mirroring, requires at least two drives (typically a pair). The same data is written to both drives at once.
The advantage of this solution is protection against faults: if one drive fails, you still have a complete copy, so you can replace the faulty drive without losing any data. The disadvantage is that you can use only 50 % of the total capacity (if the drives have different capacities, the capacity of the smaller one is used). In addition, this solution does not increase write performance, because every block has to be written to both drives.
The way the data is stored on two drives is illustrated in the following image:
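As a rough sketch of mirroring and of the capacity rule described above, assuming a simple model where each drive is a Python list (the names `mirror` and `mirrored_capacity` are hypothetical):

```python
def mirror(blocks, drive_count=2):
    """Write an identical copy of every block to each drive (RAID 1 style)."""
    return [list(blocks) for _ in range(drive_count)]


def mirrored_capacity(drive_sizes_gb):
    """Usable capacity of a mirror is limited by its smallest drive."""
    return min(drive_sizes_gb)


print(mirror(["A1", "A2", "A3"]))     # [['A1', 'A2', 'A3'], ['A1', 'A2', 'A3']]
print(mirrored_capacity([500, 1000])) # 500
```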
A RAID 3 array requires at least three drives and allows you to reconstruct the data if one drive fails (the missing data can be computed from the rest). There is still a risk, of course, because more than one drive can fail at once.
To understand how this works, you need to know two terms first: parity and the logical function XOR.
All data on a drive is stored as a sequence of binary digits (bits). These bits can represent numerical values, image information (for example the numerical values of the color components of an image), text (individual characters are converted to numerical values according to an encoding table), and so on. The following text uses the values 0 and 1 to represent fictitious data on a hard drive (a real drive of course holds far more data).
Now we can define the term parity. Parity is one of the methods of protecting data against faults; it allows you to detect a single error and, when you know which drive has failed, to correct it. It is computed with the logical operation exclusive OR (XOR for short). The result follows this table (for two data drives and one parity drive).
Value 1 | Value 2 | Result |
---|---|---|
0 | 0 | 0 |
0 | 1 | 1 |
1 | 0 | 1 |
1 | 1 | 0 |
If you look at the table, you can see that when both values are the same, the result is 0; otherwise the result is 1.
The same approach works with more data drives. Take a look at an example with three data drives and one parity drive.
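The two-input table can be reproduced with Python's built-in `^` operator, which performs XOR on integers. The helper name `xor_table` is made up for this sketch.

```python
def xor_table():
    """Return the rows (value 1, value 2, result) of the two-input XOR table."""
    return [(a, b, a ^ b) for a in (0, 1) for b in (0, 1)]


for a, b, result in xor_table():
    print(a, b, result)
# 0 0 0
# 0 1 1
# 1 0 1
# 1 1 0
```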
Value 1 | Value 2 | Value 3 | Result |
---|---|---|---|
0 | 0 | 0 | 0 |
0 | 0 | 1 | 1 |
0 | 1 | 0 | 1 |
0 | 1 | 1 | 0 |
1 | 0 | 0 | 1 |
1 | 0 | 1 | 0 |
1 | 1 | 0 | 0 |
1 | 1 | 1 | 1 |
What does XOR look like with more input values? The expression is evaluated from left to right. Take the row with the values 0, 1, 1 from the previous table: the calculation is 0 XOR 1 XOR 1, which can also be written as (0 XOR 1) XOR 1 = 1 XOR 1 = 0. In general, the result is 1 when the number of ones among the inputs is odd, and 0 when it is even.
The XOR operation is also used elsewhere, for example in cryptography.
The RAID 3 disk array requires at least three drives. One drive holds the parity data and the remaining drives (at least two) hold the data itself. The advantage is the detection and correction of faults: if one drive fails, you can restore its content from the remaining drives. The disadvantage is that you cannot use the whole capacity, because one drive has to be reserved for parity data.
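The left-to-right evaluation can be sketched with `functools.reduce`, which applies XOR to the first two values and then folds each following value into the intermediate result. The function name `parity` is an assumption of this sketch.

```python
from functools import reduce
from itertools import product
from operator import xor


def parity(bits):
    """XOR all bits from left to right: ((b1 ^ b2) ^ b3) ^ ..."""
    return reduce(xor, bits, 0)


print(parity((0, 1, 1)))  # 0, because (0 ^ 1) ^ 1 = 1 ^ 1 = 0

# Print the full table for three data drives and one parity drive.
for row in product((0, 1), repeat=3):
    print(*row, parity(row))
```

The result is 1 exactly when the row contains an odd number of ones.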
The following table demonstrates how data is stored on three drives; the same data is then used to demonstrate the recovery process. To keep things simple, one cell represents one bit.
Note: Parity is computed according to the XOR table above. If you used more drives, it would be computed step by step (first the XOR of the first two values, then each following value is XORed with the intermediate result). For example, the first row holds the data values 0 and 1, so their XOR is 1; this value is written to the parity drive.
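The parity computation for a whole array can be sketched as follows. Only the first row (data values 0 and 1, parity 1) comes from the text; the remaining bits are made up for illustration, and the name `compute_parity` is hypothetical.

```python
from functools import reduce
from operator import xor


def compute_parity(*data_drives):
    """Compute the parity drive row by row as the XOR of all data drives."""
    return [reduce(xor, row) for row in zip(*data_drives)]


drive_1 = [0, 1, 1, 0]  # first bit taken from the text, the rest is made up
drive_2 = [1, 1, 0, 0]  # first bit taken from the text, the rest is made up

parity_drive = compute_parity(drive_1, drive_2)
print(parity_drive)  # [1, 0, 1, 0]
```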
If one drive fails, its data can be computed from the other drives. This situation is illustrated in the following table, where the damaged data is replaced with the "X" sign.
Note: For example, the parity value in the first row is 1 and the first drive contains the value 0. Solving 0 XOR X = 1 gives the lost value: according to the XOR table, X must be 1. You get the same result by computing the XOR of the surviving values directly, 0 XOR 1 = 1.
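Recovery is the same XOR computation applied to the surviving drives. A minimal sketch, assuming one lost data drive and sample bits that are made up apart from the first row (0 on the first drive, parity 1); the name `rebuild` is hypothetical:

```python
from functools import reduce
from operator import xor


def rebuild(*surviving_drives):
    """Recover the single missing drive as the XOR of all surviving drives."""
    return [reduce(xor, row) for row in zip(*surviving_drives)]


drive_1 = [0, 1, 1, 0]       # surviving data drive
parity_drive = [1, 0, 1, 0]  # surviving parity drive
# The second data drive is lost (the "X" values).

print(rebuild(drive_1, parity_drive))  # [1, 1, 0, 0]
```

This works because XORing a value with itself gives 0, so the known data cancels out of the parity and only the lost value remains. It also shows why only one failed drive can be rebuilt: with two unknowns per row the equation has no unique solution.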
The RAID 5 disk array is based on the same principle as RAID 3 with one difference: the parity data is distributed across all drives instead of being stored on a single dedicated drive. This slightly increases performance, because no single drive is overloaded by the repeated writing of parity. The other advantages and requirements are the same as for RAID 3.
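The distribution of parity can be sketched as a layout table in which the parity block moves to a different drive in every row (stripe). The particular rotation order used here is an assumption chosen for illustration; real implementations use various rotation schemes. The name `raid5_layout` and the labels `D` and `P` are made up.

```python
def raid5_layout(stripe_count, drive_count):
    """Return one row per stripe; 'P<n>' marks the parity block of stripe n."""
    layout = []
    for stripe in range(stripe_count):
        # Assumed rotation: parity starts on the last drive and moves left.
        parity_drive = (drive_count - 1 - stripe) % drive_count
        row = []
        block = 0
        for drive in range(drive_count):
            if drive == parity_drive:
                row.append(f"P{stripe}")
            else:
                row.append(f"D{stripe}.{block}")
                block += 1
        layout.append(row)
    return layout


for row in raid5_layout(3, 3):
    print(row)
# ['D0.0', 'D0.1', 'P0']
# ['D1.0', 'P1', 'D1.1']
# ['P2', 'D2.0', 'D2.1']
```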
RAID 0+1 and RAID 10 arrays are based on a similar principle. They combine RAID 0 (striping) and RAID 1 (mirroring) and so gain the advantages of both: higher write speed (two blocks of data are written at once) and a full copy of the data (every block is written in identical form to two drives). The disadvantage is the higher price, because you need at least four drives and can use only 50 % of the total capacity.
A RAID 0+1 array consists of two RAID 0 arrays that are joined into a RAID 1 array. The way the data is stored is illustrated in the following image.
A RAID 10 array (sometimes also written as 1+0) is very similar to RAID 0+1. The only difference is the order of the combination: this array consists of two RAID 1 arrays that are joined into a RAID 0 array. The principle of this array is illustrated in the following image.
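One consequence of the different nesting can be sketched in code. The model below is a simplification and rests on assumptions not stated in the text: four drives numbered 0 to 3, grouped as {0, 1} and {2, 3}, and a RAID 0 set that counts as lost as soon as one of its drives fails. The function names are hypothetical.

```python
from itertools import combinations

GROUPS = [{0, 1}, {2, 3}]  # assumed four-drive setup, two groups of two drives


def raid10_survives(failed):
    """Groups are mirrors; the stripe needs one working drive in every mirror."""
    return all(group - failed for group in GROUPS)


def raid01_survives(failed):
    """Groups are stripe sets; the mirror needs one fully intact stripe set."""
    return any(not (group & failed) for group in GROUPS)


pairs = [set(pair) for pair in combinations(range(4), 2)]
print(sum(raid10_survives(p) for p in pairs), "of", len(pairs))  # 4 of 6
print(sum(raid01_survives(p) for p in pairs), "of", len(pairs))  # 2 of 6
```

Under these assumptions both arrays survive any single drive failure, but RAID 10 tolerates more of the possible two-drive failures than RAID 0+1.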
Besides the RAID types mentioned here, which are the most common ones today, there are also other, less typical variants. Additional information about RAID arrays can be found here: http://cs.wikipedia.org/wiki/RAID or here: http://connect.zive.cz/node/456.