Table of contents
1. Overview
2. Introduction to RAID levels
1. Overview
RAID stands for Redundant Array of Inexpensive Disks. The basic idea is to combine multiple small, inexpensive disks into one large disk group whose performance matches or exceeds that of a single large, expensive, fast disk.
Currently, RAID technology is mainly divided into two types: hardware-based RAID technology and software-based RAID technology.
On a Linux system, RAID can be implemented by the built-in software, which avoids the cost of an expensive hardware RAID controller while still greatly improving the I/O performance and reliability of the disks.
Because the RAID function is implemented in software, it is flexible to configure and convenient to manage. Software RAID can also merge several physical disks into one larger virtual device to gain both performance and data redundancy.
Of course, hardware-based RAID solutions generally outperform software RAID, particularly in detecting and repairing multi-bit errors, automatically detecting faulty disks, and rebuilding the array.
2. Introduction to RAID levels
With the continuous development of RAID technology, there are now seven basic RAID levels, RAID 0 through RAID 6, plus combined levels such as RAID 10 (RAID 0 combined with RAID 1) and RAID 50 (RAID 0 combined with RAID 5). A higher level number does not mean more advanced technology. RAID 2 through RAID 4 are basically no longer used.
The Linux kernel supports the commonly used RAID levels. Software RAID in the Linux 2.6 kernel supports RAID 0, RAID 1, RAID 4, RAID 5 and RAID 6, among others. In addition, the Linux 2.6 kernel supports LINEAR (linear mode) software RAID. Linear mode concatenates two or more disks into one logical device. The disks do not have to be the same size. When data is written to the RAID device, disk A is filled first, then disk B, and so on.
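The fill order of linear mode can be illustrated with a small address-mapping sketch. This is only a model of the idea described above, not the kernel's implementation; the function name `linear_locate` and the disk sizes are hypothetical.

```python
def linear_locate(offset, disk_sizes):
    """Map a logical offset to (disk_index, offset_on_that_disk) in linear mode.

    Assumption: disks are simply concatenated in the order given, so disk 0
    is filled completely before disk 1 is touched, and so on.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    for index, size in enumerate(disk_sizes):
        if offset < size:
            return index, offset
        offset -= size  # skip past this disk and try the next one
    raise ValueError("offset is beyond the end of the array")


if __name__ == "__main__":
    sizes = [100, 250, 50]  # hypothetical disk sizes, in blocks
    for logical in (0, 120, 399):
        print(logical, "->", linear_locate(logical, sizes))
```

Because each offset lives on exactly one disk and neighbouring offsets usually live on the same disk, linear mode adds capacity but offers neither the parallelism of striping nor any redundancy.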
RAID 0
RAID 0, also known as striped mode, spreads contiguous data across multiple disks.
When the system issues a data request, the request can be served by several disks in parallel, with each disk handling its own part. This parallelism makes full use of the disk bus bandwidth and significantly improves overall disk access performance.
Because both reads and writes are done in parallel, read and write performance increases, which is usually the main reason to run RAID 0.
However, RAID 0 has no data redundancy: if any one of the disks fails, the data cannot be recovered.
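A minimal sketch of how striping distributes data, assuming equal-size disks and simple round-robin placement of fixed-size chunks. The function name `stripe_locate` is hypothetical, and real implementations also have to handle chunk sizes and disks of unequal size.

```python
def stripe_locate(chunk_number, num_disks):
    """Return (disk_index, chunk_index_on_disk) for a logical chunk in RAID 0.

    Assumption: chunks are dealt out to the disks round-robin, like cards.
    """
    if num_disks < 1:
        raise ValueError("at least one disk is required")
    if chunk_number < 0:
        raise ValueError("chunk_number must be non-negative")
    return chunk_number % num_disks, chunk_number // num_disks


if __name__ == "__main__":
    # Six consecutive chunks on a three-disk array land on disks 0, 1, 2, 0, 1, 2,
    # so a large sequential request keeps all three disks busy at once.
    for chunk in range(6):
        print(chunk, "->", stripe_locate(chunk, 3))
```

The same mapping also shows why one failed disk is fatal: every third chunk of every large file is gone.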
RAID 1
RAID 1, also known as mirroring, is a fully redundant mode.
RAID 1 can be used with two disks or with 2xN disks, plus zero or more spare disks. Every write goes to the mirror disk at the same time.
This type of array is highly reliable, but its effective capacity is only half of the total capacity. The disks should be of equal size; otherwise the usable capacity is limited to the size of the smallest disk.
RAID 4
RAID 4 requires three or more disks. It stores parity information on one dedicated disk and writes data to the other disks in RAID 0 fashion. Because one disk is reserved for parity, the usable size of the array is (N-1)*S, where N is the number of disks and S is the size of the smallest disk in the array. As with RAID 1, the disks should be equal in size.
If one disk fails, the parity information and the remaining disks can be used to reconstruct the data. If two disks fail, all data is lost. The weakness of this level is that all parity information is stored on a single disk, and that disk must be updated every time any other disk is written.
As a result, the parity disk easily becomes a bottleneck when large amounts of data are written, so this level of RAID is rarely used today.
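The reconstruction described above can be sketched with XOR parity, which is the usual way single-parity RAID is computed. This is a toy model on short byte strings, not a real driver; the helper name `xor_blocks` and the sample data are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR several equal-length byte blocks together, byte by byte."""
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


if __name__ == "__main__":
    # Three data "disks" holding one block each, plus one parity block.
    data = [b"RAID", b"demo", b"test"]
    parity = xor_blocks(data)

    # Simulate losing the second data disk, then rebuild it from the
    # surviving data blocks and the parity block.
    survivors = [data[0], data[2], parity]
    rebuilt = xor_blocks(survivors)
    print(rebuilt == data[1])
```

Because XOR is its own inverse, any single missing block equals the XOR of all the others. With two blocks missing there is one equation and two unknowns, which is why a second failure loses the data.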
RAID 5
RAID 5 is probably the most useful RAID mode when you want to combine a large number of physical disks and still retain some redundancy. RAID 5 can be used with three or more disks, plus zero or more spare disks. As with RAID 4, the usable size of a RAID 5 device is (N-1)*S.
The biggest difference between RAID 5 and RAID 4 is that the parity information is distributed evenly across all drives, which avoids the parity-disk bottleneck of RAID 4. If one of the disks fails, all data remains intact thanks to the parity information. If a spare disk is available, rebuilding onto the spare begins immediately after the failure. If two disks fail at the same time, all data is lost.
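To make "evenly distributed parity" concrete, the sketch below prints one possible layout in which the parity block moves to a different disk on every stripe. The rotation rule used here is an assumption chosen for illustration; actual implementations offer several layouts, and the function name `raid5_layout` is hypothetical.

```python
def raid5_layout(num_disks, num_stripes):
    """Return a table of stripes; each row lists what every disk holds.

    "P" marks the parity block, "D<n>" marks the n-th data block.
    Assumption: parity starts on the last disk and moves one disk to the
    left on each following stripe.
    """
    if num_disks < 3:
        raise ValueError("RAID 5 needs at least three disks")
    rows = []
    data_index = 0
    for stripe in range(num_stripes):
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append("P")
            else:
                row.append(f"D{data_index}")
                data_index += 1
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in raid5_layout(3, 3):
        print(" ".join(f"{cell:>3}" for cell in row))
```

Over any run of N stripes each disk holds exactly one parity block, so parity updates are spread across all disks instead of hammering a single one as in RAID 4.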
RAID 6
RAID 6 is an extension of RAID 5. As in RAID 5, data and parity are divided into blocks and distributed across every disk of the array.
RAID 6 adds a second, independent set of parity information, so each stripe carries two parity blocks instead of one. This allows a RAID 6 array to survive two disks failing at the same time, and it means a RAID 6 array requires at least four disks.
RAID 10
N disks (N even, N >= 4) are first mirrored in pairs, and the pairs are then combined into a RAID 0. The usable capacity is (N/2)*S. Writes go to N/2 disks in parallel, so write speed is moderate. Reads can be served by all N disks in parallel, so read speed is fast. The result combines high performance with high reliability.
| | RAID 0 | RAID 1 | RAID 5 | RAID 6 | RAID 10 | RAID 50 |
| --- | --- | --- | --- | --- | --- | --- |
| Features | Parallel execution; fast reading and writing speed | Complete data redundancy | Fast reading and writing speed; data is verified | Fast reading and writing speed; data is verified and has high redundancy | Read and write faster; data has checksum | Read and write faster; data has checksum |
| Shortcomings | No data redundancy | Slow reading and writing speed | Too many disks | Too many disks | Too many disks | |
| Number of disks | 1 | 2 | 3 | 4 | 4 | 6 |
| Number of disks that may fail | 0 | 1 | 1 | 2 | 2 | 2 |
| Usable space | sum | 1/2 | (N-1)*S | (N-2)*S | 1/2 | (N-2)*S |
| Use cases | Large space, testing | System | Regular services | Database | | |
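The space formulas in the comparison above can be turned into a small calculator. This is a sketch under stated assumptions: all disks have the same size S, "1/2" means half of the total raw capacity, and the RAID 50 figure of (N-2)*S corresponds to two RAID 5 groups. The function name `usable_capacity` and the level names used as keys are hypothetical.

```python
def usable_capacity(level, num_disks, disk_size):
    """Return usable capacity for N equal disks of size S, per the comparison.

    Assumption: formulas are taken as listed in the comparison, with
    "sum" meaning N*S and "1/2" meaning half of N*S.
    """
    formulas = {
        "raid0": lambda n, s: n * s,          # sum of all disks
        "raid1": lambda n, s: n * s // 2,     # half of the total
        "raid5": lambda n, s: (n - 1) * s,    # one disk's worth of parity
        "raid6": lambda n, s: (n - 2) * s,    # two disks' worth of parity
        "raid10": lambda n, s: n * s // 2,    # half of the total
        "raid50": lambda n, s: (n - 2) * s,   # assumes two RAID 5 groups
    }
    if level not in formulas:
        raise ValueError(f"unknown RAID level: {level}")
    if num_disks < 1 or disk_size < 0:
        raise ValueError("need at least one disk and a non-negative size")
    return formulas[level](num_disks, disk_size)


if __name__ == "__main__":
    # Hypothetical example: disks of 1000 GB each.
    for level, disks in [("raid0", 4), ("raid1", 2), ("raid5", 4),
                         ("raid6", 4), ("raid10", 4), ("raid50", 6)]:
        print(level, disks, "disks ->", usable_capacity(level, disks, 1000), "GB")
```

Comparing the results for the same four disks makes the trade-off visible: RAID 0 keeps all the space, RAID 5 gives up one disk, and RAID 6 and RAID 10 both give up two.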