RAID Technology Explained

Original link: http://www.cnblogs.com/jiweilearn/p/9502095.html

1、RAID 0

  RAID 0 combines n physical disks into a single virtual logical disk. Space that is physically scattered across the member disks appears as one continuous address range on the virtual disk. When the disk controller (meaning the controller that addresses the virtual disk; on a host linked to an external disk array through an adapter card, this is the disk controller on the host) issues an instruction to the virtual disk, the RAID controller receives and parses it, converts it into IO request commands for each of the real physical disks that make up the RAID 0 set according to the block-mapping formula, collects or writes the data, and then returns the result to the host's disk controller.

 

  RAID 0 is also known as striping. It delivers the highest storage performance of all RAID levels, but performs no data protection or verification. The read process on a RAID 0 disk is analyzed from top to bottom below.

  Suppose that at some moment the host controller issues this instruction: read 128 sectors starting at sector 10000.

  After the RAID controller receives this command, it immediately calculates, according to the mapping formula, which physical disk the logical sector 10000 corresponds to, and then works out in sequence where each of the following 128 logical sectors falls on the physical disks. It then issues an instruction to each disk for its share of the sectors. For a read, each disk returns its data to the RAID controller, which assembles the pieces in its Cache and submits the result to the host controller.
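This mapping can be sketched in code. The round-robin formula below is an assumption for illustration (real controllers use their own, often more elaborate, mapping), but it captures the calculation described above:

```python
def map_lba(lba, n_disks, depth):
    """Map a virtual LBA to (disk index, physical LBA) under simple
    round-robin RAID 0 striping.

    depth: stripe depth in sectors, i.e. sectors per Segment."""
    stripe_width = n_disks * depth
    stripe = lba // stripe_width            # which stripe the sector falls in
    offset = lba % stripe_width             # position within that stripe
    disk = offset // depth                  # which Segment, hence which disk
    physical_lba = stripe * depth + offset % depth
    return disk, physical_lba

# The 128-sector read starting at LBA 10000: with a small stripe depth
# (16 sectors = 8KB) it spreads over all four disks; with a large depth
# (256 sectors = 128KB) it lands entirely on one disk.
small = {map_lba(lba, 4, 16)[0] for lba in range(10000, 10128)}
big = {map_lba(lba, 4, 256)[0] for lba in range(10000, 10128)}
```

The two sets show why stripe depth matters: `small` contains all four disk indices, while `big` contains only one.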

  From the process above it follows that if all 128 sectors fall within the same Segment — that is, if the stripe depth is larger than 128 sectors (64KB) — then this IO can only truly be served by a single physical disk. Performance is then no better than a single disk; in fact it is slightly worse, since there is no parallelism to gain but the RAID controller still adds computational overhead. To improve performance under these conditions, the IO must be spread across as many physical disks as possible, which means reducing the stripe depth. With the same number of disks, that means reducing the stripe size (Stripe SIZE, i.e. the length of the stripe), so that the controller splits the IO data: the first Segment of the stripe is filled, then the second, and so on, keeping multiple physical disks busy at once.

  So to boost performance, RAID 0 should make the stripe as small as possible. But there is a contradiction here: if the stripe is too small, the probability of concurrent IO drops, because each IO then occupies most of the physical disks, and queued IOs must wait for the current one to finish before they can use the disks. If the stripe is too large, the transfer rate of a single IO cannot be sufficiently increased. These two goals conflict, and the stripe size must be chosen according to the needs of the workload.

2、RAID 1

  RAID 1 is called mirroring: it writes exactly the same data to both the working disk and the mirror disk, so its disk-space utilization is 50%. Write response time suffers in RAID 1, but read performance is unaffected.

  For write IO, RAID 1 not only fails to improve speed but actually reduces it: the same data must be written to multiple physical disks synchronously, so the write takes as long as the slowest disk. For read IO, however — even sequential IO — the controller can behave like RAID 0 and read data from the two physical disks at the same time, improving speed.

 

  For reads in concurrent-IO mode, N concurrent IOs can each occupy one physical disk, which is equivalent to an N-fold improvement in IOPS. Since each IO exclusively occupies only one physical disk, the transfer rate of each IO stays the same as a single disk, whether the IO is random or sequential.

3、RAID 2

  RAID 2, called the Hamming-code error-correcting disk array, uses redundant Hamming codes for data verification. A Hamming code adds check bits to the original data for error detection and correction: the bit positions that are powers of two (1, 2, 4, 8, ...) hold check bits, and the remaining positions hold data bits. In RAID 2, data is stored bit by bit, each disk storing one bit of the encoded data; the number of disks depends on the data width, which is set by the user. The figure shows a RAID 2 with a data width of 4, which requires 4 data disks and 3 parity disks; with a 64-bit data width, 64 data disks and 7 parity disks are needed. Clearly, the larger the data width, the higher the storage-space utilization of RAID 2 — but also the more disks it requires.

  Hamming codes have built-in error-correction capability, so RAID 2 can correct errors when they occur, ensuring data safety. Its data transfer performance is very high, and its design complexity is lower than that of RAID 3, RAID 4 and RAID 5, described below.
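The 4-data-disk / 3-parity-disk arrangement above is exactly a Hamming(7,4) code. A minimal sketch of encoding and single-bit correction:

```python
def hamming74_encode(d):
    """Encode 4 data bits (list of 0/1) into a 7-bit Hamming codeword.

    Positions 1, 2 and 4 (the powers of two) carry parity; positions
    3, 5, 6 and 7 carry data — mirroring RAID 2's 4 data disks and
    3 parity disks."""
    code = [0] * 8                      # index 0 unused; positions 1..7
    code[3], code[5], code[6], code[7] = d
    for p in (1, 2, 4):
        for i in range(1, 8):
            if i != p and i & p:        # parity p covers positions with bit p set
                code[p] ^= code[i]
    return code[1:]

def hamming74_correct(c):
    """Detect and fix a single flipped bit; the syndrome (sum of the
    failing parity positions) points directly at the bad position."""
    c = [0] + list(c)
    syndrome = 0
    for p in (1, 2, 4):
        parity = 0
        for i in range(1, 8):
            if i & p:
                parity ^= c[i]
        if parity:
            syndrome += p
    if syndrome:
        c[syndrome] ^= 1
    return c[1:]

codeword = hamming74_encode([1, 0, 1, 1])
```

In the array, a flipped bit corresponds to a disk returning wrong data for its bit position; the syndrome identifies which disk.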

4、RAID 3

  RAID 3 is a parallel-access array with a dedicated parity disk: one disk is used exclusively for parity, the remaining disks hold data, and the data is interleaved bit- or byte-wise across the data disks. RAID 3 requires at least three disks. Data in the same position on the different disks is XORed to produce a parity value, which is written to the parity disk. RAID 3 read performance is exactly as good as RAID 0: data is read in parallel from multiple striped disks, giving very high performance, while also providing fault tolerance. When data is written to RAID 3, the parity value must be recomputed over the whole stripe and the new value written to the parity disk. A write operation therefore includes writing the data block, reading the other blocks in the stripe, computing the checksum, and writing the parity value — a large system overhead, so write performance is lower.

 

  If one disk in RAID 3 fails, reads are unaffected: the parity and the remaining data can reconstruct the lost data. If the block to be read happens to be on the failed disk, the system must read all the blocks in the same stripe and reconstruct the lost data from the parity value, so system performance suffers. When the failed disk is replaced, the system rebuilds its data onto the new disk in the same way.
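The parity and reconstruction described here are both plain byte-wise XOR — a minimal sketch:

```python
from functools import reduce

def xor_parity(blocks):
    """Parity block = byte-wise XOR of the data blocks in one stripe."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def rebuild(surviving_blocks, parity):
    """Reconstruct the block on a failed disk: XORing the parity with the
    surviving blocks cancels them out, leaving the lost block."""
    return xor_parity(surviving_blocks + [parity])

stripe = [b"\x11\x22", b"\x33\x44", b"\x55\x66", b"\x77\x88"]
parity = xor_parity(stripe)
recovered = rebuild(stripe[:2] + stripe[3:], parity)   # disk 2 "failed"
```

Here `recovered` equals the lost block `stripe[2]`; the same routine rebuilds a replaced disk stripe by stripe.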

  RAID 3 uses only one parity disk, so array utilization is high; combined with its parallel-access characteristic, it can provide high read/write bandwidth for large transfers, making it suitable for applications that sequentially access large amounts of data, such as image processing and streaming-media services. Today, RAID 5 algorithms have improved to the point that they can match RAID 3 when reading large amounts of data, while RAID 3 performance degrades when a disk fails, so RAID 5 is often used in place of RAID 3 for applications characterized by continuous, high-bandwidth, high-volume reads and writes.

Each RAID 3 stripe is typically sized to match one file-system block, with the stripe depth depending on the number of disks; the minimum depth is one sector, so each Segment is usually one sector or a few sectors in size.

  Worked example: the RAID 3 mechanism

  Take a RAID 3 system with four data disks and one parity disk, a Segment SIZE of two sectors (1KB), and therefore a stripe length of 4KB.

  Suppose the RAID 3 controller receives this IO: write 8 sectors starting at sector 10000, i.e. 8 × 512B = 4KB in total.

  The controller maps LBA 10000 to the real physical LBA. If LBA 10000 happens to be the first sector of the first Segment of a stripe, the controller writes the first two 512B pieces of the IO's data into the two sectors of that Segment, and likewise fills Segments 2, 3 and 4 — exactly the 4KB of data. In other words, this IO's 4KB is written simultaneously to the four disks, each disk receiving two sectors, i.e. one Segment. They are written in parallel, the parity disk included, so the parity disk is not a bottleneck in RAID 3 — although there is some latency, because of the added cost of computing the checksum.

  If the IO SIZE is larger than the stripe length — say the controller receives a 16KB IO — then since the controller can write 4KB in parallel at a time, the 16KB must go out over 4 stripes, in what looks like four batches. In fact these are not four separate writes but one simultaneous write: the controller writes KB 1, 5, 9 and 13 of the 16KB contiguously to disk 1, KB 2, 6, 10 and 14 to disk 2, and so on, until all 16KB is written in parallel in one pass. This way the parity disk's check values can be computed once and written in parallel together with the data, rather than in batches.
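The column-wise distribution just described can be sketched as a splitting function (byte-sized Segments here, purely for brevity):

```python
def split_for_raid3(data, n_data_disks, stripe_len):
    """Split an IO into per-disk write buffers for RAID 3 full-stripe
    writes.

    stripe_len: bytes written across all data disks per stripe, so the
    Segment size is stripe_len // n_data_disks. A 16KB IO on 4 disks
    with a 4KB stripe sends KB 1,5,9,13 to disk 1, KB 2,6,10,14 to
    disk 2, and so on."""
    seg = stripe_len // n_data_disks
    per_disk = [bytearray() for _ in range(n_data_disks)]
    for off in range(0, len(data), seg):
        disk = (off // seg) % n_data_disks     # Segments rotate over disks
        per_disk[disk] += data[off:off + seg]
    return [bytes(b) for b in per_disk]

# 16 bytes standing in for 16KB, 1-byte Segments standing in for 1KB ones:
buffers = split_for_raid3(bytes(range(16)), 4, 4)
```

Each buffer is then handed to its disk in one contiguous write, which is why the whole 16KB goes out in a single parallel pass.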

 

5、RAID 4

  RAID 4 is not common. Its principle is that a layer above the disk controller driver — the file system — scans the target LBAs of queued IOs and allows IOs with matching targets to be written concurrently. That is, it tries to place the writes of two different IO transactions into the same stripe to improve write efficiency. The most typical design is NetApp's renowned WAFL file system, which ensures this optimization is achieved as far as possible: WAFL always tries to combine data blocks so that they are written into the same stripe at the same time, eliminating the write penalty and increasing the IO concurrency factor.

6、RAID 5

  First, a few concepts:

  Full-stripe write (Full-Stripe Write): modifies all the stripe units of a parity group at once, so the new XOR parity can be computed from the new data alone, with no additional reads or writes. Full-stripe write is therefore the most efficient kind of write. RAID 2 and RAID 3 are examples of full-stripe writes: every IO on them is guaranteed to occupy almost all the disks, so every Segment of every stripe being written is updated, and the controller can compute the parity directly from the new data; the data is written to the data disks at the same time as the computed parity is written to the parity disk.

  Reconstruct write (Reconstruct Write): if the number of disks to be written exceeds half the disks in the array, reconstruct-write mode can be used. In a reconstruct write, the controller reads the old data from the Segments of the stripe that are not being modified, computes the new parity by XORing that data together with the new data for the Segments being modified, and then writes the new data Segments and the new parity value together. Clearly, a reconstruct write involves more I/O operations than a full-stripe write, so it is less efficient.

  Read-modify write (Read-Modify Write): if the number of disks to be written is less than half the disks in the array, read-modify-write mode can be used. The process: read the old data from the Segments to be modified, and read the old parity value from the stripe; compute the new parity from the old data, the old parity, and the new data for the Segments being modified; finally write the new data and the new parity value. Because the process includes a read cycle, a modify step and a write cycle, it is called read-modify-write. The formula for the new parity is: new parity = (new data XOR old data) XOR old parity. If more than half of the Segments in the stripe are to be updated, read-modify-write is no longer suitable, because it would have to read the data and parity of every Segment being updated; with a reconstruct write, only the Segments not being updated need to be read, and there are fewer of those. So: update more than half — use reconstruct write; update less than half — use read-modify-write; update the whole stripe — use full-stripe write.

  Write efficiency, in descending order: full-stripe write > reconstruct write > read-modify-write.
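The three modes differ only in which blocks they must read; all of them arrive at the same parity. A sketch with one-byte Segments:

```python
from functools import reduce

def xor(*vals):
    """XOR an arbitrary number of byte values together."""
    return reduce(lambda a, b: a ^ b, vals)

# A stripe of 4 data Segments (one byte each for brevity) plus its parity.
old = [0x11, 0x22, 0x33, 0x44]
old_parity = xor(*old)

# Update Segment 0 only (fewer than half the disks): read-modify-write
# reads the old Segment and the old parity, no other disks.
new0 = 0x99
rmw_parity = xor(new0, old[0], old_parity)    # (new ^ old) ^ old parity

# Update Segments 0-2 (more than half): reconstruct write reads only the
# untouched Segment 3 and XORs it with the new data.
new012 = [0x99, 0xaa, 0xbb]
reconstruct_parity = xor(*new012, old[3])

# Update everything: full-stripe write, parity from the new data alone.
full_parity = xor(0x99, 0xaa, 0xbb, 0xcc)
```

All three expressions equal the straightforward XOR of the stripe's final contents; the mode only decides how many reads are needed to get there.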

  RAID 5 solves the parity-disk contention problem by distributing parity: the parity Segments are spread across all the disks in the RAID group. Every stripe has one parity Segment, but its position differs from stripe to stripe, rotating cyclically between adjacent stripes. To ensure concurrent IO, RAID 5 likewise uses a fairly large stripe size, so that a single IO's data does not fill the whole stripe and force the other IOs in the queue to wait. RAID 5 thus sustains a high concurrency rate; but whenever an IO cannot be served concurrently, it is handled almost always in read-modify-write mode, so RAID 5 carries a higher write penalty.
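The rotation can be expressed as a one-line mapping from stripe number to parity disk. The left-symmetric layout below is one common convention, assumed here for illustration:

```python
def parity_disk(stripe, n_disks):
    """Which disk holds the parity Segment of a given stripe, under a
    left-symmetric RAID 5 layout (an assumed but common convention):
    parity starts on the last disk and rotates backwards each stripe."""
    return (n_disks - 1) - (stripe % n_disks)

# On a 4-disk group, parity cycles 3, 2, 1, 0, 3, 2, ... over the stripes,
# so no single disk becomes the parity bottleneck that RAID 4 has.
layout = [parity_disk(s, 4) for s in range(8)]
```

Because the parity load is spread evenly, two IOs whose data and parity Segments land on disjoint disk pairs can proceed concurrently — the scenario analyzed next.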

  Let's analyze the concrete mechanism of RAID 5. Assume a stripe size of 80KB and a Segment size of 16KB. At some moment the upper layer issues a write IO: write 8 sectors starting at sector 10000, i.e. 4KB of data. After receiving the IO, the controller first maps the LBA to the real physical LBA. Suppose the target is the first sector of the second Segment of the first stripe (located on disk 2). The controller first issues a read IO to that disk, reading the 8 sectors of old data into the Cache. At the same time, the controller issues a read IO to the disk holding that stripe's parity Segment, reading the corresponding parity sectors into the Cache. An XOR circuit then computes the new parity using the formula: new parity = (new data XOR old data) XOR old parity. The Cache now holds the old data, the new data, the old parity and the new parity. The controller immediately issues simultaneous write IOs to the corresponding disks, writing the new data to the data Segment and the new parity to the parity Segment, and then discards the old data and old parity.

  Throughout the process above, the IO occupied only two disks — the data Segment and its corresponding parity Segment happened to be on disks 2 and 1 — and never touched any other disk. Now suppose there is another IO in the queue whose target LBA falls in a Segment on disk 4, whose length does not exceed one Segment, and whose stripe's parity Segment is on disk 3. Those two disks are not occupied by any other IO, so the controller can process this IO and the previous one at the same time, achieving concurrency.

  Compared with RAID 4, RAID 5 is specially optimized to achieve this concurrency at the bottom layer, without needing any intervention from the file system.

7、RAID 6

  Under every RAID level before RAID 6, data remains accessible when at most one disk fails; if two disks fail at once, the data is lost. RAID 6 was created to raise the safety margin of RAID 5. RAID 6 adds one more parity value per stripe compared to RAID 5, likewise distributed across the disks, but computed with a different equation. Each RAID 6 stripe therefore stores two mathematically independent parity values, which makes it possible, even when two disks fail, to recover the lost data by solving the two simultaneous equations. Compared with RAID 5, every RAID 6 write must read or write one additional parity value; but since the operations run in parallel, it is not much slower than RAID 5. Its other characteristics are similar to RAID 5.
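A minimal sketch of the two-equation idea, assuming the GF(2^8) generator-weighting scheme that many RAID 6 implementations use (real controllers use optimized table-driven versions of this arithmetic): P is the plain XOR, Q weights each disk differently, and together they recover two lost bytes per stripe position.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) with the 0x11d reduction polynomial."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
        b >>= 1
    return p

def gf_pow(a, n):
    """Exponentiation in GF(2^8); a**254 is the multiplicative inverse."""
    r = 1
    while n:
        if n & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        n >>= 1
    return r

def pq_parity(data):
    """P = plain XOR; Q weights disk i by g**i (g = 2), giving a second,
    independent equation per stripe."""
    p = q = 0
    for i, d in enumerate(data):
        p ^= d
        q ^= gf_mul(gf_pow(2, i), d)
    return p, q

def recover_two(data, x, y, p, q):
    """Solve the two parity equations for the bytes lost on disks x, y."""
    a, b = p, q
    for i, d in enumerate(data):
        if i not in (x, y):
            a ^= d
            b ^= gf_mul(gf_pow(2, i), d)
    # Now a = dx ^ dy and b = g^x*dx ^ g^y*dy; eliminate dy, then back-solve.
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    dx = gf_mul(b ^ gf_mul(gy, a), gf_pow(gx ^ gy, 254))   # divide by gx ^ gy
    return dx, a ^ dx

stripe = [0x11, 0x22, 0x33, 0x44]
p, q = pq_parity(stripe)
dx, dy = recover_two(stripe, 1, 3, p, q)   # disks 1 and 3 "failed"
```

A single failed disk is still recovered from P alone, exactly as in RAID 5; Q is only consulted when two disks are down.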

Reproduced from: https://www.cnblogs.com/jiweilearn/p/9502095.html


Origin: blog.csdn.net/weixin_30764883/article/details/94797051