SQL database storage management -IO


 

 

1 Disks

Basic disk parameters

1.1 Capacity

Capacity specifications also include per-platter capacity, that is, the capacity of a single platter inside the drive. The larger the per-platter capacity, the lower the cost per unit of storage and the shorter the average access time. A drive sold as 500 GB shows less than 500 GB of usable capacity, because manufacturers convert with 1 MB = 1000 KB while operating systems usually count in multiples of 1024, so a new drive always appears slightly smaller than its labeled size.
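The gap between labeled and reported capacity can be checked with a few lines of Python. This is only a sketch: the function name is made up for illustration, and it assumes the label uses decimal gigabytes (10^9 bytes) while the operating system reports binary units (2^30 bytes).

```python
def labeled_to_reported_gib(labeled_gb: float) -> float:
    """Convert a decimal label (1 GB = 10**9 bytes) to binary units (2**30 bytes)."""
    return labeled_gb * 10**9 / 2**30


# A drive labeled 500 GB, as in the example above.
print(round(labeled_to_reported_gib(500), 2))  # 465.66
```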

1.2 Rotational speed (spindle speed)

The rotational speed is the speed of the drive's spindle motor, i.e. the maximum number of revolutions the platters can complete in one minute. It is one of the main parameters that indicate the grade of a drive and one of the key factors that determine the internal transfer rate, so to a large extent it directly affects the speed of the drive. The faster the platters spin, the faster the drive locates data and the higher its transfer rate. Rotational speed is expressed in RPM (revolutions per minute, r/min). The higher the RPM, the faster the internal transfer rate, the shorter the access time, and the better the overall performance of the drive.

1.3 Transfer rate

The data transfer rate is the speed at which the drive reads and writes data, in megabytes per second (MB/s). It consists of an internal transfer rate and an external transfer rate.

The internal transfer rate, also known as the sustained transfer rate, reflects the performance of the drive when its buffer is not used. It depends mainly on the rotational speed of the drive.

The external transfer rate, also called the burst data transfer rate or interface transfer rate, is the nominal rate of data transfer between the system bus and the drive's buffer. It depends on the interface type and the size of the drive's cache.

Summary: the internal transfer rate is the bottleneck of the overall transfer rate. This figure is generally not supplied when you buy a drive, so you need to measure it yourself.

1.4 Average seek time

The average seek time is the average time the head takes, after the drive receives a command, to move from its starting position to the track that holds the requested data. In other words, it is the time needed to locate the target data after the computer issues an addressing command, measured in milliseconds (ms).

Generally, the higher the rotational speed of the disk, the shorter the average seek time.

10K RPM drives currently: minimum 5 ms, average 4 ms.

15K RPM drives currently: minimum 4 ms, average 3.5 ms.

1.5 Average rotational latency

The time spent waiting for the platter to rotate until the target sector is under the head is called rotational latency.

For a 7200 r/min drive, one revolution takes 60 × 1000 ÷ 7200 = 8.33 ms, so the average rotational latency is 8.33 ÷ 2 = 4.17 ms (on average, half a revolution is needed).

For a 10000 r/min drive, one revolution takes 60 × 1000 ÷ 10000 = 6 ms, so the average rotational latency is 6 ÷ 2 = 3 ms (on average, half a revolution is needed).
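The same arithmetic can be written as a small Python helper. This is an illustrative sketch; the function name is an assumption, and the 15000 r/min case is added only as a further data point using the same formula.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency: half of the time of one revolution, in ms."""
    ms_per_revolution = 60 * 1000 / rpm
    return ms_per_revolution / 2


for rpm in (7200, 10000, 15000):
    print(rpm, round(avg_rotational_latency_ms(rpm), 2))
```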

 

 

1.6 Hard disk interface

The interface is the connection between the hard disk and the host system. Its job is to transfer data between host memory and the disk cache. The interface type determines the speed of the link between the drive and the rest of the computer, so its quality directly affects how fast programs run and how well the system performs.

 

 

 

1.7 Summary

Of the parameters above, the two that matter most are seek time and rotational latency. Both depend heavily on rotational speed, so for a database the rotational speed of the drive is the most important parameter.

2 IOPS and throughput

2.1 Overview

Throughput is measured in MB/s: the number of bytes transferred from storage within one second. OLAP (Online Analytical Processing) systems, sometimes also called DSS (decision support systems) and commonly known as data warehouses, care mainly about throughput.

IOPS is measured in IO/s: the number of I/O operations the disk can perform per second. OLTP (Online Transaction Processing) systems care mainly about IOPS.

Both MB/s and IOPS have a large impact on database performance. OLTP databases, which perform random reads and writes, depend mainly on IOPS; OLAP analytical databases, which perform large sequential reads, depend mainly on MB/s.

 

IOPS and throughput are the two main measures of storage performance. IOPS is the number of I/O operations the storage completes per second; throughput is the amount of data it transfers per second. Each can describe storage performance, but they apply to different scenarios. The two are also related: throughput is roughly IOPS multiplied by the I/O block size.

[Figure: relationship between IOPS and throughput]

In short, on the same storage, the larger the block of a single I/O, the better the throughput, which suits OLAP. The smaller the block of a single I/O, the higher the IOPS, which suits OLTP.
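The link between the two measures can be sketched in Python. The function name and the sample IOPS figures below are hypothetical, chosen only to illustrate why small blocks favor IOPS and large blocks favor throughput.

```python
def throughput_mb_per_s(iops: float, io_size_kb: float) -> float:
    """Throughput = IOPS x I/O block size, converted from KB/s to MB/s."""
    return iops * io_size_kb / 1024


# Hypothetical figures: many small I/Os versus fewer large I/Os.
print(throughput_mb_per_s(120, 8))   # small 8K blocks, OLTP-like
print(throughput_mb_per_s(100, 64))  # larger 64K blocks, OLAP-like
```

Even with fewer operations per second, the larger block size moves more data per second.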

2.2 Single-disk IOPS calculation

IOPS = 1 / IO Time

IO Time = Seek Time + 60 sec / Rotational Speed / 2 + IO Chunk Size / Transfer Rate

Seek Time = average seek time

Rotational Speed = rotational speed of the disk (RPM)

Transfer Rate = data transfer rate

IOPS of a 10K RPM disk for different I/O sizes (seek time 5 ms, transfer rate 40 MB/s):

IO Time 8K = 5 ms + (60 sec / 10000 RPM / 2) + 8K / 40 MB/s = 5 + 3 + 0.2 = 8.2 ms

IOPS 8K = 1 / 8.2 ms ≈ 122 IOPS

IO Time 16K = 5 ms + (60 sec / 10000 RPM / 2) + 16K / 40 MB/s = 5 + 3 + 0.4 = 8.4 ms

IOPS 16K = 1 / 8.4 ms ≈ 119 IOPS

IO Time 32K = 5 ms + (60 sec / 10000 RPM / 2) + 32K / 40 MB/s = 5 + 3 + 0.8 = 8.8 ms

IOPS 32K = 1 / 8.8 ms ≈ 114 IOPS

IO Time 64K = 5 ms + (60 sec / 10000 RPM / 2) + 64K / 40 MB/s = 5 + 3 + 1.6 = 9.6 ms

IOPS 64K = 1 / 9.6 ms ≈ 104 IOPS
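The single-disk formula can be sketched in Python as follows. The function and parameter names are assumptions made for illustration. The code does not round the transfer time to one decimal the way the hand calculation does, so its results can differ from the figures above by about one IOPS.

```python
def io_time_ms(seek_ms: float, rpm: float, io_size_kb: float,
               transfer_mb_per_s: float) -> float:
    """IO Time = seek time + average rotational latency + transfer time (ms)."""
    rotational_ms = 60 * 1000 / rpm / 2
    transfer_ms = io_size_kb / (transfer_mb_per_s * 1024) * 1000
    return seek_ms + rotational_ms + transfer_ms


def single_disk_iops(seek_ms: float, rpm: float, io_size_kb: float,
                     transfer_mb_per_s: float) -> float:
    """IOPS = 1 / IO Time, with IO Time expressed in milliseconds."""
    return 1000 / io_time_ms(seek_ms, rpm, io_size_kb, transfer_mb_per_s)


# 10K RPM disk, 5 ms seek time, 40 MB/s transfer rate, as in the text.
for size_kb in (8, 16, 32, 64):
    print(f"{size_kb}K: {single_disk_iops(5, 10000, size_kb, 40):.1f} IOPS")
```

The output shows that IOPS falls only slowly as the block size grows, because seek time and rotational latency dominate the total I/O time.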

2.3 RAID IOPS calculation

This section covers the RAID write penalty and how to calculate IOPS with it.

Discussions of the performance of different RAID levels usually conclude that RAID-1 offers good read and write performance; RAID-5 reads well but writes more slowly than RAID-1; RAID-6 offers a higher protection level but even poorer write performance; and RAID-10 offers the best combination of performance and data protection at the highest cost. The factor behind these differences is simple: the RAID write penalty. This section explains where the write penalty of each RAID level comes from and how to use it to calculate the available IOPS.

2.3.1 RAID5 IOPS calculation

Storage planning has two basic considerations: performance and capacity. Performance requirements can be further divided into IOPS and bandwidth. When calculating IOPS, set aside the array cache and the front-end ports and look only at the back-end physical disks. The IOPS available from the back end cannot be obtained simply by adding up the maximum IOPS of the physical disks. The reason is that, to make sure data can be recovered when a physical disk fails, each RAID level does extra work on writes. In RAID-5, for example, any change to the data in a stripe requires the parity to be recalculated. Consider a 7+1 RAID-5 stripe: seven disks store data and the last disk stores parity.

Assume the new data 1111 is written to the fifth disk, whose current content is 0110, and that the current parity is 0010. The complete RAID-5 write then consists of the following steps:

1. Read the original data 0110 and XOR it with the new data 1111: 0110 XOR 1111 = 1001.

2. Read the old parity 0010.

3. XOR the result of step 1 with the old parity: 0010 XOR 1001 = 1011.

4. Write the new data 1111 to the data disk, and write the new parity 1011 from step 3 to the parity disk.

As these steps show, every single write requires two reads and two writes on the back end, so the write penalty of RAID-5 is 4.
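The parity update described above can be reproduced with integer XOR in Python. This is a minimal sketch of the read-modify-write arithmetic only, not of a real RAID implementation; the function name is hypothetical.

```python
def raid5_new_parity(old_data: int, new_data: int, old_parity: int) -> int:
    """Compute the new parity for a RAID-5 small write.

    Back-end I/O implied: read old data, read old parity,
    write new data, write new parity (2 reads + 2 writes).
    """
    delta = old_data ^ new_data   # XOR of old and new data
    return old_parity ^ delta     # XOR of that result with the old parity


new_parity = raid5_new_parity(0b0110, 0b1111, 0b0010)
print(format(new_parity, "04b"))  # 1011
```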

 

Write penalty of the different RAID levels

The following table lists the write penalty for each RAID level:

RAID level    Write penalty
0             1
1             2
5             4
6             6
10            2

 

 

RAID-0: plain striping. Each write goes to the corresponding physical disk exactly once, so the write penalty is 1.

RAID-1 and RAID-10: the write penalty is easy to understand. Because the data is mirrored, every write is performed twice, so the write penalty is 2.

RAID-5: because parity must be maintained, each write takes four steps: read the data, read the parity, write the data, and write the parity. The write penalty is therefore 4.

RAID-6: because there are two parity blocks, each write must read parity twice and write parity twice in addition to reading and writing the data, so the write penalty is 6.

 

 

Calculating IOPS:

Based on the description above, the RAID write penalty has to be included when the actually available IOPS is calculated during storage design. The formulas are:

Total physical disk IOPS = IOPS of one physical disk × number of physical disks

Available IOPS = (total physical disk IOPS × write percentage ÷ RAID write penalty) + (total physical disk IOPS × read percentage)

Assume a RAID-5 group whose physical disks provide 500 IOPS in total and an application whose read/write ratio is 50% / 50%. The IOPS actually available to the front-end host is then:

(500 × 50% ÷ 4) + (500 × 50%) = 312.5 IOPS

In general, once usage reaches about 75% of the total IOPS, I/O performance problems start to appear.
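The available-IOPS formula can be sketched in Python. The function name and the dictionary are illustrative assumptions; the penalty values are the ones listed in the table above.

```python
# Write penalty per RAID level, as listed in the table above.
WRITE_PENALTY = {"RAID0": 1, "RAID1": 2, "RAID5": 4, "RAID6": 6, "RAID10": 2}


def available_iops(total_disk_iops: float, write_fraction: float,
                   write_penalty: int) -> float:
    """Available IOPS = total * write% / penalty + total * read%."""
    read_fraction = 1 - write_fraction
    return (total_disk_iops * write_fraction / write_penalty
            + total_disk_iops * read_fraction)


# 500 total physical disk IOPS with a 50% / 50% read/write ratio.
print(available_iops(500, 0.5, WRITE_PENALTY["RAID5"]))   # 312.5
print(available_iops(500, 0.5, WRITE_PENALTY["RAID10"]))  # 375.0
```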

2.3.2 RAID5 compared with RAID10


Per-disk IOPS under RAID10 and RAID5

Suppose the business workload is 10000 IOPS, the read cache hit rate is 30%, reads are 60% of the I/O and writes are 40%, and there are 120 disks. Calculate the IOPS each disk must deliver under RAID5 and under RAID10.

RAID5:

Per-disk IOPS = (10000 * (1 - 0.3) * 0.6 + 4 * (10000 * 0.4)) / 120

= (4200 + 16000) / 120

= 168

Here 10000 * (1 - 0.3) * 0.6 is the read IOPS: reads are 60% of the load, and after cache hits are removed only 4200 IOPS actually reach the disks.

4 * (10000 * 0.4) is the write IOPS: in RAID5 each write causes four physical I/Os, so writes account for 16000 IOPS.

The two reads that belong to each RAID5 write may also hit the cache, so a more accurate calculation is:

Per-disk IOPS = (10000 * (1 - 0.3) * 0.6 + 2 * (10000 * 0.4) * (1 - 0.3) + 2 * (10000 * 0.4)) / 120

= (4200 + 5600 + 8000) / 120

= 148

The result is 148 IOPS per disk, which is about the limit of a single disk.

RAID10:

Per-disk IOPS = (10000 * (1 - 0.3) * 0.6 + 2 * (10000 * 0.4)) / 120

= (4200 + 8000) / 120

= 102

In RAID10 a write causes only two physical I/Os, so under the same load and with the same disks, each disk handles only 102 IOPS, far below the limit of a single disk.
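The three per-disk calculations above can be reproduced with one Python function. This is a sketch under the assumptions of the worked example; the function and parameter names are invented for illustration, and a RAID5 write is modeled as 2 reads + 2 writes while a RAID10 write is modeled as 0 reads + 2 writes.

```python
def per_disk_iops(business_iops: float, read_fraction: float,
                  write_fraction: float, cache_hit_rate: float,
                  reads_per_write: int, writes_per_write: int,
                  disks: int, write_reads_use_cache: bool = False) -> float:
    """Back-end IOPS each disk must serve for a given front-end load."""
    # Front-end reads that miss the cache reach the disks.
    reads = business_iops * read_fraction * (1 - cache_hit_rate)
    # Optionally let the reads that belong to a write hit the cache too.
    miss = (1 - cache_hit_rate) if write_reads_use_cache else 1.0
    writes = business_iops * write_fraction * (reads_per_write * miss
                                               + writes_per_write)
    return (reads + writes) / disks


raid5 = per_disk_iops(10000, 0.6, 0.4, 0.3, 2, 2, 120)
raid5_cached = per_disk_iops(10000, 0.6, 0.4, 0.3, 2, 2, 120,
                             write_reads_use_cache=True)
raid10 = per_disk_iops(10000, 0.6, 0.4, 0.3, 0, 2, 120)
print(round(raid5), round(raid5_cached), round(raid10))  # 168 148 102
```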

 

A detailed comparison of RAID5 and RAID10

RAID5 and RAID10 are the mainstream disk array schemes today. How do they differ? The following looks at how each one works internally to determine which RAID level is right for which situation.

To make the comparison fair, both use the same number of disks: a 3D + 1P scheme for RAID5 and a 2D + 2D scheme for RAID10.

[Figure: RAID5 with a 3D + 1P scheme and RAID10 with a 2D + 2D scheme]

We analyze three processes below: reads, sequential writes, and discrete (random) writes. Before looking at them, one particularly important concept needs to be introduced: the cache.

Cache technology has developed very quickly in disk storage in recent years. In high-end storage the cache is at the heart of the whole array; low-end and midrange storage also carries a large cache; and even the simplest RAID card generally has tens or even hundreds of megabytes of cache.

What is the cache for? It helps both reads and writes. For writes, the operation is generally considered complete as soon as the data has been written to the array's cache, so writes to the array are very fast. Only when the write cache has accumulated a certain amount of data does the array flush it to disk, which makes batched writes possible. Protection of the cached data generally relies on mirroring and a battery (or UPS).

The read side of the cache matters just as much. A read that hits the cache saves a disk seek. A disk access that starts with a seek generally takes more than 6 ms, which may not be acceptable for I/O-intensive applications. A read that hits the cache generally responds in less than 1 ms.

Do not put blind faith in the IOPS (I/O operations per second) figures published by storage vendors. They may all have been measured with cache hits, while in practice your cache hit rate may be only 10%.

With the cache introduced, we can now explain how efficiently RAID5 and RAID10 work in each mode by going through the three processes listed above.

1. Reads

In both RAID5 and RAID10 every disk can serve reads, so there is basically no difference between them for reads, unless the way the data is read affects the cache hit ratio and the two end up with different hit rates.

 

2. Sequential writes

Sequential writes generally mean writing large amounts of contiguous data, such as media streams or large files. For this kind of write, if a write cache is present and the algorithm is sound, RAID5 can even be better than RAID10 (assuming the array has a write cache of sufficient size and the CPU that computes parity is not a bottleneck). Because parity is computed in the cache, a 4-disk RAID5 can compute the parity in memory and write 3 data blocks + 1 parity block at the same time, while RAID10 can write only 2 data blocks + 2 mirror blocks at the same time.

[Figure: sequential write]

For example, a 4-disk RAID5 can take 1, 2, and 3 into the cache at the same time, compute the parity in the cache, here assumed to be 6 (real parity is not computed this way; this is only an illustration), and then write the three data blocks and the parity to the disks together. A 4-disk RAID10, with or without cache, writes two data blocks and their two mirrors at the same time.

However, as noted above, a write cache only holds writes for a while before flushing them to disk. Unlike reads, writes must happen sooner or later: the data eventually has to land on disk. So unless the workload is a sustained, heavy sequential write that reaches the write limit of the disks, the difference between the two is not large.

3. Discrete writes

This part is probably the hardest to understand, but it is the most important one for databases. In an Oracle database, for example, most writes are discrete: each one writes a single data block, e.g. 8K. Online log writes may look sequential, but because each write is small there is no guarantee that it fills a whole RAID5 stripe (so that every disk is written), so they too often behave as discrete writes.

[Figure: discrete write]

Following the figure, suppose the number 2 is changed to 4. RAID5 then performs four I/Os: read the old data 2 and the old parity 6 (these reads may hit the cache), compute the new parity, then write the new data 4 and the new parity 8.

For RAID10, the same single operation ends up as only two I/Os, while RAID5 needs four.

This ignores that the two read operations in the RAID5 write may hit the cache. If the data to be read is already in the cache, four I/Os may not be needed. That shows how important the cache is to RAID5: it is needed not only to compute parity but also to improve performance. I once tested a RAID5 array with the write cache turned off, and its performance was many times worse.

This is not to say the cache is unimportant for RAID10: write buffering and read hits are key to speed in both cases. RAID10 simply does not depend on the cache as heavily as RAID5 does.

By now the theoretical difference between RAID5 and RAID10 should be broadly clear. In general, for small-I/O workloads such as databases we recommend RAID10; for large-file storage and data warehouses, RAID5 can be used for the sake of space utilization.

For a large volume of transactional operations, i.e. a typical OLTP environment, RAID10 is the better choice, because I/O performance is the main concern in OLTP. For a typical data warehouse or OLAP environment, RAID5 is more appropriate because of its better space utilization.

 

 

3 Storage considerations

3.1 Use more disks

In RAID, many disks are always faster than few disks. For example, if a 4 TB volume is required, ten 400 GB drives perform better than two 2 TB drives. In some cases the total throughput is approximately equal to the sum of the throughput of the individual disks. Assuming that a 400 GB drive and a 2 TB drive both deliver 20 MB/s, the ten 400 GB drives provide 200 MB/s in total while the two 2 TB drives provide 40 MB/s: the 400 GB combination is five times as fast as the 2 TB combination.

3.2 Use high-speed disks

High-speed drives clearly perform better than low-speed drives. The theoretical speed of a 15K RPM drive is 250 MB/s, but in practice non-sequential, random reads reach only 2-4 MB/s, and bulk sequential reads do not exceed 50-60 MB/s.

3.3 Disk testing

You can use the Performance Monitor built into Windows, or IOMeter, and measure three counters: Avg. Disk sec/Read, Avg. Disk sec/Write, and Avg. Disk sec/Transfer. Target values:

Database transaction log: no more than 5 ms, ideally 0 ms.

OLTP data: less than 10 ms.

OLAP and reporting systems: no more than 25 ms.

 

Monitoring physical disk I/O latency

At the Windows level, physical disk I/O latency is analyzed mainly with Performance Monitor counters. Three counters are used:

Avg. Disk sec/Transfer: the average time of each disk access.

Avg. Disk sec/Read: the average time of each disk read.

Avg. Disk sec/Write: the average time of each disk write.

Avg. Disk sec/(Transfer, Read, Write) reflect the speed of disk I/O well. The recommended baseline for judging disk I/O speed is:

Very fast: < 10 ms

Average: 10-20 ms

Somewhat slow: 20-50 ms

Very slow: > 50 ms
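The baseline above can be applied to collected counter values with a small Python helper. This is only a sketch: the function name is hypothetical, it assumes the counter value is given in seconds (as the counter names suggest), and it assigns the boundary values of 20 ms and 50 ms to the faster band, which the baseline itself does not specify.

```python
def classify_disk_latency(avg_disk_sec_per_transfer: float) -> str:
    """Map an Avg. Disk sec/Transfer value (in seconds) to the baseline bands."""
    ms = avg_disk_sec_per_transfer * 1000
    if ms < 10:
        return "very fast"
    if ms <= 20:
        return "average"
    if ms <= 50:
        return "somewhat slow"
    return "very slow"


# Hypothetical sample values, in seconds.
for sample in (0.004, 0.015, 0.035, 0.080):
    print(sample, classify_disk_latency(sample))
```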

 


Origin blog.csdn.net/syjhct/article/details/85645261