MySQL (InnoDB analysis): performance tuning (choice of CPU, importance of memory, impact of disk on database performance)


1. The choice of CPU

Database applications fall into two broad categories: OLTP (Online Transaction Processing) and OLAP (Online Analytical Processing). These are two very different kinds of database workloads:

  • OLAP is mostly used in data warehouses and data marts, and generally requires executing complex SQL statements for queries
  • OLTP is mostly used in day-to-day transaction processing applications such as bank transactions, online commerce, blogs, and online games. Compared with OLAP, the database volume is smaller

The InnoDB storage engine is generally used in OLTP database applications. The characteristics of this application are as follows:

  • A large number of concurrent user operations
  • Transactions that are generally short
  • Relatively simple queries that generally make use of indexes
  • Few complex queries

It can be seen that an OLTP database application does not in itself place very high demands on the CPU: the CPU-heavy operations such as comparison, sorting, and joins are needed by complex queries, and those rarely occur in OLTP applications. It can therefore be said that OLAP is CPU-intensive while OLTP is IO-intensive, so when purchasing equipment it is recommended to pay more attention to improving the IO configuration. In addition, to benefit from more memory, the CPU you purchase must support 64-bit operation, since otherwise a 64-bit operating system cannot be installed; choosing a 64-bit CPU is thus a necessary prerequisite for new applications. Four-core CPUs are now very common, Intel and AMD have successively launched 8-core CPUs, and in the future we may also see 128-core CPUs, which will require better support from the database.
From the perspective of the InnoDB storage engine's architecture, its main background operations are performed in a single master thread, so it historically could not exploit multiple cores well. The open source community has used various approaches to change this situation: InnoDB 1.0 already showed greatly improved multi-core processing performance in a variety of tests, and InnoDB 1.2 supports multiple purge threads and separates flush operations from the master thread. Therefore, if your CPU has multiple cores, the InnoDB version should be 1.1 or higher. In addition, on a multi-core CPU you can increase the number of IO threads through the parameters innodb_read_io_threads and innodb_write_io_threads, which also makes full and effective use of the CPU's cores. Note, however, that each individual query runs on only one CPU; InnoDB does not parallelize a single query across CPUs. Since OLTP operations are generally very simple, this has little impact on OLTP applications, and multiple CPUs or multi-core CPUs are still helpful for handling large numbers of concurrent requests.
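As a minimal sketch of the thread tuning just described (the value 8 is illustrative only; the default for each setting is 4, and both variables are read-only at runtime, so they must be set in the configuration file before startup):

```sql
-- Inspect the current InnoDB IO thread counts (read-only at runtime).
SHOW VARIABLES LIKE 'innodb_%io_threads';

-- To raise them on a multi-core machine, add lines such as the following
-- to my.cnf under [mysqld] and restart the server:
--   innodb_read_io_threads  = 8
--   innodb_write_io_threads = 8
```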

2. The importance of memory

Memory size is the most direct reflection of database performance. As introduced earlier, the InnoDB storage engine caches data as well as indexes, holding both in a large buffer pool, the InnoDB Buffer Pool. The amount of memory therefore directly affects database performance.
Performance test
Vadim, CTO of Percona, ran a test to demonstrate the importance of memory; the results are shown in the figure below:
[figure: sysbench TPS versus InnoDB buffer pool size]

In the above test, the total size of data plus indexes is 18GB; the buffer pool is then set in turn to 2GB, 4GB, 6GB, 8GB, 10GB, 12GB, 14GB, 16GB, 18GB, 20GB, and 22GB, and sysbench is run at each size. From the results we can see:

  • As the buffer pool grows, the measured TPS (Transactions Per Second) increases linearly
  • When the buffer pool reaches 20GB and 22GB, database performance improves dramatically, because the buffer pool is now larger than the data files themselves and all operations on the data files can take place in memory. Performance at this point should be optimal, and enlarging the buffer pool further will not improve database performance

Therefore, you should estimate the size of the "active" database before developing the application, and use that to determine how much memory the database server needs. Of course, to use large amounts of memory a 64-bit operating system is required.
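As a rough way to make that estimate, the data and index sizes recorded in information_schema can be summed per schema; note that these statistics are approximate, and the truly "active" working set is usually smaller than the total:

```sql
-- Approximate total size of data + indexes per schema, in GB.
SELECT table_schema,
       ROUND(SUM(data_length + index_length) / 1024 / 1024 / 1024, 2) AS total_gb
FROM information_schema.tables
GROUP BY table_schema
ORDER BY total_gb DESC;
```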
How do you judge whether memory has become the bottleneck for the current database? You can check the buffer pool hit rate from the current server status by comparing physical disk reads against reads served from memory. Normally the buffer pool hit rate of the InnoDB storage engine should not be lower than 99%, for example:

[figure: server status output showing the Innodb_buffer_pool_* counters]

The parameters relevant here have the following meanings:

  • Innodb_buffer_pool_read_requests: the number of logical read requests (pages read from the buffer pool)
  • Innodb_buffer_pool_read_ahead: the number of pages read into the buffer pool by the read-ahead background thread
  • Innodb_buffer_pool_reads: the number of logical reads that could not be satisfied from the buffer pool and had to be read directly from disk

The hit rate of the InnoDB cache pool can be calculated by the following formula:
Buffer pool hit rate = Innodb_buffer_pool_read_requests / (Innodb_buffer_pool_read_requests + Innodb_buffer_pool_read_ahead + Innodb_buffer_pool_reads)

In the example above, the buffer pool hit rate = 167051313 / (167051313 + 129236 + 0) = 99.92%.
If the hit rate is too low, you should consider adding memory and increasing the value of innodb_buffer_pool_size. Note that innodb_buffer_pool_size = innodb_buffer_pool_chunk_size * innodb_buffer_pool_instances * N, where N is a positive integer; if the value you set does not satisfy this formula, MySQL adjusts it automatically.
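A minimal sketch of checking the counters and resizing the pool, assuming MySQL 5.7 or later (where the buffer pool can be resized online; the 16GB figure is purely illustrative):

```sql
-- Read the three counters used in the hit-rate formula above.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';

-- Grow the buffer pool online; MySQL rounds the value up to a multiple of
-- innodb_buffer_pool_chunk_size * innodb_buffer_pool_instances.
SET GLOBAL innodb_buffer_pool_size = 16 * 1024 * 1024 * 1024;
```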

Even when the buffer pool is larger than the database files, that does not mean there are no disk operations. The buffer pool is only an area for caching hot data; background threads are still responsible for asynchronously flushing dirty pages to disk, and the redo log must be written to the redo log files every time a transaction commits.
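The commit-time redo log flush just mentioned is governed by the standard InnoDB parameter innodb_flush_log_at_trx_commit; a brief sketch of inspecting it (the trade-off in the comment reflects the documented behavior):

```sql
-- 1 (the default) flushes the redo log to disk at every commit for full
-- durability; 0 and 2 trade durability for speed, risking up to about a
-- second of committed transactions on a crash.
SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';
```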

3. The impact of disk on database performance

  • Traditional mechanical hard disks
    Most databases today still use traditional mechanical hard disks. Mechanical disk technology is very mature, and drives with SAS or SATA interfaces are generally used in the server field. Server mechanical disks have been moving toward smaller form factors, and most are now 2.5-inch SAS drives.
    Mechanical disks have two important metrics: seek time and rotational speed. Current server disks can reach seek times of about 3ms and speeds of 15000 RPM (rotations per minute). What distinguishes a mechanical disk is its read/write head: unlike tape, which can only be accessed sequentially, the head design allows random access. However, random access on a mechanical disk requires time-consuming head movement and positioning, so sequential access is much faster than random access, and many designs in traditional relational databases try to make full use of sequential access.
    Generally speaking, multiple mechanical disks can be combined into a RAID array to improve database performance, and data files can be distributed across different disks to balance the access load.
  • Solid-state drives
    Solid-state drives, more precisely flash-based solid-state drives, are a storage device that has risen to prominence in recent years. Internally they are built from flash memory (Flash Memory). Because flash offers low latency, low power consumption, and shock resistance, flash devices have been widely used in mobile devices. Enterprise applications generally use solid-state drives and further raise data throughput by connecting multiple flash chips in parallel. EMC, a traditional storage vendor, has begun offering TB-scale storage solutions based on flash SSDs, and the database vendor Oracle has recently begun offering Exadata servers bundled with solid-state drives.
    Unlike a traditional mechanical disk, flash memory is a purely electronic device with no read/write head. SSDs therefore do not need the time-consuming head movement and positioning of mechanical disks to locate data, and they can deliver consistent random access times. The fast read, write, and positioning characteristics of solid-state drives are well worth studying.
    On the other hand, data in flash cannot be updated in place; it can only be overwritten sector by sector (sectors), and before an overwrite a very time-consuming erase operation must be performed. The erase cannot be done on just the sector holding the data: it must be applied to an entire so-called erase block, which is larger than a sector, typically 128KB or 256KB. In addition, each erase block can be erased only a limited number of times. Algorithms already exist to mitigate these problems, but database applications still need to take the write characteristics of SSDs seriously.
    Because of these write issues, the read and write speeds of flash are asymmetric: reads are much faster than writes. When applying solid-state drives to databases, you should therefore exploit their read performance and avoid excessive writes. The figure below shows a dual-channel SSD architecture: by supporting 4-way interleaved flash access it reduces the SSD's access latency while increasing concurrent reads and writes, and by further increasing the number of channels, SSD performance can scale linearly; the familiar Intel X25-M, for example, is a 10-channel SSD.
    [figure: dual-channel solid-state drive architecture]

Since flash memory is a purely electronic device with no moving parts such as read/write heads, solid-state drives have low access latency. When the host issues a read or write request, the SSD controller maps the IO command from its logical address to the actual physical address, and write operations must additionally update the mapping table. Even counting these overheads, SSD access latency is generally below roughly 0.1ms. The figure below compares the random access latency of traditional mechanical disks, memory, and solid-state disks.
[figure: random access latency of mechanical disk vs. solid-state disk vs. memory]

For solid-state drives, the InnoDB storage engine can be tuned by increasing the innodb_io_capacity variable so that it makes full use of the high IOPS that SSDs provide, although users need to adjust this according to their own applications. In InnoSQL and in InnoDB 1.2, flushing of neighboring pages can be disabled, which can also improve database performance to some extent. In addition, you can use the L2 Cache solution developed by InnoSQL, which exploits the ultra-fast random read performance of SSDs by establishing a flash-based secondary buffer pool between the in-memory buffer pool and the traditional storage layer, thereby extending the buffer pool's capacity and improving database performance. Similar SSD cache solutions include Facebook Flash Cache and bcache, but these sit on a general-purpose file system layer and are less optimized for the InnoDB storage engine itself.
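A hedged sketch of the SSD-oriented settings discussed above, expressed with stock MySQL 5.6+ variable names (innodb_flush_neighbors = 0 is the standard MySQL counterpart of disabling neighbor-page flushing; the capacity values are illustrative and should be tuned to your hardware):

```sql
-- Raise the IOPS budget InnoDB assumes for background flushing
-- (the HDD-era default is only 200).
SET GLOBAL innodb_io_capacity     = 2000;
SET GLOBAL innodb_io_capacity_max = 4000;

-- SSDs have no seek penalty, so flushing neighboring dirty pages together
-- brings no benefit; disable it.
SET GLOBAL innodb_flush_neighbors = 0;
```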

Origin: blog.csdn.net/baidu_38956956/article/details/128453402