Performance and Scalability Optimizations for MySQL Large-Scale Databases

In an era where websites must serve mobile and tablet applications as well as conventional desktop browsers, the popularity and effectiveness of a website depends largely on its usability and performance. A slow website drives away visitors and potential customers and can lead to business failure; a reasonably fast one often determines whether visitors will use the product or service it offers.

Websites backed by massive databases always require proper attention, configuration, optimization, tuning, and maintenance to keep pages loading fast. This article discusses how to optimize MySQL databases that hold huge amounts of data.

Choosing InnoDB as the storage engine

Large-scale products place high demands on database reliability and concurrency. InnoDB, the default MySQL storage engine, is a better choice than MyISAM: it supports transactions, row-level locking, and crash recovery.
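
For example, converting an existing MyISAM table to InnoDB is a single statement (the table name orders here is hypothetical):

    -- Check the current storage engine of a table
    SHOW TABLE STATUS LIKE 'orders';

    -- Convert a MyISAM table to InnoDB
    ALTER TABLE orders ENGINE=InnoDB;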

Optimize database structure

  • Organize your database schema, tables, and fields to reduce I/O overhead, keep related items together, and plan ahead so that performance can remain high as your data volume grows.
  • Design tables so they occupy as little space as possible, and keep each table's primary key as short as possible.
  • For InnoDB tables, the primary key columns are duplicated in every secondary index entry, so a short primary key can save a lot of space when there are many secondary indexes (see the sketch after this list).
  • Create only the indexes you need to improve query performance. Indexes speed up retrieval but add to the execution time of insert and update operations.
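
A minimal sketch of these guidelines, using a hypothetical customers table: the 4-byte integer primary key is copied into each secondary index entry, so keeping it short keeps both secondary indexes small.

    CREATE TABLE customers (
        id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,  -- short primary key
        email VARCHAR(255) NOT NULL,
        city  VARCHAR(100),
        INDEX idx_email (email),   -- each entry stores (email, id)
        INDEX idx_city  (city)     -- each entry stores (city, id)
    ) ENGINE=InnoDB;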

InnoDB's Change Buffering Feature

InnoDB provides a change buffering option that reduces the disk I/O required to maintain secondary indexes. Large-scale databases may experience heavy DML activity and, with it, heavy I/O to keep secondary indexes up to date. When the relevant page is not in the buffer pool, InnoDB's change buffer caches changes to secondary index entries, avoiding the expensive I/O of reading those pages from disk immediately. The buffered changes are merged when the pages are later loaded into the buffer pool, and the updated pages are eventually flushed to disk. This improves performance and is available in MySQL 5.5 and later.
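
Change buffering is controlled by the innodb_change_buffering system variable, and innodb_change_buffer_max_size (MySQL 5.6 and later) caps how much of the buffer pool the change buffer may use; a brief sketch:

    -- Buffer inserts, deletes, and purges for secondary indexes
    -- ('all' is the default)
    SET GLOBAL innodb_change_buffering = 'all';

    -- Allow the change buffer to use up to 30% of the buffer pool
    -- (the default is 25%)
    SET GLOBAL innodb_change_buffer_max_size = 30;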

InnoDB page compression

InnoDB supports page-level compression of tables. When a data page is written, it is compressed with the specified compression algorithm; the compressed data is written to disk, and a hole-punching mechanism releases the empty blocks at the end of the page. If compression fails, the data is written out as-is. Both tables and indexes are compressed, and because indexes often make up a large portion of total database size, compression can yield significant savings in disk space, I/O, and memory, improving performance and scalability. It also reduces the amount of data transferred between memory and disk. This transparent page compression feature is supported in MySQL 5.7 and later.

Note that page compression does not support tables in shared tablespaces. Shared tablespaces include the system tablespace, temporary tablespaces, and general tablespaces.
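
A sketch of enabling transparent page compression on a hypothetical logs table (the table must live in a file-per-table tablespace, and the filesystem must support hole punching, e.g. ext4 or XFS on Linux):

    CREATE TABLE logs (
        id      BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        message TEXT
    ) ENGINE=InnoDB COMPRESSION='zlib';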

Using bulk data import

Using a data source sorted on the primary key speeds up the insertion process. Otherwise, rows may have to be inserted between existing rows to maintain ordering, which causes heavy disk I/O and page splits and hurts performance. Turning off autocommit mode also helps, since autocommit flushes the log to disk after every insert. Temporarily disabling unique key and foreign key checks during bulk inserts can likewise significantly reduce disk I/O. For newly created tables, the best practice is to add foreign key and unique key constraints after the bulk import.
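
A minimal session-level sketch of these settings around a bulk load (the file path is hypothetical, and the source data is assumed to be pre-sorted by primary key):

    SET autocommit = 0;          -- avoid a log flush after every insert
    SET unique_checks = 0;       -- skip uniqueness checks during the load
    SET foreign_key_checks = 0;  -- skip foreign key checks during the load

    LOAD DATA INFILE '/tmp/customers.csv' INTO TABLE customers;

    COMMIT;
    SET unique_checks = 1;
    SET foreign_key_checks = 1;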

SQL statement optimization

To improve query speed, add an index on the columns used in the WHERE clause. Also, avoid a primary key that spans too many columns or very long columns: long primary key values increase the I/O required for reads and waste cache, since the primary key is copied into every secondary index entry.

If an index contains unnecessary data, reading that data through I/O and caching it reduces server performance and scalability. Likewise, do not put a unique index on columns that do not require one, since a unique index disables change buffering; use a regular index instead.
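
For example, to index a column used in a WHERE clause as a regular (non-unique) index, and confirm with EXPLAIN that the index is actually used (table and column names are hypothetical):

    CREATE INDEX idx_status ON orders (status);
    EXPLAIN SELECT * FROM orders WHERE status = 'shipped';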

Reduce and isolate time-consuming function calls.

Minimize the number of full table scans in your queries.

Adjust the size and behavior of the cache areas, such as the InnoDB buffer pool and the MySQL query cache, so that repeated queries are answered from memory instead of disk.
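
For instance, on a server where the query cache is still available (it was removed in MySQL 8.0), these my.cnf settings enlarge the two caches just mentioned; the sizes are illustrative, not recommendations:

    [mysqld]
    innodb_buffer_pool_size = 8G    # often 50-75% of RAM on a dedicated server
    query_cache_type        = 1     # cache eligible SELECT results
    query_cache_size        = 128M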

Optimizing Storage Structures

For large tables, or tables that contain a lot of repetitive text or numeric data, consider using the COMPRESSED row format. It requires less I/O to fetch data into the buffer pool or to perform full table scans.
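
A sketch of a table created with the COMPRESSED row format (requires innodb_file_per_table=ON, and the Barracuda file format before MySQL 5.7; the table definition is hypothetical):

    CREATE TABLE articles (
        id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        body TEXT
    ) ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;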

Once your data reaches a stable size, or a growing table has gained tens or hundreds of megabytes, consider using the OPTIMIZE TABLE statement to reorganize the table and compact wasted space. A full table scan of the reorganized table requires less I/O.
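
For InnoDB, OPTIMIZE TABLE maps to ALTER TABLE ... FORCE, which rebuilds the table and compacts free space:

    OPTIMIZE TABLE orders;   -- orders is a hypothetical table name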

Optimizing InnoDB Disk I/O

Increasing the InnoDB buffer pool size lets more queries be served from the buffer pool rather than through disk I/O. How InnoDB flushes data to disk can be tuned with the innodb_flush_method system variable.
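
On Linux, a common choice is O_DIRECT, which bypasses the operating system cache and avoids double buffering between it and the InnoDB buffer pool; whether it helps depends on your hardware and workload:

    [mysqld]
    innodb_flush_method = O_DIRECT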

Consider RAID configurations or other fast storage devices.

Memory allocation for MySQL

Before allocating memory to MySQL, consider its memory requirements in different areas.

The key area to consider is concurrent connections: with a large number of concurrent connections, sorting and temporary tables can require a lot of memory. As of this writing, 16 GB to 32 GB of RAM is sufficient for a database handling 3000+ concurrent connections.

Memory fragmentation can consume about 10% or more of memory. Caches and buffers such as innodb_buffer_pool_size, key_buffer_size, and query_cache_size typically consume about 80% of the allocated memory.
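
As a purely illustrative split under the 80% guideline above, a dedicated 32 GB server might be configured along these lines (all values are assumptions to adapt, not recommendations):

    [mysqld]
    innodb_buffer_pool_size = 24G    # the bulk of the cache/buffer share
    key_buffer_size         = 512M   # MyISAM index cache
    query_cache_size        = 256M
    tmp_table_size          = 256M
    max_heap_table_size     = 256M
    sort_buffer_size        = 2M     # allocated per connection -- keep small
                                     # when handling thousands of connections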

Routine maintenance

Regularly check the slow query log and optimize the offending queries so that they use the caches efficiently and reduce disk I/O. Optimize them to scan the minimum number of rows instead of performing full table scans.
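
The slow query log itself is enabled with a few system variables; a sketch (the log file path is hypothetical):

    [mysqld]
    slow_query_log      = 1
    slow_query_log_file = /var/log/mysql/slow.log
    long_query_time     = 1   # log statements that take longer than 1 second
    log_queries_not_using_indexes = 1

The bundled mysqldumpslow utility can then summarize the log by query pattern.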

Other logs that can help DBAs check and analyze performance include the error log, the general query log, the binary log, and the DDL log (metadata log).

Regularly flush caches and buffers to reduce fragmentation, and use the OPTIMIZE TABLE statement to reorganize tables and compact any wasted space.
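
On versions that still have the query cache, FLUSH QUERY CACHE defragments it without discarding cached results:

    FLUSH QUERY CACHE;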
