MySQL performance optimization

Database operations are increasingly the performance bottleneck of an entire application, especially for web applications. Database performance is not only the DBA's concern; we programmers need to pay attention to it too. We need to keep performance in mind both when we design the table structure and when we operate on the database, especially when writing the SQL statements that query tables.

1. Optimize your queries for query caching

Most MySQL servers have query caching turned on. It is one of the most effective ways to improve performance, and it is handled by the MySQL server itself. When the same query is executed many times, its result is kept in a cache, so later identical queries return the cached result without touching the table. (Note that the query cache was deprecated in MySQL 5.7.20 and removed in MySQL 8.0, so this tip applies to older versions.)

The main problem is that this is easy for programmers to overlook, because some of our queries prevent MySQL from using the cache.

A typical case is a query that calls CURDATE(): MySQL's query cache does not work for statements containing this function. The same is true of NOW(), RAND(), and similar functions, because their return values are not deterministic. All you need to do is compute the value in your application and pass it in the query as a literal instead of calling the MySQL function, and the cache can be used.
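As a rough illustration of this idea, the sketch below builds the query text in the application so that the date appears as a literal instead of a CURDATE() call. It is written in Python for brevity; the user table and the username and signup_date columns are hypothetical names chosen for the example.

```python
from datetime import date


def signup_query_uncacheable() -> str:
    # CURDATE() is evaluated by the server on every call, so the
    # query cache cannot reuse a stored result for this statement.
    return "SELECT username FROM user WHERE signup_date >= CURDATE()"


def signup_query_cacheable(today: date) -> str:
    # The date is computed in the application and embedded as a literal.
    # date.isoformat() only ever yields YYYY-MM-DD, so it is safe to inline.
    return f"SELECT username FROM user WHERE signup_date >= '{today.isoformat()}'"


query = signup_query_cacheable(date(2009, 11, 25))
```

Every request made on the same day then sends identical SQL text, which is what the query cache matches on.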

2. EXPLAIN your SELECT query

Using the EXPLAIN keyword lets you know how MySQL handles your SQL statement. This can help you analyze the performance bottleneck of your query statement or table structure.

The EXPLAIN output also tells you how your indexes and primary keys are used, how your tables are searched and sorted, and so on.

Pick one of your SELECT statements (preferably the most complex one, with multi-table joins) and put the keyword EXPLAIN in front of it. You can use phpMyAdmin to do this. The result is displayed as a table.

In the original example, a two-table join was run on a group_id column that had no index, and then run again after group_id was indexed. Before the index was added, EXPLAIN reported 7883 rows searched; afterwards it reported only 9 and 16 rows for the two tables. Looking at the rows column lets us spot potential performance problems.
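The same before-and-after comparison can be tried on a small scale. The sketch below uses Python's built-in sqlite3 module purely as a stand-in so that it runs without a server; SQLite's EXPLAIN QUERY PLAN output is not the same as MySQL's EXPLAIN table, and the users table, group_id column, and index name are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, group_id INTEGER)")
conn.executemany(
    "INSERT INTO users (name, group_id) VALUES (?, ?)",
    [(f"user{i}", i % 50) for i in range(1000)],
)


def query_plan(sql: str) -> str:
    # Each plan row has four columns; the last one is the readable detail text.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "; ".join(row[3] for row in rows)


sql = "SELECT name FROM users WHERE group_id = 7"

plan_before = query_plan(sql)  # full table scan: no usable index yet
conn.execute("CREATE INDEX idx_users_group_id ON users (group_id)")
plan_after = query_plan(sql)   # the plan now names idx_users_group_id
```

With MySQL, the equivalent check is to prefix the statement with EXPLAIN and compare the rows and key columns before and after adding the index.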

3. Use LIMIT 1 when only one row of data is required

Sometimes when you query a table you already know that only one row will come back: you may be fetching a single unique record, or you may only be checking whether any matching record exists.

In this case, adding LIMIT 1 can improve performance. The MySQL engine stops searching after it finds the first matching row instead of continuing to look for further matches.

Take a check for whether any users from "China" exist. A query that uses SELECT * without a limit reads every matching row, whereas SELECT 1 with LIMIT 1 returns a constant and stops at the first hit, so the latter is clearly more efficient.
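A minimal sketch of such an existence check follows. It uses Python's sqlite3 module as a stand-in for MySQL, where the same SELECT 1 ... LIMIT 1 statement is valid; the user table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO user (name, country) VALUES (?, ?)",
    [("Li", "China"), ("Wang", "China"), ("Anna", "Sweden")],
)


def has_user_from(country: str) -> bool:
    # SELECT 1 returns a constant instead of full rows, and LIMIT 1
    # lets the engine stop at the first match.
    row = conn.execute(
        "SELECT 1 FROM user WHERE country = ? LIMIT 1", (country,)
    ).fetchone()
    return row is not None
```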

4. Index the search field

Indexes are not only for primary keys or unique fields. If there is a field in your table that you regularly search on, create an index for it.

In the original benchmark, the search "last_name LIKE 'a%'" was run once with an index on last_name and once without; the unindexed query was about 4 times slower.

You should also know which kinds of search cannot use a normal index. For example, when you search for a term inside a large article, as in "WHERE post_content LIKE '%apple%'", an ordinary index does not help because the pattern starts with a wildcard. You may need MySQL full-text indexing or an index you build yourself (for example, of keywords or tags).

5. Use the same column types for joins and index them

If your application has many JOIN queries, make sure that the join fields in both tables are indexed. This lets MySQL apply its internal optimizations to the join.

The fields used for the join should also be of the same type. For example, if you join a DECIMAL field with an INT field, MySQL will be unable to use at least one of the indexes. For string types, the character sets must match as well (the two tables may have been created with different character sets).

6. Never ORDER BY RAND()

Want to shuffle the returned rows, or pick one at random? I don't know who invented this usage, but many beginners love it, without realizing how serious a performance problem it creates.

If you really need to shuffle the returned rows, there are many other ways to do it. ORDER BY RAND() only makes your database slower and slower as the table grows. The problem is that MySQL has to execute RAND() (which is CPU-intensive) for every row in the table and then sort all the rows by that value. Even LIMIT 1 does not help, because the sort still has to happen first.

A cheaper way to pick one record at random is to count the rows, choose a random offset in the application, and fetch just that row with LIMIT.
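One way to pick a random record without ORDER BY RAND() is sketched below: count the rows, choose the offset in the application, and fetch a single row. The code uses Python's sqlite3 module as a stand-in for MySQL, and the user table and username column are assumed names.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany("INSERT INTO user (username) VALUES (?)", [("ann",), ("bob",), ("cy",)])


def random_username():
    # Step 1: count the rows.
    (total,) = conn.execute("SELECT COUNT(*) FROM user").fetchone()
    if total == 0:
        return None
    # Step 2: pick the offset in the application, not in SQL.
    offset = random.randrange(total)
    # Step 3: fetch exactly one row at that offset; nothing is sorted.
    row = conn.execute(
        "SELECT username FROM user LIMIT 1 OFFSET ?", (offset,)
    ).fetchone()
    return row[0]
```

A large offset still makes the server skip that many rows, so this is cheaper than sorting the whole table but not free.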

7. Avoid SELECT *

The more data is read from the database, the slower the query becomes. And, if your database server and web server are two separate servers, it will also increase the load of network transmission.

So make it a habit to select only the columns you actually need.

8. Always set an ID for each table

We should give every table in the database an ID column as its primary key, preferably an INT (UNSIGNED is recommended) with the AUTO_INCREMENT flag set.

Even if your users table has a unique field such as "email", don't make it the primary key. Using a VARCHAR column as the primary key degrades performance. In your program, too, you should refer to rows by the table's ID.

Moreover, some operations inside the MySQL engine rely on the primary key, such as clustering and partitioning. In those cases the performance and definition of the primary key become very important.

There is one exception: the "association table", whose primary key is formed from the primary keys of several other tables. For example, if a "student table" is keyed by student ID and a "course table" is keyed by course ID, then the "grade table" is an association table that links students to courses. In the grade table, student ID and course ID are foreign keys, and together they form the primary key.
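The association-table exception can be sketched as follows. The example runs on Python's sqlite3 module as a stand-in for MySQL, and the table and column names (student, course, grade, score) are assumptions based on the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript(
    """
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE course  (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- Association table: the two foreign keys together form the primary key.
    CREATE TABLE grade (
        student_id INTEGER NOT NULL REFERENCES student (id),
        course_id  INTEGER NOT NULL REFERENCES course (id),
        score      INTEGER NOT NULL,
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO student (id, name) VALUES (1, 'Li');
    INSERT INTO course (id, title) VALUES (10, 'Databases'), (11, 'Networks');
    INSERT INTO grade (student_id, course_id, score) VALUES (1, 10, 90);
    """
)


def add_grade(student_id: int, course_id: int, score: int) -> bool:
    # Returns False when the row violates the composite primary key
    # or references a student or course that does not exist.
    try:
        with conn:
            conn.execute(
                "INSERT INTO grade (student_id, course_id, score) VALUES (?, ?, ?)",
                (student_id, course_id, score),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

The composite key allows one grade per student per course, so no separate auto-increment ID is needed to identify a row.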

9. Use ENUM instead of VARCHAR

The ENUM type is very fast and compact. Internally the value is stored as a small integer, much like a TINYINT, but it appears as a string. That makes it a very good fit for fields that hold one of a fixed list of options.

If you have a field such as "gender", "country", "nation", "state" or "department" and you know that its set of values is limited and fixed, you should use ENUM instead of VARCHAR.

MySQL can also suggest how to reorganize your table structure (see item 10). When a VARCHAR field holds only a few distinct values, PROCEDURE ANALYSE() may advise you to change it to an ENUM.

10. Get advice from PROCEDURE ANALYSE()

PROCEDURE ANALYSE() makes MySQL analyze your fields and the data actually stored in them, and gives you some useful suggestions. The suggestions are only useful if the table contains real data, because big decisions need data to back them up. (PROCEDURE ANALYSE() was deprecated in MySQL 5.7.18 and removed in MySQL 8.0.)

For example, if you created an INT field as your primary key but the table holds little data, PROCEDURE ANALYSE() will suggest changing the field's type to MEDIUMINT. Or, if a VARCHAR field holds only a few distinct values, you might get a suggestion to change it to ENUM. Such suggestions may simply reflect a lack of data, in which case they are not reliable.

In phpMyAdmin you can view these suggestions by clicking "Propose table structure" while viewing the table.

It's important to note that these are just suggestions and will only become accurate as you have more and more data in your tables. Remember, you are the one who makes the final decision.

11. Use NOT NULL whenever possible

Unless you have a very specific reason to use NULL values, you should always keep your fields NOT NULL. This may seem a bit controversial, so read on.

First, ask yourself whether there is any difference between an empty string and NULL (or between 0 and NULL for an INT field). If there is no reason to distinguish them, you do not need NULL. (Did you know? In Oracle, NULL and the empty string are the same!)

Don't assume that NULL takes no space. It needs extra space, and your program becomes more complicated when it has to compare against NULL. Of course, this does not mean you can never use NULL. Reality is complicated, and there are still cases where NULL is the right choice.

12. Prepared Statements

A prepared statement is an SQL statement that is sent to the server as a template with placeholders, parsed once, and then executed with bound parameter values. Prepared statements bring benefits for both performance and security.

Prepared statements handle the variables you bind to them as data, which protects your program from "SQL injection" attacks. You can also check your variables manually, but manual checks are error-prone and are often forgotten. Frameworks and ORMs make this kind of problem less likely.

In terms of performance, prepared statements can give you a considerable advantage when the same query is used many times. You define parameters for the statement, and MySQL parses it only once.

Recent versions of MySQL also transmit prepared statements in a binary format, which makes network transfer more efficient.

There are also cases where prepared statements were avoided because they did not work with the query cache. This is reportedly supported since version 5.1.

To use prepared statements in PHP, see the manual for the mysqli extension, or use a database abstraction layer such as PDO.
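The binding idea can be sketched with parameter placeholders. The example below uses Python's sqlite3 module and its ? placeholders only as a runnable stand-in; with MySQL you would use the placeholders of mysqli, PDO, or your driver, and the user table and state column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, username TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO user (username, state) VALUES (?, ?)",
    [("ann", "AZ"), ("bob", "NY"), ("cy", "AZ")],
)

# The statement text is fixed; only the bound value changes between calls.
FIND_BY_STATE = "SELECT username FROM user WHERE state = ? ORDER BY username"


def usernames_in_state(state: str) -> list:
    # The value is bound as data and is never spliced into the SQL text.
    return [row[0] for row in conn.execute(FIND_BY_STATE, (state,))]
```

An input such as AZ' OR '1'='1 is compared as a plain string here and matches no rows, whereas concatenating it into the SQL text would have changed the meaning of the query.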

13. Unbuffered queries

Under normal circumstances, when you execute an SQL statement in your script, your program will stop there until the SQL statement returns, and then your program will continue to execute. You can use unbuffered queries to change this behavior.

mysql_unbuffered_query() sends an SQL statement to MySQL without automatically fetching and buffering the result the way mysql_query() does. This saves a considerable amount of memory, especially for queries that produce large result sets, and you do not have to wait for the whole result: as soon as the first row has been retrieved, you can start working on it.

However, there are limitations. You must either read all the rows or call mysql_free_result() before issuing the next query. Also, mysql_num_rows() and mysql_data_seek() do not work on an unbuffered result. So think carefully about whether to use unbuffered queries.
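The difference between buffered and unbuffered reading can be illustrated by analogy. The sketch below is not mysql_unbuffered_query() itself; it uses Python's sqlite3 module, where fetchall() loads the whole result into memory and iterating the cursor retrieves rows one at a time. The logs table is a hypothetical example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, size INTEGER)")
conn.executemany("INSERT INTO logs (size) VALUES (?)", [(n,) for n in range(10000)])


def total_size_buffered() -> int:
    # fetchall() materializes every row in client memory before any work starts.
    rows = conn.execute("SELECT size FROM logs").fetchall()
    return sum(size for (size,) in rows)


def total_size_streamed() -> int:
    # Iterating the cursor pulls rows one at a time, so memory use stays flat
    # and processing can begin with the first row.
    total = 0
    for (size,) in conn.execute("SELECT size FROM logs"):
        total += size
    return total
```

Both functions return the same total; they differ only in how much of the result is held in memory at once.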

14. Save IP address as UNSIGNED INT

Many programmers create a VARCHAR(15) field to hold an IP address as a string instead of storing it as an integer. Stored as an integer, an IPv4 address needs only 4 bytes, and the field has a fixed length. It also gives you an advantage in queries, especially with a WHERE condition such as: IP between ip1 and ip2.

We must use UNSIGNED INT because IPv4 addresses use the full range of a 32-bit unsigned integer.

In your queries, you can use INET_ATON() to convert a string IP to an integer and INET_NTOA() to convert an integer back to a string IP. PHP has corresponding functions, ip2long() and long2ip().
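The conversion itself is simple arithmetic, as the following sketch shows. It uses Python's standard ipaddress module to mirror what INET_ATON() and INET_NTOA() compute for dotted-quad IPv4 addresses; the helper names are chosen for this example.

```python
import ipaddress


def ip_to_int(ip: str) -> int:
    # Same mapping as INET_ATON(): a.b.c.d -> a*2^24 + b*2^16 + c*2^8 + d
    return int(ipaddress.IPv4Address(ip))


def int_to_ip(value: int) -> str:
    # Same mapping as INET_NTOA()
    return str(ipaddress.IPv4Address(value))


SIGNED_INT_MAX = 2**31 - 1    # largest value a signed 32-bit INT can hold
UNSIGNED_INT_MAX = 2**32 - 1  # largest value an UNSIGNED INT can hold
```

For instance, 192.168.0.1 maps to 3232235521, which already exceeds SIGNED_INT_MAX; this is why the column has to be UNSIGNED. Once addresses are stored as integers, a range check becomes an ordinary numeric BETWEEN.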

15. Fixed-length tables are faster

If all fields in a table are "fixed-length", the whole table is considered "static" or "fixed-length". That means the table contains no fields of the following types: VARCHAR, TEXT, BLOB. As soon as you include one of these fields, the table is no longer a fixed-length static table, and the MySQL engine handles it differently.

Fixed-length tables improve performance because MySQL can search them faster: with fixed row lengths, the offset of the next row is easy to calculate, so reads are naturally quick. If the rows are not of fixed length, finding the next row requires going through the primary key each time.

Fixed-length tables are also easier to cache and to rebuild. The only drawback is that fixed-length fields waste some space, because the full length is allocated whether you use it or not.

Using the "vertical split" technique (see next item), you can split your table into two, one of fixed length and one of variable length.

16. Vertical Split

"Vertical splitting" means dividing a table into several tables by column, which reduces the complexity of the table and the number of fields in it, and thereby improves performance. (I once worked on a project at a bank and saw a table with more than 100 fields, which was horrible.)

Example 1: The Users table has a home address field. This field is optional, and apart from the personal information page you rarely need to read or update it. So why not put it in another table? Doing so gives your table better performance. Most of the time, only the user ID, user name, password, user role, and similar fields of the user table are needed. A smaller table always performs better.

Example 2: You have a field called "last_login" that is updated every time the user logs in. Each update, however, clears the query cache for that table. You can put this field in another table so that it does not interfere with your constant reads of user ID, user name, and user role, and the query cache can keep giving you a large performance gain.

One thing to watch out for: you should not need to join the split-off tables frequently. Otherwise performance will be worse than before the split, and the decline can be severe.

17. Split large DELETE or INSERT statements

If you need to run a large DELETE or INSERT query on a live site, you must be very careful not to make the whole site stop responding. These two operations lock the table, and once the table is locked, no other operation can get in.

Apache runs many child processes or threads, which is why it works so efficiently. But our server cannot afford too many child processes, threads, and database connections, because they consume a great deal of server resources, especially memory.

If you lock a table for a while, say 30 seconds, then on a high-traffic site the processes, threads, database connections, and open files that pile up during those 30 seconds may not only crash your web service but also bring down the entire server.

So if you have a large operation, be sure to split it up. A good approach is to add a LIMIT clause and repeat the statement until no more rows are affected.
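The batching loop can be sketched as follows. The example runs on Python's sqlite3 module as a stand-in; MySQL accepts DELETE ... LIMIT directly, while the subquery form is used here only because stock SQLite does not. The logs table, the cutoff date, the batch size, and the pause length are all illustrative assumptions.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, log_date TEXT)")
conn.executemany(
    "INSERT INTO logs (log_date) VALUES (?)",
    [("2009-10-15",)] * 2500 + [("2009-12-01",)] * 100,
)
conn.commit()


def delete_in_batches(cutoff: str, batch_size: int = 1000, pause: float = 0.05) -> int:
    """Delete rows with log_date <= cutoff in small batches; return the batch count."""
    batches = 0
    while True:
        with conn:  # one short transaction per batch
            cur = conn.execute(
                "DELETE FROM logs WHERE id IN ("
                "SELECT id FROM logs WHERE log_date <= ? LIMIT ?)",
                (cutoff, batch_size),
            )
        if cur.rowcount == 0:
            break  # nothing left to delete
        batches += 1
        time.sleep(pause)  # give other queries a chance between batches
    return batches
```

Each batch holds its lock only briefly, and the short pause lets waiting queries run in between.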

18. Smaller columns are faster

For most database engines, disk operations are probably the most significant bottleneck. So, compacting your data can be very helpful in this situation, as it reduces access to the hard drive.

See MySQL's documentation on Storage Requirements for all data types.

If a table will only ever have a few rows (such as a dictionary table or a configuration table), there is no reason to use INT as the primary key; MEDIUMINT, SMALLINT, or even TINYINT is more economical. If you do not need the time component, DATE is much better than DATETIME.

Of course, you also need to leave enough room for growth, or you will be in trouble later. See the Slashdot example (November 6, 2009), where a simple ALTER TABLE statement took more than 3 hours because the table held 16 million rows.
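The trade-off between compactness and room for growth can be made concrete with the unsigned ranges of MySQL's integer types. The helper below is a hypothetical sketch written in Python; the headroom factor of 2 is an arbitrary assumption, not a recommendation from the original article.

```python
# Unsigned ranges of MySQL integer types, smallest first: (name, bytes, max value)
UNSIGNED_INT_TYPES = [
    ("TINYINT", 1, 2**8 - 1),
    ("SMALLINT", 2, 2**16 - 1),
    ("MEDIUMINT", 3, 2**24 - 1),
    ("INT", 4, 2**32 - 1),
    ("BIGINT", 8, 2**64 - 1),
]


def smallest_unsigned_type(expected_max: int, headroom: float = 2.0) -> str:
    """Pick the smallest type that still fits expected_max * headroom."""
    if expected_max < 0:
        raise ValueError("unsigned columns cannot hold negative values")
    target = int(expected_max * headroom)
    for name, _size, max_value in UNSIGNED_INT_TYPES:
        if target <= max_value:
            return name
    raise ValueError("value does not fit in any MySQL integer type")
```

A table expected to reach 16 million rows fits in a MEDIUMINT with no headroom, but doubling the estimate pushes it to INT, which is the kind of margin that avoids a long ALTER TABLE later.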

19. Choose the right storage engine

MySQL has two main storage engines, MyISAM and InnoDB, each with pros and cons. Cool shell's earlier article "MySQL: InnoDB or MyISAM?" discussed this matter.

MyISAM suits applications that are read-heavy, but it copes poorly with a lot of writes. Even if you only update a single field, the whole table is locked, and no other process, not even a reader, can touch it until the update completes. On the other hand, MyISAM is extremely fast for calculations like SELECT COUNT(*).

InnoDB is a far more complex storage engine, and for some small applications it will be slower than MyISAM. However, it supports row-level locking, so it performs better when there are many writes. It also supports more advanced features, such as transactions.

See the MySQL manual:

MyISAM Storage Engine
InnoDB Storage Engine

20. Use an Object Relational Mapper

With an ORM (Object Relational Mapper), you can get reliable performance gains. Everything an ORM does can also be written by hand, but that takes an expert.

The most important feature of an ORM is "lazy loading": values are fetched only when they are actually needed. You also need to be careful about the side effects of this mechanism, though, because it can degrade performance by issuing many small queries.

The ORM can also package your SQL statements into a transaction, which is much faster than executing them individually.

Currently, my personal favorite ORM for PHP is: Doctrine.

21. Beware of persistent connections

The purpose of persistent connections is to reduce the cost of re-creating connections to MySQL. Once a persistent connection is created, it stays open even after the database operations have finished. And because Apache reuses its child processes, the next HTTP request served by the same child process reuses the same MySQL connection.

PHP Manual: mysql_pconnect()

In theory, this sounds good. But from personal experience (and that of many others), this feature causes more trouble than it is worth: you can run into connection limits, memory problems, file-handle limits, and so on.

Also, Apache runs in a highly parallel environment and creates many child processes, which is why persistent connections do not work well there. Before you decide to use them, think about the architecture of your whole system.

(Retrieved from https://searchdatabase.techtarget.com.cn/7-18321/ )
