MySQL: Optimizing SQL inserts, updates, deletes, and table creation

MySQL: Optimizing SQL inserts, updates, and deletes

① Insert data in batches

If you need to perform many inserts at once, use a single INSERT statement with multiple value lists (method two) rather than separate INSERT statements (method one). Batch insertion is generally several times faster.

method one:

insert into T values(1,2); 
 
insert into T values(1,3); 
 
insert into T values(1,4);

Method Two:

insert into T values(1,2),(1,3),(1,4);

There are three reasons to prefer the latter:

It reduces SQL parsing work. MySQL has no shared pool like Oracle's, so with method two the statement is parsed only once to insert all the rows.

In certain scenarios, it reduces the number of DB connections required.

The overall SQL text is shorter, which reduces network-transfer I/O.

② Use commit appropriately

Committing at appropriate points releases the resources held by a transaction and reduces overhead. A commit releases:

The undo data blocks occupied by the transaction.

The data blocks the transaction recorded in the redo log.

The locks the transaction holds, reducing the lock contention that hurts performance. In particular, when deleting a large amount of data with DELETE, you must break the deletion into batches and commit periodically.
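A minimal sketch of this batched-delete pattern (the table `orders`, its columns, and the date cutoff are illustrative assumptions, not from the original):

```sql
-- Delete a large volume of rows in chunks, committing after each chunk
-- so undo blocks, redo records, and locks are released periodically
-- instead of accumulating in one huge transaction.
SET autocommit = 0;

DELETE FROM orders WHERE created_at < '2020-01-01' LIMIT 10000;
COMMIT;   -- releases this batch's undo/redo and locks

-- Repeat (typically from application code) until ROW_COUNT() returns 0.
```

In practice the repeat loop lives in application code or a stored procedure, running until no rows remain.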

③ Avoid repeated query of updated data

A common business need is to update a row and also retrieve the row's new values. MySQL does not support PostgreSQL's UPDATE ... RETURNING syntax, but the same effect can be achieved with user variables.

For example: update a record's timestamp, and then ask what timestamp is now stored in that record.

A simple way to do this:

update t1 set time = now() where col1 = 1; 

select time from t1 where col1 = 1;

Using a variable, this can be rewritten as:

update t1 set time = now() where col1 = 1 and @now := now(); 

select @now;

Both versions require two network round trips, but the variable version avoids re-reading the table; when t1 holds a lot of data, it is much faster than the first version.

④ Query priority versus update (insert, update, delete) priority

MySQL also allows changing the priority of statement scheduling, which can make queries from multiple clients work better together so that a single client does not wait for a long time due to locks. Changing the priority can also ensure that certain types of queries are processed faster.

First determine the type of application: is it query-heavy or update-heavy, and should query efficiency or update efficiency be guaranteed? That determines whether queries or updates should get priority.

The scheduling changes described below apply mainly to storage engines that have only table locks, such as MyISAM, MEMORY, and MERGE. For the InnoDB storage engine, statement execution order is determined by the order in which row locks are acquired.

MySQL's default scheduling strategy can be summarized as follows:

Write operations take precedence over read operations.

A write operation to a certain data table can only occur once at a certain time, and write requests are processed in the order in which they arrive.

Multiple read operations of a certain data table can be performed simultaneously.

MySQL provides several statement modifiers that allow you to modify its scheduling strategy:

The LOW_PRIORITY keyword applies to DELETE, INSERT, LOAD DATA, REPLACE, and UPDATE.

The HIGH_PRIORITY keyword applies to SELECT and INSERT statements.

The DELAYED keyword applies to INSERT and REPLACE statements.

If the write operation is a LOW_PRIORITY (low priority) request, then the system will not consider its priority higher than the read operation.

In this case, if a second reader arrives while the writer is waiting, the second reader is allowed to jump ahead of the writer.

The writer may start only when there are no readers left. This scheduling change can cause a LOW_PRIORITY write operation to be starved forever.

The HIGH_PRIORITY keyword on a SELECT query works similarly: it allows the SELECT to jump ahead of a waiting write operation, even though the write would normally have higher priority.

Another effect is that high-priority SELECTs execute before normal SELECT statements, because the latter may be blocked behind a write operation.

If you want all statements that support the LOW_PRIORITY option to be treated as low priority by default, start the server with the --low-priority-updates option.

Using INSERT HIGH_PRIORITY raises a single INSERT statement back to the normal write priority, cancelling the effect of that server option for that statement.
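The modifiers above can be sketched as follows. The table and column names are invented for illustration; these modifiers only matter for table-lock engines such as MyISAM, and note that INSERT DELAYED was deprecated and then removed in MySQL 5.7:

```sql
-- Lower the priority of one write so readers are not blocked behind it:
UPDATE LOW_PRIORITY logs SET archived = 1 WHERE id < 1000;

-- Let one query jump ahead of waiting writes:
SELECT HIGH_PRIORITY COUNT(*) FROM logs WHERE archived = 0;

-- Queue the insert and return immediately (MyISAM only; removed in 5.7):
INSERT DELAYED INTO logs (msg) VALUES ('deferred write');
```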

Query condition optimization

① For complex queries, use intermediate temporary tables to stage data

②Optimize the group by statement

By default, MySQL sorts all values of the GROUP BY columns; a query with "GROUP BY col1, col2, ...;" behaves as if it also specified "ORDER BY col1, col2, ...;". (Note that as of MySQL 8.0, GROUP BY no longer sorts implicitly, so this applies to older versions.)

If the query explicitly includes an ORDER BY clause with the same columns, MySQL optimizes it away with no slowdown, although the result is still sorted.

Therefore, if a query includes GROUP BY but you do not need the grouped values sorted, specify ORDER BY NULL to suppress the sort.

For example:

SELECT col1, col2, COUNT(*) FROM table GROUP BY col1, col2 ORDER BY NULL ;

③Optimize the join statement

In MySQL, a subquery inside a SELECT statement can produce a single-column result that is then used as a filter condition in the outer query.

Subqueries let you perform, in one statement, SQL operations that logically take multiple steps; they can also avoid transactions or table locks and are easy to write. In some cases, however, a subquery can be replaced by a more efficient JOIN.

Example: Suppose you want to take out all users who have no order records, you can use the following query to complete:

SELECT col1 FROM customerinfo WHERE CustomerID NOT IN (SELECT CustomerID FROM salesinfo);

Rewriting this query with a JOIN is faster.

Especially when there is an index on CustomerID in the salesinfo table, performance is even better. The equivalent query:

SELECT col1 FROM customerinfo 
   LEFT JOIN salesinfo ON customerinfo.CustomerID = salesinfo.CustomerID 
      WHERE salesinfo.CustomerID IS NULL

The JOIN is more efficient because MySQL does not need to create an in-memory temporary table to perform this logically two-step query.

④Optimize union query

MySQL executes UNION queries by creating and populating a temporary table. Unless you actually need duplicate rows eliminated, use UNION ALL.

Without the ALL keyword, MySQL adds the DISTINCT option to the temporary table, enforcing uniqueness across all its rows, which is quite expensive.

Efficient:

SELECT COL1, COL2, COL3 FROM TABLE WHERE COL1 = 10 
 
UNION ALL 
 
SELECT COL1, COL2, COL3 FROM TABLE WHERE COL3 = 'TEST';

Inefficient:

SELECT COL1, COL2, COL3 FROM TABLE WHERE COL1 = 10 
 
UNION 
 
SELECT COL1, COL2, COL3 FROM TABLE WHERE COL3 = 'TEST';

⑤Split complex SQL into multiple small SQLs to avoid big transactions

The benefits are as follows:

Simple SQL statements can make use of MySQL's query cache.

Table-lock time is reduced, especially for tables using the MyISAM storage engine.

Multiple CPU cores can be used.
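One possible sketch of this splitting (all table and column names are illustrative): instead of a single long-running join update that holds locks for its whole duration, stage the target keys with a read-only query, then run a short and simple update.

```sql
-- Stage the ids first; this read holds no long write locks:
CREATE TEMPORARY TABLE tmp_ids AS
    SELECT o.id
    FROM orders o JOIN customers c ON o.cust_id = c.id
    WHERE c.region = 'EU';

-- Then a short, simple update that locks rows only briefly:
UPDATE orders SET flagged = 1 WHERE id IN (SELECT id FROM tmp_ids);

DROP TEMPORARY TABLE tmp_ids;
```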

⑥ Use truncate instead of delete

When deleting every record in a table, a DELETE statement logs its work in undo blocks, and the deleted records are also written to the binlog.

When you really do need to empty the whole table, DELETE therefore generates a large volume of binlog and occupies many undo blocks; it is both inefficient and resource-hungry.

TRUNCATE, by contrast, records no recovery information, so the data cannot be recovered; in exchange it uses minimal resources and completes very quickly. TRUNCATE also reclaims the table's high-water mark and resets the auto-increment counter to zero.
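Side by side (the table name `session_log` is an illustrative assumption):

```sql
-- DELETE: every row is logged to undo and binlog; slow on big tables,
-- but transactional and recoverable.
DELETE FROM session_log;

-- TRUNCATE: DDL, minimal logging, NOT recoverable; frees the table's
-- space and resets the AUTO_INCREMENT counter.
TRUNCATE TABLE session_log;
```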

⑦ Use reasonable paging methods to improve paging efficiency

For displays and other paging needs, an appropriate paging method improves paging efficiency.

Case 1:

select * from t where thread_id = 10000 and deleted = 0 
   order by gmt_create asc limit 0, 15;

In the example above, all columns are fetched and sorted in one pass according to the filter conditions. Data-access cost = index I/O + table-data I/O for every matching index record.

As a result, the further you page with this pattern, the worse the execution efficiency and the longer the query takes, especially when the table is large.

Applicable scenarios: when the intermediate result set is small (under 10,000 rows) or the query conditions are complex (involving several different columns or multi-table joins).

Case 2:

select t.* from (select id from t where thread_id = 10000 and deleted = 0
   order by gmt_create asc limit 0, 15) a, t 
      where a.id = t.id;

The example above requires that the primary key of table t is the id column and that there is a secondary covering index on (thread_id, deleted, gmt_create).

The covering index is used to fetch the matching primary-key ids in sorted order; a join then fetches the remaining columns for just those rows.

Data-access cost = index I/O + table-data I/O for only the rows in the paged index result (15 rows in the example). As a result, every page costs roughly the same resources and time as the first page.

Applicable scenarios: when the query and sort columns (that is, the columns in the WHERE and ORDER BY clauses) have a matching covering index and the intermediate result set is large.

Table building optimization (very important)

① Create indexes on the table, prioritizing the columns used in WHERE and ORDER BY clauses.

② Prefer numeric columns (e.g. gender: male = 1, female = 2). Do not design columns that hold only numeric information as character types; doing so degrades query and join performance and increases storage overhead.

This is because the engine compares strings character by character when processing queries and joins, whereas a numeric type needs only a single comparison.
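A short DDL sketch of the point above (the table and column names are invented for illustration):

```sql
CREATE TABLE user_profile (
    id     INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    -- Numeric code: 1 byte, compared in a single operation.
    gender TINYINT NOT NULL COMMENT '1 = male, 2 = female',
    -- Avoid a VARCHAR 'male'/'female' column here: strings are compared
    -- character by character and take more storage.
    name   VARCHAR(50) NOT NULL
);
```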

③ Queries against tables with very large data volumes are slow, mainly because too many rows are scanned. In that case, query in segments and pages from the application, loop over the segments, and merge the results for display.

To query rows 100001 through 100050 (ROW_NUMBER() requires MySQL 8.0 or later):

SELECT * FROM (SELECT ROW_NUMBER() OVER (ORDER BY ID ASC) AS rowid, infoTab.* FROM infoTab) t 
WHERE t.rowid > 100000 AND t.rowid <= 100050;

④ Use varchar/nvarchar instead of char/nchar.

Use varchar/nvarchar instead of char/nchar wherever possible: variable-length columns use less storage, and for queries, searching within a smaller column is noticeably faster.

Don't assume NULL costs no space. With a char(100) column, the space is fixed when the column is created: whether or not a value is inserted (NULL included), it occupies the full 100 characters. With a variable-length type such as varchar, NULL occupies no space.


Origin blog.csdn.net/weixin_43941676/article/details/113941013