SQL statement optimization and indexing

1. Inner join, left join, right join, and subqueries

A. An inner join is also called an equi-join; left and right joins are outer joins.

SELECT A.id, A.name, B.id, B.name FROM A LEFT JOIN B ON A.id = B.id;

SELECT A.id, A.name, B.id, B.name FROM A RIGHT JOIN B ON A.id = B.id;

SELECT A.id, A.name, B.id, B.name FROM A INNER JOIN B ON A.id = B.id;

Inner joins are widely reported to be the fastest of these, because an inner join is an equi-join and may return fewer rows. Keep in mind that some statements use an equi-join implicitly, such as:

SELECT A.id,A.name,B.id,B.name FROM A,B WHERE A.id = B.id;

Recommendation: wherever an inner join is sufficient, use an inner join.

B. Subqueries are often slower than joins. Where possible, replace a subquery with a join.

SELECT * FROM A WHERE EXISTS (SELECT * FROM B WHERE B.id >= 3000 AND A.uuid = B.uuid);

Table A holds on the order of 100,000 rows and table B on the order of a million. The query takes about 2 seconds on a local machine. EXPLAIN shows that the subquery is a correlated subquery (DEPENDENT SUBQUERY): MySQL first runs a full scan of the outer table A and then executes the subquery once per row, keyed on uuid. If the outer table is large, query performance will be even worse.

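A simple optimization is to replace the subquery with an inner join, changing the statement to:

SELECT * FROM A INNER JOIN B USING (uuid) WHERE B.id >= 3000;

In the same test this statement runs in under one second. Note that the join returns one row per matching row of B, so it returns the same rows of A as the EXISTS version only when uuid is unique among the filtered rows of B.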
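The difference between the EXISTS form and the join form can be checked on a tiny data set. The following is a minimal sketch, assuming an in-memory SQLite database as a stand-in for MySQL; the sample rows and the name column are made up for illustration, and the join is written with ON rather than USING only to keep the column references explicit.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the rows below are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (uuid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE B (id INTEGER PRIMARY KEY, uuid INTEGER);
    INSERT INTO A VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
    INSERT INTO B VALUES (2999, 1), (3000, 2), (3001, 2), (3002, 3);
""")

# Correlated subquery: each row of A appears at most once.
exists_ids = [r[0] for r in conn.execute("""
    SELECT A.uuid FROM A
    WHERE EXISTS (SELECT 1 FROM B WHERE B.id >= 3000 AND B.uuid = A.uuid)
    ORDER BY A.uuid
""")]

# Join rewrite: one output row per matching row of B.
join_ids = [r[0] for r in conn.execute("""
    SELECT A.uuid FROM A INNER JOIN B ON A.uuid = B.uuid
    WHERE B.id >= 3000
    ORDER BY A.uuid
""")]

print(exists_ids)  # [2, 3]
print(join_ids)    # [2, 2, 3]
```

Here uuid 2 matches two rows of B, so the join repeats it. If duplicates matter, add DISTINCT or make sure the join column is unique on the B side.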

C. When using ON and WHERE together, remember the order in which they are applied. For example:

SELECT A.id, A.name, B.id, B.name FROM A LEFT JOIN B ON A.id = B.id WHERE B.name = 'XXX';

The ON condition is evaluated first and decides which rows of table B are joined to each row of A. The WHERE condition is then applied to the rows produced by the join.

One reminder: in a LEFT JOIN, a condition after ON can only filter rows of table B; every row of table A still appears in the result. For example:

SELECT A.id, A.name, B.id, B.name FROM A LEFT JOIN B ON A.id = B.id;

returns at least as many rows as table A contains, because the ON condition only restricts which rows of table B are matched. By contrast,

SELECT A.id, A.name, B.id, B.name FROM A, B WHERE A.id = B.id;

returns the Cartesian product of A and B reduced to the rows that satisfy A.id = B.id.
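To make the ON versus WHERE distinction concrete, here is a small sketch that runs the same filter in both positions. It assumes an in-memory SQLite database as a stand-in for MySQL, and the three rows of A and two rows of B are invented for the example.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table contents are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE B (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO A VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
    INSERT INTO B VALUES (1, 'XXX'), (2, 'other');
""")

# Filter in ON: restricts which B rows may match; every A row survives.
on_rows = conn.execute("""
    SELECT A.id, B.name FROM A
    LEFT JOIN B ON A.id = B.id AND B.name = 'XXX'
    ORDER BY A.id
""").fetchall()

# Filter in WHERE: applied after the join, so NULL-extended rows are dropped.
where_rows = conn.execute("""
    SELECT A.id, B.name FROM A
    LEFT JOIN B ON A.id = B.id
    WHERE B.name = 'XXX'
    ORDER BY A.id
""").fetchall()

print(on_rows)     # [(1, 'XXX'), (2, None), (3, None)]
print(where_rows)  # [(1, 'XXX')]
```

With the filter in WHERE, the LEFT JOIN behaves like an inner join, because rows of A without a matching B row have B.name NULL and fail the comparison.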

D. When using JOIN, let the small result set drive the join: keep the driving (left) side of a LEFT JOIN as small as possible by applying any available filter to it first, and do the mirror image for a RIGHT JOIN. Also consider splitting a multi-table query into several simpler ones, since multi-table queries tend to be inefficient and are prone to locking tables and blocking. For example:

SELECT * FROM A LEFT JOIN B ON A.id = B.ref_id WHERE B.ref_id > 10;

can be optimized as:

SELECT * FROM (SELECT * FROM A WHERE id > 10) T1 LEFT JOIN B ON T1.id = B.ref_id;

Note that the second form also keeps rows of A with id > 10 that have no match in B, which the WHERE clause of the first form discards.

2. Build indexes to speed up query performance.

A. When building a composite index, put the field used in the WHERE condition at the leftmost position of the index where possible, so that the query can use the index.

B. Make sure join columns have the same type: the fields that link table A and table B must be of the same type and both indexed, so that both tables can use their index. If the types differ, at least one table cannot use its index.

C. Indexes are not limited to primary and unique keys; any other column can be indexed. Take care when using LIKE on an indexed column.

For example: SELECT * FROM A WHERE name LIKE 'xxx%';

This statement can use the index on name (provided name is indexed), whereas the following statement cannot:

SELECT * FROM A WHERE name LIKE '%xxx';

Because '%' matches any sequence of characters, a pattern that begins with % gives the index no fixed prefix to search on, so the index cannot be used.

D. Composite index

For example, take the statement: SELECT * FROM users WHERE area = 'beijing' AND age = 22;

If we create separate indexes on area and age, MySQL will typically use only one of them for this query. That is already far more efficient than a full table scan with no index, but a composite index on the area and age columns is more efficient still. A composite index on (area, age, salary) effectively serves as three indexes, (area, age, salary), (area, age), and (area); this is called the leftmost prefix property. When creating a composite index, therefore, put the column most often used as a constraint on the far left, followed by the others in decreasing order of use.
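The leftmost prefix behaviour can be observed by asking the database for its query plan. The sketch below assumes SQLite as a stand-in for MySQL (its planner follows the same leftmost-prefix idea, but the plan text is SQLite's, not MySQL's EXPLAIN output); the index name idx_area_age_salary and the extra name column are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; names below are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT,
                        area TEXT, age INTEGER, salary INTEGER);
    CREATE INDEX idx_area_age_salary ON users (area, age, salary);
""")

def plan(sql):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

# Constrains the leftmost columns (area, age): the composite index applies.
uses_prefix = plan("SELECT * FROM users WHERE area = 'beijing' AND age = 22")

# Skips the leftmost column: the composite index cannot be searched.
skips_prefix = plan("SELECT * FROM users WHERE age = 22")

print(uses_prefix)
print(skips_prefix)
```

The first plan mentions idx_area_age_salary, while the second falls back to scanning the table. In MySQL the equivalent check is to run EXPLAIN on each statement and compare the key column.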

E. Avoid NULL values in indexed columns

It is often claimed that a column containing NULL values is left out of the index, and that one nullable column makes a composite index ineffective. That holds for some database engines but not for MySQL, whose indexes do store NULL values. Nullable columns still cost extra storage and an extra NULL test per value, and declaring columns NOT NULL lets the optimizer make better use of indexes. So in database design, avoid making NULL the default value of a field.

F. Use short indexes

When indexing a string column, specify a prefix length if possible. For example, if you have a CHAR(255) column and most values are unique within the first 10 or 20 characters, do not index the entire column. Short indexes can improve query speed and also save disk space and I/O.

G. Sorting and indexes

MySQL generally uses only one index per table in a query, so if an index is already used for the WHERE clause, the column in ORDER BY may not be able to use one. Therefore, do not sort explicitly when the order the database returns already meets the requirement, and avoid sorting on multiple columns; if you must, it is best to create a composite index on those columns.

3. Optimize LIMIT paging on tables with tens of millions of rows

A. We usually use LIMIT like this:

SELECT * FROM A ORDER BY id LIMIT 1, 10;

With very little data in the table this causes no performance problem. Once the table reaches tens of millions of rows, as in:

SELECT * FROM A ORDER BY id LIMIT 10000000, 10;

only 10 records are returned, yet the performance is unbearable, because the server must read and discard the first 10,000,000 rows. This is also why persistence-layer frameworks such as Hibernate and iBATIS run into performance problems on large tables unless the framework optimizes paging for them.

B. In that situation we can use another statement, such as:

SELECT * FROM A WHERE id >= (SELECT id FROM A ORDER BY id LIMIT 10000000, 1) ORDER BY id LIMIT 10;

This is indeed much faster, provided the id field is indexed. It may still not be optimal. If the id values are contiguous and start at 1, it can be written like this:

SELECT * FROM A WHERE id BETWEEN 10000001 AND 10000010;

This is more efficient.
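A related approach, not named in the text above, is to remember the last id of the previous page and seek past it, which works even when ids have gaps. The sketch below assumes SQLite as a stand-in for MySQL and a hypothetical payload column; the helper names page_by_offset and page_after are made up for the example.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; data and helper names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (id INTEGER PRIMARY KEY, payload TEXT)")
# ids 1..1000 with every 7th id missing, so the sequence has gaps.
conn.executemany("INSERT INTO A VALUES (?, ?)",
                 [(i, f"row{i}") for i in range(1, 1001) if i % 7 != 0])

def page_by_offset(offset, size=10):
    """Classic paging: the engine walks past `offset` rows before returning any."""
    return conn.execute(
        "SELECT id FROM A ORDER BY id LIMIT ? OFFSET ?", (size, offset)
    ).fetchall()

def page_after(last_id, size=10):
    """Seek paging: jump straight to the first id greater than last_id."""
    return conn.execute(
        "SELECT id FROM A WHERE id > ? ORDER BY id LIMIT ?", (last_id, size)
    ).fetchall()

first = page_by_offset(0)
second_by_offset = page_by_offset(10)
second_by_seek = page_after(first[-1][0])

# BETWEEN only matches the offset page when ids are contiguous.
between_rows = conn.execute(
    "SELECT id FROM A WHERE id BETWEEN 11 AND 20 ORDER BY id"
).fetchall()

print(second_by_offset == second_by_seek)  # True
print(len(between_rows))                   # 9, because id 14 is missing
```

The seek form returns the same page as the offset form, while the BETWEEN form comes up short as soon as an id in the range has been deleted.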

4. Try to avoid SELECT *

A. The more data is read from the table, the slower the query becomes. Reading extra columns increases disk time, and when the database server is separate from the web server it also adds network delay, simply because data is transferred between servers unnecessarily.

5. Try not to use ORDER BY RAND()

A. If you really need to display results in random order, there are better ways to do it. ORDER BY RAND() may evaluate RAND() for every row in the table and then sort on the result, which consumes processor time even when you only want one row back.
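One possible alternative for picking a single random row is to probe a random id and take the first row at or after it. This is only a sketch under stated assumptions: SQLite stands in for MySQL, the table has an indexed integer id, and the function name random_row is hypothetical. Rows that follow a gap in the id sequence are picked more often, so it suits tables with fairly dense ids.

```python
import random
import sqlite3

# Assumption: SQLite stands in for MySQL; table contents are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO A VALUES (?, ?)",
                 [(i, f"row{i}") for i in range(1, 101)])

def random_row(conn):
    """Pick one row via a random id probe instead of sorting the whole table."""
    lo, hi = conn.execute("SELECT MIN(id), MAX(id) FROM A").fetchone()
    target = random.randint(lo, hi)
    # First row at or after the random target; answered from the primary key.
    return conn.execute(
        "SELECT id, name FROM A WHERE id >= ? ORDER BY id LIMIT 1", (target,)
    ).fetchone()

row = random_row(conn)
print(row)
```

The random number is generated once in the application, so the database does an index lookup rather than computing a random value per row.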



6. Use LIMIT 1 when you only need one row

A. Sometimes you know in advance that a query needs only a single row, for example when looking up a unique record. Adding LIMIT 1 lets the database engine stop scanning the table or index as soon as it finds a match, such as:

SELECT * FROM A WHERE name LIKE '%xxx' LIMIT 1;

As soon as the query finds one record matching '%xxx', the engine stops scanning the table or index.



7. Try to sort as little as possible

A. Sorting consumes considerable CPU resources, so reducing sorting can noticeably improve response time, especially when the cache hit rate is high and I/O is not the bottleneck.



8. Minimize OR

A. When several conditions in the WHERE clause are combined with OR, MySQL's optimizer does not always produce a good execution plan, and given MySQL's layered SQL and storage-engine architecture, performance can be relatively poor. Replacing OR with UNION ALL (or UNION when duplicates must be removed) often gives better results.



9. Try to use UNION ALL instead of UNION

A. The difference between UNION and UNION ALL is that the former must combine the two (or more) result sets and then remove duplicates, which can involve sorting and adds considerable CPU work, resource consumption, and latency. So when we know duplicates cannot occur, or do not care about them, use UNION ALL instead of UNION.
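The behavioural difference is easy to see on two small tables that share one value. This is a minimal sketch assuming SQLite as a stand-in for MySQL; the tables t1 and t2 and their contents are invented for the example.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; t1 and t2 are illustrative tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (v INTEGER);
    CREATE TABLE t2 (v INTEGER);
    INSERT INTO t1 VALUES (1), (2), (3);
    INSERT INTO t2 VALUES (3), (4);
""")

# UNION removes the duplicate 3, which costs an extra de-duplication step.
union_rows = [r[0] for r in conn.execute(
    "SELECT v FROM t1 UNION SELECT v FROM t2 ORDER BY v")]

# UNION ALL simply concatenates the two result sets.
union_all_rows = [r[0] for r in conn.execute(
    "SELECT v FROM t1 UNION ALL SELECT v FROM t2 ORDER BY v")]

print(union_rows)      # [1, 2, 3, 4]
print(union_all_rows)  # [1, 2, 3, 3, 4]
```

If the two branches can never overlap, both forms return the same rows and UNION ALL avoids the de-duplication work.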

10. Avoid type conversion

A. The "type conversion" meant here is the conversion that occurs when the type of a column in the WHERE clause does not match the type of the incoming parameter. Converting the column explicitly with a conversion function directly prevents MySQL from using the index. If a conversion is unavoidable, apply it to the incoming parameter instead.



11. Don’t operate on columns

A. For example, SELECT * FROM users WHERE YEAR(adddate) < 2007; evaluates the function on every row, which defeats the index and forces a full table scan. We can change it to:

SELECT * FROM users WHERE adddate < '2007-01-01';
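The effect of wrapping the column in a function can be checked by comparing query plans. The sketch below assumes SQLite as a stand-in for MySQL, so strftime('%Y', ...) plays the role of YEAR(); the index name idx_adddate, the name column, and the sample dates are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; strftime('%Y', ...) replaces YEAR().
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, adddate TEXT);
    CREATE INDEX idx_adddate ON users (adddate);
    INSERT INTO users (name, adddate) VALUES
        ('u1', '2005-06-30'), ('u2', '2006-12-31'),
        ('u3', '2007-01-01'), ('u4', '2008-03-15');
""")

wrapped = ("SELECT * FROM users "
           "WHERE CAST(strftime('%Y', adddate) AS INTEGER) < 2007")
bare = "SELECT * FROM users WHERE adddate < '2007-01-01'"

def plan(sql):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

# Both forms select the same rows...
wrapped_ids = sorted(r[0] for r in conn.execute(wrapped))
bare_ids = sorted(r[0] for r in conn.execute(bare))

# ...but only the bare column comparison can search the index.
wrapped_plan = plan(wrapped)
bare_plan = plan(bare)

print(wrapped_ids, bare_ids)  # [1, 2] [1, 2]
print(wrapped_plan)
print(bare_plan)
```

The plan for the bare comparison names idx_adddate, while the function-wrapped version scans the table, which mirrors what EXPLAIN would show in MySQL for the two statements.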



12. Try not to use NOT IN and <>

A. NOT IN and <> often cannot use an index and fall back to a full table scan. NOT IN can be replaced by NOT EXISTS, and id <> 3 can be written as id > 3 OR id < 3. If the NOT EXISTS is a subquery, convert it to an outer join or an equi-join where possible, depending on the business logic of the specific SQL.

B. Convert NOT IN to LEFT JOIN, for example:

SELECT * FROM customerinfo WHERE CustomerID NOT IN (SELECT CustomerID FROM salesinfo);

Optimization:

SELECT * FROM customerinfo LEFT JOIN salesinfo ON customerinfo.CustomerID = salesinfo.CustomerID WHERE salesinfo.CustomerID IS NULL;
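The two forms can be compared side by side on a small data set. This sketch assumes SQLite as a stand-in for MySQL; the name and SaleID columns and the sample rows are hypothetical, and salesinfo.CustomerID is declared NOT NULL because a NULL in the subquery would make NOT IN return no rows at all.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; extra columns and rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customerinfo (CustomerID INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE salesinfo (SaleID INTEGER PRIMARY KEY,
                            CustomerID INTEGER NOT NULL);
    INSERT INTO customerinfo VALUES (1, 'ann'), (2, 'bob'), (3, 'cid');
    INSERT INTO salesinfo (CustomerID) VALUES (1), (1), (3);
""")

# Customers with no sales, written with NOT IN.
not_in_rows = conn.execute("""
    SELECT CustomerID, name FROM customerinfo
    WHERE CustomerID NOT IN (SELECT CustomerID FROM salesinfo)
    ORDER BY CustomerID
""").fetchall()

# The same question as an anti-join: keep only rows with no match in salesinfo.
anti_join_rows = conn.execute("""
    SELECT customerinfo.CustomerID, customerinfo.name
    FROM customerinfo
    LEFT JOIN salesinfo ON customerinfo.CustomerID = salesinfo.CustomerID
    WHERE salesinfo.CustomerID IS NULL
    ORDER BY customerinfo.CustomerID
""").fetchall()

print(not_in_rows)     # [(2, 'bob')]
print(anti_join_rows)  # [(2, 'bob')]
```

Selecting only the customerinfo columns in the join form keeps the output shape identical to the NOT IN form, since SELECT * would also return the all-NULL salesinfo columns.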



13. Use bulk inserts to save round trips (preferably with stored procedures)

A. Try to use: INSERT INTO users (username, password) VALUES ('test1', 'pass1'), ('test2', 'pass2'), ('test3', 'pass3');
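From application code, the same multi-row statement can be built with placeholders so that all rows travel in one round trip. This is a minimal sketch assuming SQLite as a stand-in for MySQL; with a MySQL driver the placeholder style would typically be %s rather than ?.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; a MySQL driver would use %s placeholders.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")

rows = [("test1", "pass1"), ("test2", "pass2"), ("test3", "pass3")]

# Build one INSERT ... VALUES (?, ?), (?, ?), (?, ?) statement.
placeholders = ", ".join(["(?, ?)"] * len(rows))
params = [value for row in rows for value in row]
conn.execute(
    f"INSERT INTO users (username, password) VALUES {placeholders}", params
)

count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(count)  # 3
```

Only the placeholder list is formatted into the SQL text; the values themselves are still passed as parameters, which keeps the statement safe from injection.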



14. Lock tables

A. Although transactions are a very good way to maintain database integrity, their exclusivity sometimes hurts database performance, especially in large application systems. While a transaction executes, the database is locked, so other user requests must wait until the transaction finishes. If only a few users use the database system, the impact is small; but with thousands of users accessing it at the same time, as on an e-commerce website, the result is a serious response delay. In some cases we can get better performance by locking the table instead. For example:

LOCK TABLE inventory WRITE;

SELECT quantity FROM inventory WHERE item = 'book';



UPDATE inventory SET quantity = 11 WHERE item = 'book';

UNLOCK TABLES;

Here we use a SELECT statement to fetch the initial data and, after some calculation, write the new value to the table with an UPDATE statement. A LOCK TABLE statement with the WRITE keyword ensures that no other session can insert into, update, or delete from inventory until the UNLOCK TABLES command is executed.



15. For multi-table join queries, create a view

A. Joining multiple tables may cause performance problems. We can create a view over the tables, which keeps operations simple and improves data security: through the view, users can query and modify only the specified data. The view also improves the logical independence of the tables, shielding users from changes in the underlying table structure.
