How to Write Good SQL

MySQL Performance

Maximum data volume
Talking about performance without considering data volume and concurrency is meaningless. MySQL places no limit on the number of records in a single table; the limit comes from the operating system's maximum file size.
The "Alibaba Java Development Manual" recommends sharding (splitting into multiple databases and tables) only when a single table exceeds 5 million rows or 2 GB in size. Performance is determined by a combination of factors. Leaving business complexity aside, the impact comes, in order, from hardware configuration, MySQL configuration, table design, and index optimization. The figure of 5 million is only a reference value, not an iron law.

A paged query for the latest 20 records took 0.6 seconds. The SQL statement was roughly:

select field_1, field_2 from table where id < #{prePageMinId} order by id desc limit 20

Here prePageMinId is the smallest ID among the records of the previous page.
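The paging pattern above can be sketched as follows. This is a minimal illustration, not the author's code: it uses Python's built-in sqlite3 with an in-memory database as a stand-in for MySQL, and the table name demo and the helper fetch_page are hypothetical.

```python
import sqlite3

# Hypothetical demo table; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, field_1 TEXT, field_2 TEXT)")
conn.executemany(
    "INSERT INTO demo (id, field_1, field_2) VALUES (?, ?, ?)",
    [(i, f"a{i}", f"b{i}") for i in range(1, 101)],
)

def fetch_page(conn, pre_page_min_id=None, page_size=20):
    """Return the next page, newest first, seeking by id instead of using an offset."""
    if pre_page_min_id is None:
        # First page: there is no previous page, so no upper bound on id.
        sql = "SELECT id, field_1, field_2 FROM demo ORDER BY id DESC LIMIT ?"
        params = (page_size,)
    else:
        sql = ("SELECT id, field_1, field_2 FROM demo "
               "WHERE id < ? ORDER BY id DESC LIMIT ?")
        params = (pre_page_min_id, page_size)
    return conn.execute(sql, params).fetchall()

first_page = fetch_page(conn)
# The smallest id of the previous page is the id of its last row.
second_page = fetch_page(conn, pre_page_min_id=first_page[-1][0])
```

Because each page starts from a known ID, the database only reads the rows it returns, however deep the page is.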

At the time the query speed was acceptable, but as the data keeps growing it will be overwhelmed one day. Sharding is a long and high-risk undertaking, so first try to optimize within the current structure, for example by upgrading hardware or migrating historical data, and shard only when there is really no other way. Readers interested in sharding can read up on its basic ideas.

Maximum concurrency
Concurrency is the number of requests the database can process at the same time. It is determined by max_connections and max_user_connections. max_connections is the maximum number of connections for the MySQL instance; the source gives an upper limit of 16384, and the exact limit depends on the MySQL version. max_user_connections is the maximum number of connections per database user.

MySQL allocates buffers for each connection, so more connections mean more memory consumed. If the connection limit is set too high, the hardware cannot cope; if it is set too low, the hardware is not fully used. The ratio of the two values below should generally exceed 10%, calculated as follows:

max_used_connections / max_connections * 100% = 3 / 100 * 100% = 3%
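The ratio above is simple enough to check in a monitoring script. The sketch below is only illustrative: the function name connection_utilization is hypothetical, and the inputs are assumed to come from the status variable Max_used_connections (SHOW GLOBAL STATUS LIKE 'Max_used_connections') and the system variable max_connections.

```python
def connection_utilization(max_used_connections: int, max_connections: int) -> float:
    """Peak connections in use as a percentage of the configured limit."""
    if max_connections <= 0:
        raise ValueError("max_connections must be positive")
    # Multiply first so that whole-number percentages stay exact.
    return max_used_connections * 100 / max_connections

ratio = connection_utilization(3, 100)
# Under the 10% guideline, a low ratio suggests the limit is higher than needed.
limit_probably_too_high = ratio <= 10
```

With the numbers from the example (3 used out of 100), the ratio is well below 10%, which by the guideline means max_connections could be lowered.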

View the maximum number of connections and the per-user maximum (the peak actually reached is reported separately by show global status like 'Max_used_connections'):

show variables like '%max_connections%';
show variables like '%max_user_connections%';

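Modify the maximum number of connections in the configuration file my.cnf (Max_used_connections is a read-only status variable, so the per-user limit set here is max_user_connections):

[mysqld]
max_connections = 100
max_user_connections = 20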

Keep queries under 0.5 seconds
It is recommended to keep a single query under 0.5 seconds. The 0.5 second figure is a rule of thumb derived from the three-second rule of user experience: if an operation gets no response within three seconds, the user becomes bored or even leaves. Response time = client UI rendering time + network request time + application processing time + database query time, and 0.5 seconds is the 1/6 share left for the database.

Implementation principles
Compared with NoSQL databases, MySQL is a delicate, fragile thing. It is like a frail student in gym class: it falls out with classmates over the smallest dispute (hard to scale out), is out of breath after running two steps (small capacity, low concurrency), and often calls in sick (too many SQL constraints). Nowadays everyone builds distributed systems, and scaling the application is far easier than scaling the database, so the guiding principle is: let the database do less and the application do more.

  • Make full use of indexes but do not abuse them; indexes also consume disk and CPU.
  • Do not use database functions to format data; leave that to the application.
  • Do not use foreign key constraints; let the application guarantee data accuracy.
  • In write-heavy, read-light scenarios, avoid unique indexes; let the application guarantee uniqueness.
  • Add redundant fields where appropriate and try creating intermediate tables; compute intermediate results in the application, trading space for time.
  • Do not run extremely time-consuming transactions; let the application split them into smaller transactions.
  • Estimate the load and data growth of important tables (such as the order table) and optimize in advance.

Data table design

Data types
The principle for choosing a data type: the simpler, or the smaller the footprint, the better.

  • If the range is sufficient, prefer the smaller integer types tinyint, smallint, and mediumint over int.
  • If the string length is fixed, use char.
  • If varchar is sufficient, do not use text.
  • For high-precision values use decimal; bigint can also be used, for example by storing a two-decimal value multiplied by 100.
  • Prefer timestamp to datetime.
    Compared with datetime, timestamp takes less space and is stored in UTC, with automatic time zone conversion.
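The "multiply by 100 and store as an integer" idea from the list above might look like this on the application side. It is a sketch under stated assumptions: the helpers to_cents and from_cents are hypothetical names, and inputs are assumed to carry at most two decimal places.

```python
from decimal import Decimal

def to_cents(amount: str) -> int:
    """Convert a two-decimal amount such as '19.99' to an integer number of cents."""
    # Assumes at most two decimal places; extra digits would be truncated.
    return int(Decimal(amount) * 100)

def from_cents(cents: int) -> Decimal:
    """Convert stored cents back to a two-decimal amount for display."""
    return Decimal(cents) / 100

stored = to_cents("19.99")    # the value that would go into a bigint column
total = stored * 3            # plain integer arithmetic, no rounding error
display = from_cents(total)
```

Parsing through Decimal rather than float avoids binary rounding when converting the input.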

Avoid NULL values
In MySQL a NULL field still takes up space, and it makes indexes and index statistics more complex. Updating a NULL value to a non-NULL value cannot be done in place, which easily causes index page splits and hurts performance. Wherever possible, replace NULL with a meaningful value; this also avoids is not null checks in SQL statements.

Optimizing text columns
Because text fields store large amounts of data, the table grows very quickly and drags down query performance on the other fields. It is recommended to move them out into a child table, linked by the business (natural) key.

Index Tuning

Index Classification

  • Ordinary index: the most basic index.
  • Composite index: an index built on several fields, which speeds up queries with compound conditions.
  • Unique index: similar to an ordinary index, but the values of the indexed column must be unique; NULL values are allowed.
  • Composite unique index: the combination of column values must be unique.
  • Primary key index: a special unique index that uniquely identifies a record in the table; NULL values are not allowed. It is usually created through a primary key constraint.
  • Full-text index: for querying large amounts of text. Both InnoDB and MyISAM support full-text indexes from MySQL 5.6 on. Because query precision and scalability are poor, more companies choose Elasticsearch.

Index Tuning

  • Paging queries are very important: as a rule of thumb, if a query touches more than about 30% of the data, MySQL will not use the index.
  • No more than 5 indexes per table, and no more than 5 fields per index.
  • Prefix indexes can be used for strings, with the prefix length kept to 5-8 characters.
  • Indexing a field with very low cardinality is pointless, for example an is-deleted flag or gender.

Make sensible use of covering indexes, as in the following query:

select login_name, nick_name from member where login_name = ?

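Building a composite index on the two fields login_name and nick_name is faster than a simple index on login_name alone, because the query can be answered from the index without going back to the table rows.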
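One way to confirm that a query is covered by an index is to look at the execution plan. The sketch below uses SQLite (via Python's sqlite3) as a stand-in, so the plan text differs from MySQL, where a covered query shows "Using index" in the Extra column of EXPLAIN. The index name idx_login_nick and the extra bio column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE member (id INTEGER PRIMARY KEY, login_name TEXT, nick_name TEXT, bio TEXT)"
)
# Composite index containing every column the query reads.
conn.execute("CREATE INDEX idx_login_nick ON member (login_name, nick_name)")

plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN SELECT login_name, nick_name FROM member WHERE login_name = ?",
    ("alice",),
).fetchall()
# The last column of each plan row is the human-readable detail string.
plan = " ".join(row[-1] for row in plan_rows)
uses_covering_index = "COVERING INDEX" in plan
```

If the select list also included bio, the index would no longer cover the query and the engine would have to read the table rows as well.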

SQL optimization

Batch processing
As a child, the blogger once watched a fish pond being drained through a small outlet, with all kinds of debris floating on the water. Duckweed and leaves always passed through the outlet, but branches blocked other objects and sometimes got stuck, so they had to be cleared by hand. MySQL is the fish pond, maximum concurrency and network bandwidth are the outlet, and user SQL is the floating debris.

Queries without paging parameters, and update or delete operations that affect large amounts of data, are the branches. We want to break them up and process them in batches. For example:

Business description: mark all of a user's expired coupons as unavailable.

SQL statement:

update `coupon` set status = 0 where expire_date <= #{currentDate} and status = 1;

If a large number of coupons need to be set to the unavailable state, executing this SQL may block other SQL. Pseudocode for batch processing:

int PAGE_SIZE = 100;
while (true) {
    // Always take the first PAGE_SIZE matching rows. Rows updated in the previous
    // round no longer match status = 1, so paging with an offset would skip rows.
    List<Integer> batchIdList = queryList('select id from `coupon` where expire_date <= #{currentDate} and status = 1 limit #{PAGE_SIZE}');
    if (CollectionUtils.isEmpty(batchIdList)) {
        return;
    }
    update('update `coupon` set status = 0 where status = 1 and id in #{batchIdList}');
}
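A runnable version of the batch loop can be sketched as below. It uses Python's sqlite3 with an in-memory database in place of MySQL; the function name expire_coupons_in_batches and the sample data are hypothetical. Note that it always selects the first batch of rows still matching status = 1 rather than advancing an offset, since updated rows drop out of the result set.

```python
import sqlite3

PAGE_SIZE = 100

def expire_coupons_in_batches(conn, current_date, page_size=PAGE_SIZE):
    """Mark expired coupons unavailable, at most page_size rows per transaction."""
    batches = 0
    while True:
        ids = [row[0] for row in conn.execute(
            "SELECT id FROM coupon WHERE expire_date <= ? AND status = 1 LIMIT ?",
            (current_date, page_size),
        )]
        if not ids:
            return batches
        placeholders = ",".join("?" * len(ids))
        conn.execute(
            f"UPDATE coupon SET status = 0 WHERE status = 1 AND id IN ({placeholders})",
            ids,
        )
        conn.commit()  # one short transaction per batch
        batches += 1

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE coupon (id INTEGER PRIMARY KEY, expire_date TEXT, status INTEGER)")
conn.executemany(
    "INSERT INTO coupon (expire_date, status) VALUES (?, 1)",
    [("2019-06-30",)] * 250 + [("2019-12-31",)] * 50,
)
batches = expire_coupons_in_batches(conn, "2019-07-01")
remaining_active = conn.execute("SELECT COUNT(*) FROM coupon WHERE status = 1").fetchone()[0]
```

Committing after each batch keeps every transaction short, so other statements are blocked for at most one batch at a time.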

Optimizing the <> operator
The <> operator usually cannot use an index. For example, the following query finds orders whose amount is not 100:

select id from orders where amount != 100;

If the data distribution is heavily skewed, for example when orders with an amount other than 100 are rare, the index may still be used. Given this uncertainty, combine the results with union instead, rewritten as follows (the second branch assumes amounts are positive):

(select id from orders where amount > 100)
 union all
(select id from orders where amount < 100 and amount > 0)

OR optimization
Under the InnoDB engine, or cannot use a composite index, for example:

select id,product_name from orders where mobile_no = '13421800407' or user_id = 100;

The or query cannot hit the composite index on mobile_no + user_id. Use union instead, as follows:

(select id,product_name from orders where mobile_no = '13421800407')
 union
(select id,product_name from orders where user_id = 100);
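The equivalence of the two forms can be checked on a small data set. This is an illustrative sketch using SQLite through Python's sqlite3, not MySQL; the sample rows and the index names idx_mobile and idx_user are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, product_name TEXT, mobile_no TEXT, user_id INTEGER)"
)
conn.execute("CREATE INDEX idx_mobile ON orders (mobile_no)")
conn.execute("CREATE INDEX idx_user ON orders (user_id)")
conn.executemany(
    "INSERT INTO orders (product_name, mobile_no, user_id) VALUES (?, ?, ?)",
    [
        ("book", "13421800407", 100),  # matches both conditions
        ("pen", "13421800407", 7),     # matches mobile_no only
        ("lamp", "13900000000", 100),  # matches user_id only
        ("desk", "13900000000", 8),    # matches neither
    ],
)

or_rows = conn.execute(
    "SELECT id, product_name FROM orders WHERE mobile_no = ? OR user_id = ? ORDER BY id",
    ("13421800407", 100),
).fetchall()

union_rows = conn.execute(
    "SELECT id, product_name FROM orders WHERE mobile_no = ? "
    "UNION "
    "SELECT id, product_name FROM orders WHERE user_id = ? "
    "ORDER BY id",
    ("13421800407", 100),
).fetchall()
```

union (rather than union all) matters here: the row that satisfies both conditions would otherwise appear twice.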

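In this form each branch can use its own index (on mobile_no and on user_id respectively). The query is most efficient when the id and product_name fields are covered by those indexes as well.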

IN optimization
IN suits a large main table with a small sub-table; EXISTS suits a small main table with a large sub-table. As the query optimizer keeps improving, the two perform almost the same in many scenarios.
Try replacing IN with a join. For example, the IN form:

select id from orders where user_id in (select id from user where level = 'VIP');

Using JOIN instead:

select o.id from orders o left join user u on o.user_id = u.id where u.level = 'VIP';
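As a quick sanity check, the subquery form with IN and the join form can be compared on sample data. The sketch below runs on SQLite via Python's sqlite3 as a stand-in for MySQL, with made-up rows. The two forms agree here because user.id is unique; a join on a non-unique column could return duplicate order rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, level TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER)")
conn.executemany(
    "INSERT INTO user (id, level) VALUES (?, ?)",
    [(1, "VIP"), (2, "NORMAL"), (3, "VIP")],
)
conn.executemany(
    "INSERT INTO orders (id, user_id) VALUES (?, ?)",
    [(10, 1), (11, 2), (12, 3), (13, 1), (14, 99)],  # order 14 has no matching user
)

in_rows = conn.execute(
    "SELECT id FROM orders "
    "WHERE user_id IN (SELECT id FROM user WHERE level = 'VIP') ORDER BY id"
).fetchall()

join_rows = conn.execute(
    "SELECT o.id FROM orders o LEFT JOIN user u ON o.user_id = u.id "
    "WHERE u.level = 'VIP' ORDER BY o.id"
).fetchall()
```

The where clause on u.level discards the unmatched rows that the left join produces, so in this query the left join behaves like an inner join.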

Avoid operations on columns
Applying an operation to an indexed column in the query condition usually makes the index unusable, as shown below.
Query the orders of a given day:

select id from orders where date_format(create_time,'%Y-%m-%d') = '2019-07-01';

The date_format function prevents the query from using the index. After rewriting:

select id from orders where create_time between '2019-07-01 00:00:00' and '2019-07-01 23:59:59';
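The difference between wrapping the column in a function and filtering on a plain range shows up in the execution plan. The sketch below uses SQLite through Python's sqlite3, so date() stands in for MySQL's date_format and the plan wording is SQLite's, not MySQL's. It also uses a half-open range (>= start, < next day) instead of between, a variation that still matches rows with fractional seconds; the index name idx_create_time is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, create_time TEXT)")
conn.execute("CREATE INDEX idx_create_time ON orders (create_time)")
conn.executemany(
    "INSERT INTO orders (create_time) VALUES (?)",
    [("2019-06-30 23:59:59",), ("2019-07-01 00:00:00",),
     ("2019-07-01 23:59:59",), ("2019-07-02 00:00:00",)],
)

def plan(sql, params=()):
    """Return SQLite's query plan detail text for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " ".join(row[-1] for row in rows)

func_sql = "SELECT id FROM orders WHERE date(create_time) = ?"
range_sql = "SELECT id FROM orders WHERE create_time >= ? AND create_time < ?"
func_params = ("2019-07-01",)
range_params = ("2019-07-01 00:00:00", "2019-07-02 00:00:00")

func_plan = plan(func_sql, func_params)      # a full scan: the function hides the column
range_plan = plan(range_sql, range_params)   # an index range search

func_ids = sorted(r[0] for r in conn.execute(func_sql, func_params))
range_ids = sorted(r[0] for r in conn.execute(range_sql, range_params))
```

Both statements return the same orders; only the range form lets the engine seek into the index on create_time.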

Avoid select *
If you do not need every column of the table, avoid SELECT *. It reads and transfers columns you do not need, it cannot be satisfied by a covering index, and it may even lead to a full table scan.

LIKE optimization
like is used for fuzzy queries, for example (field is indexed):

SELECT column FROM table WHERE field like '%keyword%';

This query does not hit the index. Change it to the following:

SELECT column FROM table WHERE field like 'keyword%';

With the leading % removed, the query hits the index. But what if the product manager insists on fuzzy matching on both sides? A fulltext index is worth a try, but Elasticsearch is the ultimate weapon.

Join optimization
join is implemented with the Nested Loop Join algorithm: the result set of the driving table serves as the base data, each of its rows is used as a filter condition to query the next table in a loop, and the results are then merged. With multiple joins, the preceding result set becomes the loop data for querying the next table.

  • Add as many query conditions as possible to both the driving table and the driven table; satisfy the ON condition and rely less on WHERE; drive the large result set with the small result set.
  • Put an index on the join field of the driven table; when an index cannot be created, set an adequate join buffer size.
  • Do not join more than three tables; try adding redundant fields instead.
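The nested loop idea, and why a small driving table plus an index on the driven table helps, can be illustrated in plain Python. This is a toy model, not MySQL's implementation: the function names are hypothetical, and a dict plays the role of the index on the driven table's join field.

```python
def nested_loop_join(driving_rows, driven_rows, driving_key, driven_key):
    """Naive nested loop: for each driving row, scan every driven row."""
    result, comparisons = [], 0
    for outer in driving_rows:
        for inner in driven_rows:
            comparisons += 1
            if outer[driving_key] == inner[driven_key]:
                result.append({**outer, **inner})
    return result, comparisons

def index_nested_loop_join(driving_rows, driven_rows, driving_key, driven_key):
    """Same join, but the driven table is probed through a dict acting as an index."""
    index = {}
    for inner in driven_rows:
        index.setdefault(inner[driven_key], []).append(inner)
    result, lookups = [], 0
    for outer in driving_rows:
        lookups += 1  # one index probe per driving row
        for inner in index.get(outer[driving_key], []):
            result.append({**outer, **inner})
    return result, lookups

# Small driving set (2 users) against a larger driven set (20 orders).
users = [{"user_id": 1, "level": "VIP"}, {"user_id": 2, "level": "VIP"}]
orders = [{"order_id": i, "user_id": i % 5} for i in range(1, 21)]

naive_rows, comparisons = nested_loop_join(users, orders, "user_id", "user_id")
indexed_rows, lookups = index_nested_loop_join(users, orders, "user_id", "user_id")
```

The work of the naive version grows with the product of the two row counts, while the indexed version does one probe per driving row, which is why the driving result set should be the small one.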

Limit optimization
With limit-based paging, performance gets worse the further back you page. The principle of the solution is to narrow the scan range, as shown below:

select * from orders order by id desc limit 100000,10

It takes 0.4 seconds

select * from orders order by id desc limit 1000000,10

It takes 5.2 seconds

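First locate the boundary ID to narrow the query range, written as follows (with descending order, the page consists of the rows whose id is less than or equal to the id at the offset):

select * from orders where id <= (select id from orders order by id desc limit 1000000, 1) order by id desc limit 0,10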

It takes 0.5 seconds
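To check that the subquery form returns the same page as the plain offset form, the two can be compared on a small table. The sketch runs on SQLite via Python's sqlite3 rather than MySQL, uses LIMIT ... OFFSET ... syntax, and a made-up table of 1000 rows; it compares results only and says nothing about timings. In descending order the boundary comparison has to be <= for the two pages to match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, note TEXT)")
conn.executemany(
    "INSERT INTO orders (id, note) VALUES (?, ?)",
    [(i, f"order {i}") for i in range(1, 1001)],
)

offset, page_size = 900, 10

# Plain offset paging: the engine walks past `offset` rows before returning any.
offset_rows = conn.execute(
    "SELECT * FROM orders ORDER BY id DESC LIMIT ? OFFSET ?",
    (page_size, offset),
).fetchall()

# Seek form: find the boundary id through the primary key, then read one page from it.
seek_rows = conn.execute(
    "SELECT * FROM orders "
    "WHERE id <= (SELECT id FROM orders ORDER BY id DESC LIMIT 1 OFFSET ?) "
    "ORDER BY id DESC LIMIT ?",
    (offset, page_size),
).fetchall()
```

The subquery touches only the primary key, so skipping the offset rows is cheaper than skipping full rows in the outer query.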

If the only query condition is the primary key ID, write it as follows (this assumes the IDs are contiguous):

select id from orders where id between 1000000 and 1000010 order by id desc

It takes 0.3 seconds

What if the approaches above are still too slow? Then the only option left is a cursor; interested readers can read about implementing paged queries with a JDBC cursor.

Other databases

As a back-end developer, you must be proficient in MySQL or SQL Server as the core storage, and you should also take an active interest in NoSQL databases: they are mature and widely adopted enough to solve performance bottlenecks in specific scenarios.

Origin blog.csdn.net/weixin_45794138/article/details/104897365