MySQL's classic "too many connections" error: how to fix it

 

About the author

Lan Chun, senior DBA at 58 Daojia, focusing on MySQL operations and maintenance.

 

First, what is too many connections

 

1, important parameters

 

max_user_connections: The maximum number of simultaneous connections permitted to any given MySQL user account

This is the maximum number of connections allowed per user. If it is exceeded, the client gets: ERROR 1203 (42000): User dba already has more than 'max_user_connections' active connections.

 

This kind of error is usually reported only on the application hosts and is not logged at the DB server layer, so the DBA cannot readily notice it. For connections refused because of max_connections, MySQL provides a handy status variable for DBAs to check: Connection_errors_max_connections

 

Connection_errors_max_connections: The number of connections refused because the server max_connections limit was reached.

Observant readers will notice a gap: when the 'max_user_connections' error occurs, there is no way to see it on the server. So far I have not found a corresponding status variable.
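To make these counters concrete, here is a minimal Python sketch. It assumes the values have already been fetched with SHOW GLOBAL VARIABLES LIKE 'max_connections' and SHOW GLOBAL STATUS LIKE 'Threads_connected' / 'Connection_errors_max_connections'. The function name connection_headroom and the 0.8 warning ratio are illustrative assumptions, not anything MySQL defines.

```python
def connection_headroom(variables, status, warn_ratio=0.8):
    """Summarize how close the server is to max_connections.

    `variables` and `status` are plain dicts holding the string values
    returned by SHOW GLOBAL VARIABLES / SHOW GLOBAL STATUS.
    `warn_ratio` is an arbitrary threshold chosen for this sketch.
    """
    max_conn = int(variables["max_connections"])
    connected = int(status["Threads_connected"])
    refused = int(status.get("Connection_errors_max_connections", 0))
    ratio = connected / max_conn
    return {
        "usage_ratio": round(ratio, 2),
        "near_limit": ratio >= warn_ratio,
        "refused_so_far": refused,
    }


# Made-up sample values, for illustration only.
report = connection_headroom(
    {"max_connections": "1000"},
    {"Threads_connected": "950", "Connection_errors_max_connections": "12"},
)
print(report)
```

A growing refused_so_far between two samples suggests that clients are currently being turned away, even if nothing shows up in the server log.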

 

Second, under what circumstances does too many connections occur

 

1, caused by slow queries

 

  • Truly slow: the query itself really is slow

  • Disguised slow: the query itself is not slow, but is slowed down by other factors

 

2, caused by idle sleeping connections

 

  • The connection runs no query at all and just sleeps. This is generally because the application code does not actively release connections in time.
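A quick way to tell the two situations apart is to count connections by their Command value. The sketch below assumes the rows of SHOW PROCESSLIST have already been loaded into a list of dicts. The helper name count_by_command and the sample rows are illustrative.

```python
from collections import Counter


def count_by_command(processlist):
    """Count connections per Command value (for example Sleep or Query)."""
    return Counter(row["Command"] for row in processlist)


# Made-up rows in the shape of SHOW PROCESSLIST output.
rows = [
    {"Id": 1, "Command": "Sleep", "Time": 1200},
    {"Id": 2, "Command": "Sleep", "Time": 800},
    {"Id": 3, "Command": "Query", "Time": 2},
    {"Id": 4, "Command": "Sleep", "Time": 30},
]
counts = count_by_command(rows)
print(counts.most_common())
```

If Sleep dominates, the idle-connection case is the likelier one. If Query dominates, look at slow queries first.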

 

Third, real cases

 

1, TMC caused by idle sleeping connections (TMC is short for too many connections)

 

Cause

 

Because the code does not actively release connections in time, a large number of sleeping connections accumulate on the DB server, and the error is raised once they exceed max_connections.

 

Solution

 

(1) If this error is not resolved in time, subsequent business requests will be unable to connect to the database, and the impact is very wide.

 

(2) So the first thing to do is protect the database by killing these sleeping connections. There is quite a lot to say about how to kill them:

 

  • Killing by hand simply cannot get the job done, because the business keeps creating such sleeping connections without end.

  • Writing your own script that kills them every second is of course feasible. But we did run into a very extreme case in which MySQL could not even respond to the kill request.

  • So a more reliable solution is to set wait_timeout: MySQL then does this large and difficult job for you automatically, and the kill is sure to succeed.
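The script approach from the list above can be sketched as follows. This is only an illustration: it takes processlist rows as dicts and returns KILL statements as strings rather than executing anything. The name kill_statements, the 600-second threshold, and the protected account "monitor" are all assumptions made for the example.

```python
def kill_statements(processlist, max_idle_seconds, protected_users=("monitor",)):
    """Build KILL statements for connections that have slept too long.

    `protected_users` is a hypothetical allow-list of accounts that
    should never be killed (monitoring, replication and so on).
    """
    statements = []
    for row in processlist:
        if row["Command"] != "Sleep":
            continue  # only idle connections are candidates
        if row["User"] in protected_users:
            continue
        if row["Time"] > max_idle_seconds:
            statements.append(f"KILL {int(row['Id'])};")
    return statements


# Made-up rows in the shape of SHOW PROCESSLIST output.
rows = [
    {"Id": 101, "User": "app", "Command": "Sleep", "Time": 900},
    {"Id": 102, "User": "app", "Command": "Query", "Time": 950},
    {"Id": 103, "User": "app", "Command": "Sleep", "Time": 5},
    {"Id": 104, "User": "monitor", "Command": "Sleep", "Time": 5000},
]
to_kill = kill_statements(rows, max_idle_seconds=600)
print(to_kill)
```

The wait_timeout route needs no script at all, for example set global wait_timeout = 600; (the value is an assumption). As far as I know, a global change only applies to connections opened afterwards, so existing sleeping connections keep their old session value.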

 

(3) Once the steps above are done, the database is protected from being overwhelmed, and you have room to do the follow-up work: the business side must still deal with these sleeping connections at the source.

 

  • The business team investigates why connections are not being released.

  • Where possible, the DBA assists by providing the top client IPs by traffic during the TMC period, so that the business side can locate which service is misbehaving.
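One simple way to produce such a top-IP list is to group the processlist by client address. The sketch below assumes the Host column has the usual ip:port form. The function name top_client_ips and the sample addresses are illustrative.

```python
from collections import Counter


def top_client_ips(processlist, n=3):
    """Return the n client IPs holding the most connections."""
    counts = Counter()
    for row in processlist:
        host = row.get("Host") or ""
        ip = host.rsplit(":", 1)[0]  # drop the trailing :port if present
        if ip:
            counts[ip] += 1
    return counts.most_common(n)


# Made-up rows in the shape of SHOW PROCESSLIST output.
rows = [
    {"Id": 1, "Host": "10.0.0.5:53412"},
    {"Id": 2, "Host": "10.0.0.5:53413"},
    {"Id": 3, "Host": "10.0.0.5:53414"},
    {"Id": 4, "Host": "10.0.0.9:41000"},
    {"Id": 5, "Host": "10.0.0.9:41001"},
    {"Id": 6, "Host": "localhost"},
]
top = top_client_ips(rows, n=2)
print(top)
```

This only counts connections that are open at the moment of sampling. Sampling repeatedly during the incident gives a fairer picture.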

 

(4) Enabling the thread_pool feature might solve this problem, but for various reasons we did not use it:

 

  • The official MySQL Community Edition does not support it

  • It cannot solve TMC caused by slow queries

  • The component may introduce problems of its own

 

2, TMC caused by slow queries

 

(1) First, the truly slow query

 

This case is generally very clear: find the query and optimize it, provided of course that your database is still alive.

 

We usually have SQL firewall protection, which greatly reduces this risk. To learn what the SQL firewall is, stay tuned for the next article.

 

(2) The disguised slow query

 

Now we finally come to the most difficult failure scenario.


The difficulty: because the queries are not truly slow, it is hard to find anything to optimize. Prescribing the right remedy means first identifying the right symptoms, and that is the hard part.


Without further ado, here is a real case we ran into some time ago.

Symptoms

 

  1. too many connections errors

  2. Threads_running is very high

  3. Almost no problematic queries, and no obvious slow queries

  4. Almost every statement becomes very slow

  5. Server IO pressure is not high

Failure Analysis

 

  Important parameters in detail

 

I will not repeat the official documentation here. What follows is my own understanding.

 

  • innodb_thread_concurrency: the number of threads allowed inside the InnoDB storage engine at once; when it is full, further threads have to queue

  • innodb_thread_sleep_delay: how long a thread sleeps before joining the queue to enter InnoDB

  • innodb_adaptive_max_sleep_delay: the upper limit for the adaptively adjusted sleep time

  • innodb_concurrency_tickets: once a thread enters InnoDB it is given a number of tickets; while it still has tickets it can re-enter InnoDB freely without queueing; in theory, once they run out it has to queue again (we later found that this mechanism is not strictly observed)
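To see why a concurrency limit alone can make fast queries look slow, here is a toy queueing model in Python. It is a deliberate simplification: it assumes strict first-come-first-served admission and ignores tickets and sleep delays entirely, so it is not how InnoDB actually schedules threads. The function name simulate_fifo and all numbers are illustrative.

```python
import heapq


def simulate_fifo(arrival_ms, service_ms, concurrency):
    """Return per-query latency (queue wait + execution) in milliseconds."""
    # Min-heap holding the time at which each slot becomes free.
    free_at = [0] * concurrency
    heapq.heapify(free_at)
    latencies = []
    for arrived in sorted(arrival_ms):
        slot_free = heapq.heappop(free_at)
        started = max(arrived, slot_free)  # wait if every slot is busy
        finished = started + service_ms
        heapq.heappush(free_at, finished)
        latencies.append(finished - arrived)
    return latencies


# Five queries of 10 ms each, all arriving at the same instant.
burst = [0, 0, 0, 0, 0]
strict = simulate_fifo(burst, 10, concurrency=1)
roomy = simulate_fifo(burst, 10, concurrency=5)
print(strict)
print(roomy)
```

With a limit of 1, the last query in the burst sees five times the latency of the first even though every query takes the same 10 ms to execute. With five slots, all of them finish in 10 ms.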

 

Failure to reproduce the test

 

Table Structure

 

The key parameter settings

 

set global innodb_thread_concurrency = 1; -- set to 1 to make the simulation easy

Test case: three statements whose execution start times are about one second apart

 

Tracking Results

 

 

Summary

 

1. The analysis and test results above show that once the number of queries exceeds innodb_thread_concurrency, the remaining queries have to wait. Even a query that is itself very fast still waits. This is what we call a disguised slow query.

 

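2. Analysis of trx_started and now() shows that the switching between these queries is not a truly even, fair allocation. There is an adaptive algorithm of its own inside. I did not dig any deeper here; interested readers can go on to study the source code.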

 

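3. With the real cause found, the solution came quickly: reduce the number of concurrent threads. Our omega platform makes it easy to see which queries and connections were most frequent during this period, so we worked through the business scenario and optimization plan together with the business team, and the problem was resolved.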
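One way to confirm this kind of queueing on a live server is to look at the InnoDB status output. The sketch below assumes the text of SHOW ENGINE INNODB STATUS contains a line of the form "N queries inside InnoDB, M queries in queue", as seen in 5.6/5.7-era output. The exact wording may differ or be absent in other versions, and the helper name innodb_queue_depth and the sample excerpt are illustrative.

```python
import re

# Assumed line format; verify it against your own server's output.
QUEUE_LINE = re.compile(r"(\d+) queries inside InnoDB, (\d+) queries in queue")


def innodb_queue_depth(status_text):
    """Return (inside, queued) parsed from InnoDB status text, or None."""
    match = QUEUE_LINE.search(status_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Illustrative excerpt only; real output is much longer.
sample = """
--------------
ROW OPERATIONS
--------------
1 queries inside InnoDB, 57 queries in queue
"""
depth = innodb_queue_depth(sample)
print(depth)
```

A large queued count alongside a small inside count, with no obviously slow statements, matches the disguised slow query pattern described in this case.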

 

Source: Yunqi Community (云栖社区), reproduced with permission

Link: https://yq.aliyun.com/articles/226984?spm=5176.8091938.0.0.nCksaV

Origin www.cnblogs.com/pejsidney/p/11138463.html