Fixing "blocked for more than 120 seconds" (hung_task_timeout_secs) on Linux

 

The Linux system stops responding, and /var/log/messages fills with a large number of "blocked for more than 120 seconds" errors, each followed by the hint: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

Cause:

By default, Linux allows up to 40 percent of available memory to be used for dirty file system cache (vm.dirty_ratio; this was the default on the kernels discussed here, and newer kernels default to a lower value). Once that threshold is exceeded, the kernel flushes the outstanding cached data to disk and subsequent I/O requests become synchronous. The 120 seconds comes from kernel.hung_task_timeout_secs: a task that stays blocked for longer than this is reported in the log. The problem appears when the I/O subsystem is not fast enough to write out the cached data within 120 seconds. As I/O responses slow down, more and more requests pile up, eventually tying up system memory and leaving the system unresponsive.
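
To get a feel for the numbers, the sketch below estimates how much dirty data can accumulate before each threshold is reached. It is a rough illustration only: the helper name dirty_limit_kb is made up for this example, and it uses MemTotal from /proc/meminfo as a stand-in, whereas the kernel works from available memory, so the real limits are somewhat lower.

```shell
#!/bin/sh
# Hypothetical helper: estimate a dirty-page limit in kB.
# $1 = memory size in kB, $2 = ratio in percent (e.g. vm.dirty_ratio).
dirty_limit_kb() {
    echo $(( $1 * $2 / 100 ))
}

# Approximation: MemTotal is used here; the kernel uses "available" memory.
if [ -r /proc/meminfo ] && [ -r /proc/sys/vm/dirty_ratio ] &&
   [ -r /proc/sys/vm/dirty_background_ratio ]; then
    mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    ratio=$(cat /proc/sys/vm/dirty_ratio)
    bg_ratio=$(cat /proc/sys/vm/dirty_background_ratio)
    echo "background writeback starts near: $(dirty_limit_kb "$mem_kb" "$bg_ratio") kB"
    echo "writers are throttled near:       $(dirty_limit_kb "$mem_kb" "$ratio") kB"
fi
```

For example, on a machine with 64 GB of RAM (67108864 kB), a ratio of 40 allows roughly 25.6 GB of dirty data to build up, while a ratio of 10 caps it at about 6.4 GB, which is far less for a slow disk to flush in one go.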

Solution:

Depending on the workload, tune the two parameters vm.dirty_ratio and vm.dirty_background_ratio. For example, the following settings are recommended:
# sysctl -w vm.dirty_ratio=10
# sysctl -w vm.dirty_background_ratio=5
# sysctl -p

To make the change permanent, edit /etc/sysctl.conf and add the following two lines:
# vi /etc/sysctl.conf

vm.dirty_background_ratio = 5
vm.dirty_ratio = 10

Then reload the file:
# sysctl -p
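
To check that the new values are active and to see how much dirty data is currently waiting to be written, something like the following can be used. The function name show_dirty is only for this example; Dirty and Writeback are standard fields in /proc/meminfo.

```shell
#!/bin/sh
# Print the Dirty and Writeback lines from a meminfo-style file
# (defaults to /proc/meminfo).
show_dirty() {
    awk '/^(Dirty|Writeback):/ {print $1, $2, $3}' "${1:-/proc/meminfo}"
}

# Current settings as the kernel sees them.
for f in /proc/sys/vm/dirty_ratio /proc/sys/vm/dirty_background_ratio; do
    if [ -r "$f" ]; then
        echo "$f = $(cat "$f")"
    fi
done

# Amount of data not yet written back to disk.
if [ -r /proc/meminfo ]; then
    show_dirty
fi
```

Running it repeatedly during heavy writes (for instance under watch) shows whether the Dirty value stays small after lowering the ratios.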

 

I recently ran into this problem:

The load on an Apache server soared to more than 700, so it could no longer serve HTTP requests normally. Looking at the system log at that point, I found:

Jan  4 09:57:03 locasv107 kernel: INFO: task httpd:18463 blocked for more than 120 seconds.
Jan  4 09:57:03 locasv107 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

along with many similar warning entries.
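
When the log contains many of these entries, it helps to see which programs are affected most often. The following is a minimal sketch: count_hung_tasks is a made-up helper name, and /var/log/messages is an assumed log location that differs between distributions.

```shell
#!/bin/sh
# Hypothetical helper: count "blocked for more than" warnings per task name.
# Reads syslog-style lines on stdin, prints "count name", most frequent first.
count_hung_tasks() {
    sed -n 's/.*INFO: task \(.*\):[0-9][0-9]* blocked for more than.*/\1/p' |
        sort | uniq -c | sort -rn
}

# Assumed log location; adjust for your distribution.
if [ -r /var/log/messages ]; then
    count_hung_tasks < /var/log/messages
fi
```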

After looking into it, I learned that the message is about a task that has stayed blocked (pending) past a timeout.
The error message itself offers a simple workaround, which is to disable the 120-second check: echo 0 > /proc/sys/kernel/hung_task_timeout_secs
I then asked the hosting engineer, whose suggestion was the same: disable the warning as the message itself proposes.
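
Before silencing the warning, it can be worth checking the current threshold and which processes are actually blocked. This is a sketch under the assumption that ps accepts the stat, pid and comm output fields (procps does); filter_d_state is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: keep only processes in uninterruptible sleep (state D)
# from "ps -eo stat,pid,comm"-style input on stdin. Prints "pid command"
# (first word of the command only).
filter_d_state() {
    awk '$1 ~ /^D/ {print $2, $3}'
}

# Current detector threshold in seconds (0 means the check is disabled).
if [ -r /proc/sys/kernel/hung_task_timeout_secs ]; then
    echo "hung_task_timeout_secs = $(cat /proc/sys/kernel/hung_task_timeout_secs)"
fi

# Processes currently blocked, typically waiting on I/O.
if command -v ps >/dev/null 2>&1; then
    ps -eo stat,pid,comm | filter_d_state
fi
```

Writing 0 to hung_task_timeout_secs only hides the message; processes that show up here repeatedly are still waiting on slow I/O.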

After further questions, the following explanation was given:
This is a known bug. By default Linux uses up to 40% of the available memory for file system caching.
After this mark has been reached, the file system flushes all outstanding data to disk, causing all following IOs to go synchronous.
For flushing this data out to disk there is a time limit of 120 seconds by default.
In the case here, the IO subsystem is not fast enough to flush the data within 120 seconds.
This especially happens on systems with a lot of memory.

The problem is solved in later kernels and there is no "fix" from Oracle.
I fixed this by lowering the mark for flushing the cache from 40% to 10% by setting "vm.dirty_ratio=10" in /etc/sysctl.conf.
This setting does not influence overall database performance, since you hopefully use Direct IO and bypass the file system cache completely.

In other words: Linux by default lets up to 40% of available memory be used for the file system cache; flushing that much data at once makes I/O synchronous and runs past the 120-second timeout. Lowering the mark from 40% to 10% avoids the timeout.

 

Reprinted from: https://www.cnblogs.com/wshenjin/p/7093505.html

https://blog.csdn.net/yanggd1987/article/details/42388421

Origin www.cnblogs.com/xibuhaohao/p/11096163.html