Node.js memory overflow: resolving FATAL ERROR: CALL_AND_RETRY_0 Allocation failed - process out of memory (and connect ECONNREFUSED 127.0.0.1:80)

  After the SPA project was migrated wholesale to SSR and all the changes were deployed, traffic happened to spike one day, and the node process kept crashing and automatically restarting.

  After stress testing, we concluded it was a heap memory overflow; the reported error was: FATAL ERROR: CALL_AND_RETRY_0 Allocation failed - process out of memory

1. Reproducing the problem

  Using JMeter for stress testing, we sent 50 requests per second and watched the node process's memory.

  Under sustained requests, the node process's memory kept climbing until it reached about 1.4 GB, then stopped rising, because on a 64-bit system V8 gives a node process a default heap limit of roughly 1.4 GB (about 0.7 GB on 32-bit systems).
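A simple way to watch the heap from inside the process (rather than only from the OS) is to log `process.memoryUsage()` on an interval; the one-second interval below is an arbitrary choice for this sketch:

```javascript
// Log the node process's memory once per second so growth is visible
// while the stress test runs. Values are printed in megabytes.
const MB = 1024 * 1024;

function logMemory() {
  const { rss, heapTotal, heapUsed } = process.memoryUsage();
  console.log(
    `rss=${(rss / MB).toFixed(1)}MB ` +
      `heapTotal=${(heapTotal / MB).toFixed(1)}MB ` +
      `heapUsed=${(heapUsed / MB).toFixed(1)}MB`
  );
  return heapUsed;
}

// Sample once per second; unref() so this timer alone does not keep
// the process alive after the server shuts down.
setInterval(logMemory, 1000).unref();
logMemory(); // one immediate sample
```

Watching `heapUsed` climb steadily under load (and never come back down after garbage collection) is the in-process symptom of the leak described here.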

  After reaching 1.4 GB, the process held on for another minute or two, then crashed with the heap overflow error: FATAL ERROR: CALL_AND_RETRY_0 Allocation failed - process out of memory
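The ~1.4 GB default can be raised with V8's `--max-old-space-size` flag (value in MB). This is only a stopgap: with a real leak it merely delays the crash. `server.js` here is a placeholder for the actual SSR entry point:

```shell
# Raise V8's old-space heap limit beyond the ~1.4 GB 64-bit default.
# "server.js" is a placeholder for the real entry file; this buys headroom
# but does not fix a leak.
node --max-old-space-size=4096 server.js
```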

2. The troubleshooting process

  At first I did not know the cause. Since I had previously written up an error in this setup (Connect ECONNREFUSED 127.0.0.1:80 error troubleshooting), my first thought was that those refused connections were causing a large backlog of connections to accumulate, so I fixed that problem first.

  But after fixing it, the error persisted. Suspecting the page itself, I stress-tested a purely static page to see whether the page code was causing the memory leak. The purely static page behaved exactly the same way, so the cause was still unknown.

  Note: in fact we should have moved one level up and screened layer by layer. Before a request even reaches the page there is processing to consider, such as nuxt plugins, which are instantiated before the page is entered. We should have asked whether any of that processing could cause a memory leak.

  I had not considered that layer, so I was stuck in a blind spot and could only fall back on tooling: take memory snapshots and look for the points where memory was leaking.

  Later, moving up a level, I went through the plugins and did indeed find a leak point. Because this was a wholesale migration to SSR rather than a rebuild from scratch, the code structure had not received much attention. I found that a global interceptor imported the third-party axios and registered itself again on every interception, producing a huge number of references and eating resources. Once found, the fix was to remove the repeated registration and reference the same third-party resource once, after which the problem disappeared. I then replaced all duplicate references in the plugins with references to the same shared resource, which reduced memory usage further.
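The original plugin code is not shown, but the shape of the bug can be modeled without axios: an interceptor list on a shared, module-level instance that gets appended to once per request. This hypothetical sketch contrasts the leaky pattern with the one-time registration that fixed it:

```javascript
// Minimal model of the leak: "shared" stands in for a module-level axios
// instance whose interceptor list lives for the life of the process.
const shared = { interceptors: [] };

function leakyPerRequestSetup() {
  // BAD: in SSR this ran once per incoming request, so every request
  // appended another interceptor (plus everything it closes over) forever.
  shared.interceptors.push((config) => config);
}

function registerOnce() {
  // FIX: register the interceptor a single time and let every request
  // reuse the same already-configured instance.
  if (shared.interceptors.length === 0) {
    shared.interceptors.push((config) => config);
  }
  return shared;
}

// Simulate 1000 requests with the leaky pattern: the list grows unboundedly.
for (let i = 0; i < 1000; i++) leakyPerRequestSetup();
console.log('leaky:', shared.interceptors.length); // 1000 retained interceptors

// Reset and simulate 1000 requests with the fixed pattern: it stays at 1.
shared.interceptors.length = 0;
for (let i = 0; i < 1000; i++) registerOnce();
console.log('fixed:', shared.interceptors.length); // 1 interceptor
```

With real axios, the same rule applies: call `axios.interceptors.request.use(...)` in module scope (or guard it), never inside code that runs per request.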

  After the fix, I rebuilt and ran the JMeter stress test again. Over more than a million samples, the error rate was very low (0.1%), the node process's memory no longer climbed, and it held steady between 200 and 250 MB. The problem was resolved.

  I am recording this mainly to restate the troubleshooting approach, because the approach matters more than any particular fix: work from the bottom up, screening layer by layer. If the problem is not in the current layer, consider whether the layer just above it could be at fault; that way you can locate the issue quickly. Because I failed to consider one layer up, I took quite a few detours.


Origin www.cnblogs.com/goloving/p/11441054.html