A small log, a big pitfall | JD Cloud technical team

1. Background

During stress testing, single-machine QPS still hit a performance bottleneck after the thread pool had been tuned. Further investigation showed that both the default thread pool and logging had a serious impact on performance, which led to a series of logging optimizations.

2. Which scenarios may cause performance problems

Logs are an essential component of any system. They record how the system is running and provide the clues needed to troubleshoot problems. Most people agree that logs are important, but in which scenarios do they cause performance problems? This article looks at logging performance in Java.

2.1 Unreasonable writing method

 

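Most of us have come across three ways of writing a log statement in project code: building the message by string concatenation (format 1), passing arguments that have already been serialized to JSON (format 2), and wrapping the statement in a log-level check (format 3). Are they really different, and how does each affect performance? The difference shows up as soon as the DEBUG log level is turned off: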

Format 1 still performs the string concatenation even though nothing is output, which is wasted work.

Format 2 has the drawback that its arguments are serialized to JSON in advance, which also costs performance even when the log line is discarded.

Format 3 is therefore recommended: add a log switch (a level check before the log call) so that the message is assembled only when the statement will actually be written. Once the corresponding log level is turned off, there is no performance loss.
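The original code screenshot is not available here, so the following is only a reconstruction of the three formats under stated assumptions. It uses java.util.logging from the JDK so that it runs without dependencies: FINE stands in for DEBUG, `isLoggable(Level.FINE)` for slf4j's `isDebugEnabled()`, and `{0}` for slf4j's `{}`. The class `LogFormats`, the `toJson` stand-in serializer, and the counter are hypothetical names introduced for illustration.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class LogFormats {
    static final Logger LOG = Logger.getLogger("demo.log.formats");

    // Counts how often the (pretend) expensive message preparation runs.
    static int serializations = 0;

    // Hypothetical stand-in for a JSON serializer; not a real library call.
    static String toJson(Object value) {
        serializations++;
        return "{\"value\":\"" + value + "\"}";
    }

    // Format 1: the message is concatenated before the logger checks the level.
    static void format1(Object order) {
        LOG.fine("order received: " + toJson(order));
    }

    // Format 2: a placeholder is used, but toJson(order) is still evaluated
    // eagerly because it is an ordinary method argument.
    static void format2(Object order) {
        LOG.log(Level.FINE, "order received: {0}", toJson(order));
    }

    // Format 3: the level check (the "log switch") skips all of the work.
    static void format3(Object order) {
        if (LOG.isLoggable(Level.FINE)) {
            LOG.log(Level.FINE, "order received: {0}", toJson(order));
        }
    }

    public static void main(String[] args) {
        LOG.setLevel(Level.INFO); // debug-like output is switched off
        format1("A-1");
        format2("A-1");
        format3("A-1");
        System.out.println("serializations = " + serializations);
    }
}
```

With the level set to INFO nothing is logged, yet the first two formats still pay for the serialization; only the guarded call does no work.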

2.2 Unreasonable logs

Logging as much as possible makes it easy to trace a user request end to end and to pinpoint the problematic code. In today's distributed systems with complex business logic, any missing log is a serious obstacle when locating a problem. The temptation is therefore to log everything, but that volume has a performance cost. Print only the logs that are needed and give each one an appropriate level.

2.3 Log output format

The Log4j 2 documentation calls this location information: the class, method, and line in which the current log statement was issued.

Many patterns can be configured; see the official documentation at
https://logging.apache.org/log4j/2.x/manual/layouts.html#Patterns for details.

Here we only discuss the location-related ones: %C or %class, %F or %file, %l or %location, %L or %line, %M or %method.

The official documentation stresses repeatedly that these patterns affect performance, and it gives concrete figures: logging with location information is 1.3 to 5 times slower for synchronous loggers, and 30 to 100 times slower for asynchronous loggers.
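To see why location patterns are expensive, it helps to picture what a framework has to do to answer "which method and line logged this?". The following is a rough sketch, assuming the framework captures a stack trace and walks it to find the caller, as the Log4j 2 manual describes. `LocationDemo` and its methods are hypothetical and are not Log4j code.

```java
final class LocationDemo {
    // Captures a full stack trace just to learn who called us. Building the
    // trace is the expensive part, and its cost grows with the stack depth.
    static StackTraceElement callerLocation() {
        StackTraceElement[] stack = new Throwable().getStackTrace();
        // stack[0] is callerLocation itself; stack[1] is its direct caller.
        return stack[1];
    }

    // Plays the role of business code that emits a log line.
    static String placeOrder() {
        StackTraceElement where = callerLocation();
        return where.getMethodName() + ":" + where.getLineNumber();
    }

    public static void main(String[] args) {
        System.out.println("logged from " + placeOrder());
    }
}
```

Every log call that needs %M, %L, or a similar pattern has to do comparable work, which is why leaving those patterns out removes the cost entirely.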

3. How to avoid the impact of logs on performance

3.1 Dynamic adjustment of log level

Make good use of the DEBUG level! Project code tends to print a large number of INFO-level logs to support problem location and troubleshooting during testing. In production, however, most of these INFO logs serve no purpose, and the sheer volume eats CPU. The log level therefore needs to be adjustable at runtime, so that INFO logs can be viewed whenever they are needed and switched off when they are not, without affecting service performance.
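A minimal sketch of runtime level adjustment, using java.util.logging so that it runs on a plain JDK. The trigger (an admin endpoint or a configuration-center callback) is assumed and not shown, and `DynamicLevelDemo` and `changeLevel` are hypothetical names. Log4j 2's log4j-core module exposes `Configurator.setLevel` for the same purpose; check its documentation before relying on it.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class DynamicLevelDemo {
    static final Logger LOG = Logger.getLogger("demo.dynamic.level");

    // Would be invoked from an admin endpoint or a configuration-center
    // callback (both assumed); accepts names such as "INFO" or "WARNING".
    static void changeLevel(String levelName) {
        LOG.setLevel(Level.parse(levelName));
    }

    static boolean infoEnabled() {
        return LOG.isLoggable(Level.INFO);
    }

    public static void main(String[] args) {
        changeLevel("WARNING"); // normal production setting: INFO is off
        System.out.println("INFO enabled: " + infoEnabled());
        changeLevel("INFO");    // switched on temporarily for troubleshooting
        System.out.println("INFO enabled: " + infoEnabled());
    }
}
```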

3.2 Do not create useless logs

Keep log content as short as possible: do not print logs inside loops, summarize large lists instead of printing them in full, and do not print useless content.
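One possible way to summarize a large list is to log its size plus the first few elements. The helper below is a hypothetical sketch, not something from the original project.

```java
import java.util.Arrays;
import java.util.List;

final class LogSummaries {
    // Returns the list size and at most maxShown leading elements, so the
    // log line stays short no matter how large the list is.
    static String summarize(List<?> items, int maxShown) {
        int shown = Math.min(items.size(), Math.max(0, maxShown));
        return "size=" + items.size() + ", first " + shown + "=" + items.subList(0, shown);
    }

    public static void main(String[] args) {
        List<Integer> skuIds = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(summarize(skuIds, 2)); // size=5, first 2=[1, 2]
    }
}
```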

Do not print the stack trace of an exception whose cause is already obvious (for example, after catching a custom exception, print only the exception message).
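The idea can be sketched as follows, again with java.util.logging as a stand-in. `BusinessException` is a hypothetical custom exception representing an expected business failure; unexpected exceptions still keep their stack trace.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class KnownExceptionDemo {
    static final Logger LOG = Logger.getLogger("demo.known.exception");

    // Hypothetical custom exception for an expected business failure.
    static final class BusinessException extends RuntimeException {
        BusinessException(String message) {
            super(message);
        }
    }

    static void handle(Runnable task) {
        try {
            task.run();
        } catch (BusinessException e) {
            // Expected failure: the message says everything, so no stack trace.
            LOG.log(Level.WARNING, "request rejected: {0}", e.getMessage());
        } catch (RuntimeException e) {
            // Unexpected failure: keep the stack trace for troubleshooting.
            LOG.log(Level.SEVERE, "unexpected failure", e);
        }
    }

    public static void main(String[] args) {
        handle(() -> { throw new BusinessException("out of stock"); });
        handle(() -> { throw new IllegalStateException("boom"); });
    }
}
```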

3.3 Avoid string concatenation

String concatenation is an expensive operation in logging, especially inside a loop. Each concatenation creates a new string object, which wastes memory and time. Prefer placeholders, such as "{}" in the slf4j library, and pass the values in as parameters instead of concatenating strings.
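A small sketch of why placeholders help, assuming java.util.logging as a dependency-free stand-in (its placeholder is `{0}` where slf4j uses `{}`). The `Order` class and the counter are hypothetical; the counter only makes visible whether the object was turned into text.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class PlaceholderDemo {
    static final Logger LOG = Logger.getLogger("demo.placeholder");

    // Counts how often an Order is actually converted to a string.
    static int toStringCalls = 0;

    static final class Order {
        final long id;

        Order(long id) {
            this.id = id;
        }

        @Override
        public String toString() {
            toStringCalls++;
            return "Order#" + id;
        }
    }

    // Concatenation: toString() runs and a new String is built on every call.
    static void logWithConcatenation(Order order) {
        LOG.fine("processing " + order);
    }

    // Placeholder: the object is passed as is and only formatted if the
    // record is really going to be written.
    static void logWithPlaceholder(Order order) {
        LOG.log(Level.FINE, "processing {0}", order);
    }

    public static void main(String[] args) {
        LOG.setLevel(Level.INFO); // FINE (debug-like) output is off
        Order order = new Order(42L);
        logWithConcatenation(order);
        logWithPlaceholder(order);
        System.out.println("toString calls: " + toStringCalls);
    }
}
```

Note that a placeholder only defers the formatting of the argument; an argument that is itself an expensive expression is still evaluated before the call.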

3.4 Add a log switch

Add log switches where they are needed to avoid unnecessary work such as JSON serialization and string concatenation. (If a statement involves no such operations, there is no need for a switch; adding one only produces a pile of useless code.)
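Where explicit switches would clutter the code, a lazily evaluated message can achieve the same effect. The sketch below uses the `Supplier<String>` overload of java.util.logging's `Logger.fine`; the Log4j 2 API offers comparable lambda support, not shown here. `LazyMessageDemo` and `expensiveDump` are hypothetical names.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class LazyMessageDemo {
    static final Logger LOG = Logger.getLogger("demo.lazy.message");

    // Counts how often the expensive message is actually built.
    static int expensiveCalls = 0;

    // Hypothetical stand-in for costly work such as JSON serialization.
    static String expensiveDump() {
        expensiveCalls++;
        return "...large payload...";
    }

    // The lambda body runs only if FINE is enabled for this logger, so no
    // surrounding if-statement is needed.
    static void logLazily() {
        LOG.fine(() -> "payload: " + expensiveDump());
    }

    public static void main(String[] args) {
        LOG.setLevel(Level.INFO);
        logLazily();
        System.out.println("expensive calls while disabled: " + expensiveCalls);
    }
}
```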

3.5 Adjust log output format

Include the performance-sensitive location information in the output pattern only when it is actually needed, to reduce the performance loss.

3.6 Asynchronous printing of logs (choose carefully)

With synchronous logging, disk I/O can become the bottleneck and block a large number of threads. Asynchronous logging avoids this, but a failure on the asynchronous path may cause logs to be lost.
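The trade-off can be illustrated with a deliberately simplified model: business threads hand messages to a bounded in-memory queue and a writer drains it later. This is a sketch under assumptions, not how any particular framework is implemented. `AsyncLogSketch` is hypothetical, and it discards messages when the queue is full, which is only one of the policies a real asynchronous appender might apply.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

final class AsyncLogSketch {
    private final BlockingQueue<String> queue;
    private final AtomicLong dropped = new AtomicLong();

    AsyncLogSketch(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Called by business threads: never blocks on disk I/O, but the message
    // is discarded if the writer has fallen behind and the queue is full.
    boolean submit(String message) {
        boolean accepted = queue.offer(message);
        if (!accepted) {
            dropped.incrementAndGet();
        }
        return accepted;
    }

    // Called by the writer: moves everything queued so far to the sink.
    // Messages still in the queue when the process dies are lost.
    int drainTo(StringBuilder sink) {
        List<String> batch = new ArrayList<>();
        queue.drainTo(batch);
        for (String message : batch) {
            sink.append(message).append('\n');
        }
        return batch.size();
    }

    long droppedCount() {
        return dropped.get();
    }

    public static void main(String[] args) {
        AsyncLogSketch log = new AsyncLogSketch(2);
        log.submit("a");
        log.submit("b");
        log.submit("c"); // queue is full, so this one is dropped
        StringBuilder sink = new StringBuilder();
        System.out.println("written=" + log.drainTo(sink) + ", dropped=" + log.droppedCount());
    }
}
```

The business threads never wait for the disk, which is the benefit; the dropped counter and the undrained queue are the two places where logs can disappear, which is why this option should be chosen carefully.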

4. Optimization results

4.1 Before optimization (single machine at 80 QPS: performance is already unusable, with response times reaching 1500+ ms):

 

4.2 After optimization (single machine at 200 QPS, TP999 stable at 575 ms):

Author: JD Retail Wang Jun

Source: JD Cloud Developer Community. Please indicate the source when reprinting.


Origin my.oschina.net/u/4090830/blog/10320654