10 Tips for Using Logs Correctly

Source: macrochen

Link: http://macrochen.iteye.com/blog/1399082

Posted on February 11, 2012

As hard-working Java engineers, besides high-level concerns such as system architecture, we also need to understand "trivial" details such as language pitfalls, exception handling, and log output. In this article, JCP member Tomasz Nurkiewicz ( http://nurkiewicz.blogspot.com/ ) sums up 10 tips on how to use logging correctly (see the original text). Like "Worst practices in Java programming", it focuses on details, because logs are an important clue for troubleshooting problems and understanding the state of a system. I find it very useful for everyday coding, so I translated it into Chinese to deepen my own understanding and to serve as a reference for my work.

1) Choose the right open-source logging framework
To learn how the program is behaving, we usually print a log from the code:

log.info("Happy and carefree logging");



Of all the logging frameworks, I think the best is SLF4J. For example, in Log4J we would write:

log.debug("Found " + records + " records matching filter: '" + filter + "'");



Whereas in SLF4J we would write:

log.debug("Found {} records matching filter: '{}'", records, filter);



In terms of both readability and efficiency, SLF4J ( http://logback.qos.ch/ ) is better than Log4J: the Log4J version performs string concatenation and toString() calls even when the message ends up not being printed. Another advantage of {} is that we no longer need guards such as isDebugEnabled() for the different log levels, and we lose very little performance by leaving them out.
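
To make the point about {} concrete, here is a minimal JDK-only sketch of deferred formatting. It is not SLF4J's implementation: the class DeferredFormattingDemo, the Filter type, and the debug(boolean, ...) helper are hypothetical names used only for illustration, and the placeholder handling is deliberately simplified.

```java
import java.util.concurrent.atomic.AtomicInteger;

// JDK-only illustration of deferred formatting; NOT SLF4J's actual implementation.
class DeferredFormattingDemo {

    // Counts how often the argument is rendered to text.
    static final AtomicInteger TO_STRING_CALLS = new AtomicInteger();

    // Hypothetical argument type whose toString() we do not want to call needlessly.
    static final class Filter {
        @Override
        public String toString() {
            TO_STRING_CALLS.incrementAndGet();
            return "status=ACTIVE";
        }
    }

    // Replaces each "{}" with the next argument, left to right.
    static String format(String pattern, Object... args) {
        StringBuilder out = new StringBuilder();
        int from = 0;
        for (Object arg : args) {
            int at = pattern.indexOf("{}", from);
            if (at < 0) {
                break; // more arguments than placeholders: ignore the rest
            }
            out.append(pattern, from, at).append(arg);
            from = at + 2;
        }
        return out.append(pattern.substring(from)).toString();
    }

    // Stand-in for log.debug(pattern, args): formats only when the level is enabled.
    static String debug(boolean debugEnabled, String pattern, Object... args) {
        return debugEnabled ? format(pattern, args) : null;
    }

    public static void main(String[] args) {
        Filter filter = new Filter();
        debug(false, "Found {} records matching filter: '{}'", 42, filter);
        System.out.println("toString() calls while disabled: " + TO_STRING_CALLS.get());
        System.out.println(debug(true, "Found {} records matching filter: '{}'", 42, filter));
    }
}
```

With concatenation, the message string would be built before the logger is even called; here the arguments are only turned into text once the level check has passed.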

SLF4J is only an interface abstraction (a facade) over various logging implementations. The best implementation is Logback, which is more actively developed in the open-source community than its sibling Log4J (both were created by Ceki Gülcü).

The last tool to recommend is Perf4J ( http://perf4j.codehaus.org/ ). To sum it up in one sentence:
if Log4J is System.out.println(), then Perf4J is System.currentTimeMillis().

Perf4J makes it easier to log information about system performance. Other reporting tools can then present those logs as charts, so that we can analyze the system's performance bottlenecks. For how to use Perf4J, see its developer guide ( http://perf4j.codehaus.org/devguide.html ).

The following pom.xml fragment is a reference template for the logging dependencies:

<repositories>
    <repository>
        <id>Version99</id>
        <name>Version 99 Does Not Exist Maven repository</name>
        <layout>default</layout>
        <url>http://no-commons-logging.zapto.org/mvn2</url>
    </repository>
</repositories>
 
 
<dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-api</artifactId>
    <version>1.5.11</version>
</dependency>
<dependency>
    <groupId>ch.qos.logback</groupId>
    <artifactId>logback-classic</artifactId>
    <version>0.9.20</version>
</dependency>
<dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>jul-to-slf4j</artifactId>
    <version>1.5.11</version>
</dependency>
<dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>log4j-over-slf4j</artifactId>
    <version>1.5.11</version>
</dependency>
<dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>jcl-over-slf4j</artifactId>
    <version>1.5.11</version>
</dependency>
<dependency>
    <groupId>commons-logging</groupId>
    <artifactId>commons-logging</artifactId>
    <version>99.0-does-not-exist</version>
</dependency>



Here is the test code:

SLF4JBridgeHandler.install();
 
org.apache.log4j.Logger.getLogger("A").info("Log4J");
java.util.logging.Logger.getLogger("B").info("java.util.logging");
org.apache.commons.logging.LogFactory.getLog("C").info("commons-logging");
org.slf4j.LoggerFactory.getLogger("D").info("Logback via SLF4J");



With the code above, no matter which logging framework is used to emit a message, Logback does the work underneath; the reasons are explained here ( http://www.slf4j.org/legacy.html ). To keep commons-logging off the classpath, a small trick is used: the dependency version is set to 99.0-does-not-exist. This technique is described here ( http://day-to-day-stuff.blogspot.com/2007/10/announcement-version-99-does-not-exist.html ). The author of Log4J, however, considers the simplest approach to be excluding the commons-logging dependency directly; see here ( http://www.slf4j.org/faq.html#excludingJCL ).

2) Understand the correct log output level
Many programmers pay no attention to log levels, and some do not even know how to specify one. Compared with System.out, the two biggest advantages of a logging framework are that you can specify the output category and the level. For log levels, these are the principles to remember:
ERROR: A serious error has occurred and must be dealt with immediately; otherwise the system cannot continue to run. For example, an NPE or an unavailable database.

WARN: The system can continue to operate, but the situation needs attention. Such problems generally fall into two categories: an obvious problem (for example, data is not available), and a potential problem that needs attention or deserves a suggestion (for example, the system is running in safe mode, or the account accessing the system poses a security risk). In short, the system is still usable, but it is better to check and adjust.

INFO: An important piece of business logic has completed. Ideally, INFO messages should be understandable to advanced users and system administrators, who can learn the current operating state of the system from them. For example, in a flight reservation system, when a user completes a booking the message should say "who booked a flight from A to B". INFO should also be used when an operation significantly changes the state of the system (e.g. a database update or an excessive number of requests).

DEBUG: Mainly for developers; discussed further below.

TRACE: Detailed system information, mainly for developers. On a production system it can be treated as temporary output that can be switched off at any time. It is sometimes hard to distinguish DEBUG from TRACE; in general, if log output is added to a system that has already been developed and tested, it should be set to the TRACE level.

The above is only a suggestion; you can also define your own set of rules. But a good logging system must, first of all, let you adjust what gets logged quickly and flexibly as the situation requires.
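
The ordering of the levels can be sketched in a few lines. This is only an illustration: the LevelDemo class and its isEnabled helper are hypothetical, and the INFO threshold in main is an assumed example setting, not a recommendation from the original article.

```java
// Minimal sketch of how a level threshold decides what gets logged.
class LevelDemo {

    // Ordered from most to least verbose, mirroring the levels described above.
    enum Level { TRACE, DEBUG, INFO, WARN, ERROR }

    // A message is emitted when its level is at least as severe as the configured threshold.
    static boolean isEnabled(Level threshold, Level messageLevel) {
        return messageLevel.compareTo(threshold) >= 0;
    }

    public static void main(String[] args) {
        Level threshold = Level.INFO; // assumed example setting
        for (Level level : Level.values()) {
            System.out.println(level + " enabled: " + isEnabled(threshold, level));
        }
    }
}
```

Raising or lowering the threshold in configuration is what lets the output be adjusted without touching the code.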

The last thing to mention is the "notorious" is*Enabled() condition, as in the following:

if(log.isDebugEnabled())
    log.debug("Place for your commercial");



The performance gain from this is minimal (as explained earlier for SLF4J), and it amounts to over-optimization. It is rarely necessary, unless constructing the log message is itself very expensive. Finally, remember one thing: programmers should focus on the content of the log, and leave the decision of whether to output it to the logging framework.
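
For the rare case where building the message really is expensive, one alternative to an explicit guard is to pass the message lazily. The sketch below uses the JDK's own java.util.logging as a stand-in (its Logger.fine(Supplier<String>) overload is available since Java 8); treating FINE as roughly equivalent to DEBUG is an assumption of this example, and expensiveReport() is a hypothetical helper.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch: the message supplier is only evaluated when the level is enabled.
class LazyMessageDemo {

    // Counts how often the expensive message is actually built.
    static final AtomicInteger BUILDS = new AtomicInteger();

    // Hypothetical expensive message construction.
    static String expensiveReport() {
        BUILDS.incrementAndGet();
        return "report with many details";
    }

    // Logs once at FINE with the logger set to the given level; returns how often the message was built.
    static int buildsAt(Level level) {
        Logger log = Logger.getLogger("lazy.demo." + level.getName());
        log.setUseParentHandlers(false); // keep the demo quiet on the console
        log.setLevel(level);
        BUILDS.set(0);
        log.fine(() -> "Report: " + expensiveReport()); // supplier runs only if FINE is loggable
        return BUILDS.get();
    }

    public static void main(String[] args) {
        System.out.println("Built at INFO: " + buildsAt(Level.INFO));
        System.out.println("Built at FINE: " + buildsAt(Level.FINE));
    }
}
```

The decision of whether to output still rests with the logging framework; the code only describes what the message would be.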

3) Do you really know the content of the log output?
For every log message you output, check carefully whether the final output could cause a problem. The most important thing is to avoid an NPE, as in the following:

log.debug("Processing request with id: {}", request.getId());



Can we guarantee that request is not null here? Besides NPEs, we may also need to consider whether a log statement could cause an OOM, an out-of-bounds access, thread starvation (logging is synchronous), a lazy initialization exception, a disk filled up by logs, and so on. Another problem is logging collections. Sometimes the collection we output has been retrieved from the database by Hibernate, as in the following log statement:

log.debug("Returning users: {}", users);



The best way to handle this is to log only the ids of the domain objects or the size of the collection. Java deserves a few complaints here: walking a collection and calling getId on each element is cumbersome. In Groovy it is very simple (users*.id), but in Java the Commons BeanUtils toolkit can help:

log.debug("Returning user ids: {}", collect(users, "id"));



The implementation of the collect method here is as follows:

public static Collection collect(Collection collection, String propertyName) {
    return CollectionUtils.collect(collection, new BeanToPropertyValueTransformer(propertyName));
}



But unfortunately, after a patch was proposed to Commons Beanutils (BEANUTILS-375 https://issues.apache.org/jira/browse/BEANUTILS-375 ), it was not accepted :(
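
On Java 8 and later, the stream API offers a way to do the same thing without Commons BeanUtils. This is a sketch, not part of the original article: the User class here is a hypothetical stand-in for a domain object, and ids() is an illustrative helper name.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Sketch: log only the ids of domain objects, using streams instead of BeanUtils.
class UserIdsDemo {

    // Hypothetical domain class; only the id matters for logging.
    static final class User {
        private final long id;

        User(long id) {
            this.id = id;
        }

        long getId() {
            return id;
        }
    }

    // Extracts the ids so the log shows identifiers rather than whole objects.
    static List<Long> ids(List<User> users) {
        return users.stream().map(User::getId).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(new User(1), new User(7));
        System.out.println("Returning user ids: " + ids(users));
    }
}
```

Unlike the property-name string passed to collect(), the method reference is checked by the compiler.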

Finally, about the toString() method. To make logs easier to understand, it is best to provide a suitable toString() for every class; the ToStringBuilder utility class can help here. Arrays and some collection types need special care, because arrays use the default toString() and some collections do not implement toString() well. For arrays we can use the JDK's Arrays.deepToString() method ( http://docs.oracle.com/javase/6/docs/api/java/util/Arrays.html#deepToString%28java.lang.Object[]%29 ).
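
The difference is easy to see with a nested array. The class name ArrayLoggingDemo and the describe() helper are just illustrative names; Arrays.deepToString() is the JDK method mentioned above.

```java
import java.util.Arrays;

// Sketch: default array toString() versus Arrays.deepToString().
class ArrayLoggingDemo {

    // Renders nested arrays by content instead of by type name and hash code.
    static String describe(Object[] values) {
        return Arrays.deepToString(values);
    }

    public static void main(String[] args) {
        int[][] grid = {{1, 2}, {3}};
        // Default toString(): a type name plus hash code, useless in a log.
        System.out.println("default:      " + grid);
        // deepToString(): the actual contents, including nested arrays.
        System.out.println("deepToString: " + describe(grid));
    }
}
```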

4) Beware of side effects of logging

Sometimes logging affects the behavior of the system to a greater or lesser degree. In one recent case, Hibernate threw a LazyInitializationException only under certain conditions. It turned out that a log statement had been causing a lazily initialized collection to be loaded while the session was still open. In environments where the log level was raised, that statement no longer ran, the collection was no longer initialized, and the exception appeared.

Another side effect is that logging makes the system run slower and slower. For example, improper use of toString() or string concatenation can cause performance problems. I once saw a system restart every 15 minutes, mainly because of thread starvation caused by log output. In general, if a system produces several hundred MB of logs within an hour, be careful.

It is even more serious if normal business logic is interrupted because of a problem in the logging itself. For example, it is best not to write code like the following:

try {
    log.trace("Id=" + request.getUser().getId() + " accesses " + manager.getPage().getUrl().toString());
} catch(NullPointerException e) {}
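
One safer alternative, sketched here under assumptions, is to compute the context defensively rather than swallowing the NullPointerException. The Request and User classes are hypothetical stand-ins for the types in the snippet above, and the "<unknown>" placeholder is an arbitrary choice for this example.

```java
import java.util.Optional;

// Sketch: derive log context without risking an NPE and without an empty catch block.
class SafeContextDemo {

    // Hypothetical stand-in for the user type in the snippet above.
    static final class User {
        private final Integer id;

        User(Integer id) {
            this.id = id;
        }

        Integer getId() {
            return id;
        }
    }

    // Hypothetical stand-in for the request type in the snippet above.
    static final class Request {
        private final User user;

        Request(User user) {
            this.user = user;
        }

        User getUser() {
            return user;
        }
    }

    // Walks the chain step by step; any null along the way yields a placeholder.
    static String userId(Request request) {
        return Optional.ofNullable(request)
                .map(Request::getUser)
                .map(User::getId)
                .map(Object::toString)
                .orElse("<unknown>");
    }

    public static void main(String[] args) {
        System.out.println("Id=" + userId(new Request(new User(42))));
        System.out.println("Id=" + userId(null));
    }
}
```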




5) The log information should be concise and descriptive

Every log entry should include both a description and context. Consider the following logs:

log.debug("Message processed");
log.debug(message.getJMSMessageID());

log.debug("Message with id '{}' processed", message.getJMSMessageID());



The first entry has only a description, the second has only context, and the third is a complete log message. Consider also the following:

if(message instanceof TextMessage)
    //...
else
    log.warn("Unknown message type");




This warning does not include the actual message type, the message id, or any other information. It only says that something went wrong. What caused the problem? What was the context? We have no idea.

Another problem is adding inexplicable content to the log, the so-called "magic log". For example, some programmers type a string of characters such as "&&&!#" into the log to help them locate it.

The content of a log file should be easy to read, clear, and descriptive. Instead of inexplicable characters, the log should show what the current operation did and what data it used. A good log should be regarded as part of the documentation and comments.

Finally, remember never to log passwords or personal private information!

6) Use the log pattern correctly

The log pattern lets us add clear context information to every log entry. But be careful about what you add. For example, if you roll to a new file every hour, the log file name already contains date and hour information, so there is no need to repeat it in each entry. Likewise, in a multi-threaded environment, do not write the thread name into log messages yourself, because it can be configured in the pattern.

In my experience, an ideal log pattern would contain the following information:

  • Current time (without the date, accurate to milliseconds)
  • Log level (if you care about it)
  • Thread name
  • Simple logger name (not the fully qualified one)
  • The log message



For example, a logback configuration like the following:

<appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
        <pattern>%d{HH:mm:ss.SSS} %-5level [%thread][%logger{0}] %m%n</pattern>
    </encoder>
</appender>



Never include the following in the log pattern:

  • file name
  • Class name (I think this should be the fully qualified name)
  • code line number



The following should also be avoided:

log.info("");



This happens when a programmer knows that the line number is included in the log pattern, and uses the line number printed in the log to tell whether a particular method was called (for example, the line may be in the authenticate() method, from which he concludes that the user has passed authentication). In addition, be aware that including the class name, method name, and line number in the log pattern causes serious performance problems. Here is a simple test I ran, with the following configuration:

<appender name="CLASS_INFO" class="ch.qos.logback.core.OutputStreamAppender">
    <encoder>
        <pattern>%d{HH:mm:ss.SSS} %-5level [%thread][%class{0}.%method\(\):%line][%logger{0}] %m%n</pattern>
    </encoder>
    <outputStream class="org.apache.commons.io.output.NullOutputStream"/>
</appender>
<appender name="NO_CLASS_INFO" class="ch.qos.logback.core.OutputStreamAppender">
    <encoder>
        <pattern>%d{HH:mm:ss.SSS} %-5level [%thread][LoggerTest.testClassInfo\(\):30][%logger{0}] %m%n</pattern>
    </encoder>
    <outputStream class="org.apache.commons.io.output.NullOutputStream"/>
</appender>



Here is the test code:

import org.junit.Test;
import org.perf4j.StopWatch;
import org.perf4j.slf4j.Slf4JStopWatch;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
 
public class LoggerTest {
 
    private static final Logger log = LoggerFactory.getLogger(LoggerTest.class);
    private static final Logger classInfoLog = LoggerFactory.getLogger("CLASS_INFO");
    private static final Logger noClassInfoLog = LoggerFactory.getLogger("NO_CLASS_INFO");
 
    private static final int REPETITIONS = 15;
    private static final int COUNT = 100000;
 
    @Test
    public void testClassInfo() throws Exception {
        for (int test = 0; test < REPETITIONS; ++test)
            testClassInfo(COUNT);
    }
 
    private void testClassInfo(final int count) {
        StopWatch watch = new Slf4JStopWatch("Class info");
        for (int i = 0; i < count; ++i)
            classInfoLog.info("Example message");
        printResults(count, watch);
    }
 
    @Test
    public void testNoClassInfo() throws Exception {
        for (int test = 0; test < REPETITIONS; ++test)
            testNoClassInfo(COUNT * 20);
    }
 
    private void testNoClassInfo(final int count) {
        StopWatch watch = new Slf4JStopWatch("No class info");
        for (int i = 0; i < count; ++i)
            noClassInfoLog.info("Example message");
        printResults(count, watch);
    }
 
    private void printResults(int count, StopWatch watch) {
        log.info("Test {} took {}ms (avg. {} ns/log)", new Object[]{
                watch.getTag(),
                watch.getElapsedTime(),
                watch.getElapsedTime() * 1000 * 1000 / count});
    }
 
}



In the test code above, the CLASS_INFO logger outputs 1.5 million messages (15 × 100,000), while NO_CLASS_INFO outputs 30 million (15 × 2,000,000). The latter replaces the dynamic class information in the log pattern with static text.

As the comparison shows, hard-coding the class name in the log pattern is about 13 times faster than obtaining the class information dynamically (average time per log statement: 8.8 vs 115 microseconds). For a Java programmer, 100 microseconds per log statement is acceptable: it means that a background application producing 100 log statements per second spends about 1% of its time on logging. So we sometimes need to weigh whether 100 log statements per second is reasonable.
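
The two figures quoted above follow from simple arithmetic, written out here for convenience using only the numbers given in the text:

```latex
\frac{115\ \mu\text{s}}{8.8\ \mu\text{s}} \approx 13,
\qquad
100\ \tfrac{\text{logs}}{\text{s}} \times 100\ \mu\text{s}
  = 10\,000\ \mu\text{s}
  = 10\ \text{ms per second}
  = 1\%
```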


The last thing to mention is a more advanced feature of logging frameworks: the Mapped Diagnostic Context. MDC ( http://www.slf4j.org/api/org/slf4j/MDC.html ) is essentially a thread-local map that simplifies the management of context parameters. You can put any key-value pairs into this map, and they are then output with every subsequent log statement from the current thread as part of the pattern.
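
The idea behind MDC can be sketched with a plain ThreadLocal. This is a simplified stand-in, not SLF4J's MDC class: SimpleMdc, its format() method, and the "user" key are hypothetical names for illustration. In Logback, a pattern element such as %X{user} plays the role that format() plays here.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for a mapped diagnostic context; NOT the SLF4J MDC implementation.
class SimpleMdc {

    // Each thread gets its own map, so context never leaks between threads.
    private static final ThreadLocal<Map<String, String>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    static void put(String key, String value) {
        CONTEXT.get().put(key, value);
    }

    static String get(String key) {
        return CONTEXT.get().get(key);
    }

    // Should be called when the unit of work ends, e.g. at the end of a request.
    static void clear() {
        CONTEXT.remove();
    }

    // Prefixes the message with a context value, as the log pattern would.
    static String format(String message) {
        return "[user=" + get("user") + "] " + message;
    }

    // Reads a key from a freshly started thread to show the isolation.
    static String readInNewThread(String key) {
        String[] seen = new String[1];
        Thread reader = new Thread(() -> seen[0] = get(key));
        reader.start();
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        put("user", "alice");
        System.out.println(format("Order placed"));
        System.out.println("Seen by another thread: " + readInNewThread("user"));
        clear();
    }
}
```

Because the context is per thread, it has to be cleared when a pooled thread finishes its work, otherwise stale values show up in unrelated log entries.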

7) Log method arguments and return values

When we find a bug during development, we usually step through the code in a debugger until we locate the problem (exposing it with a failing unit test is even better ^_^). But what if the circumstances do not allow debugging, for example a bug that appeared on a customer's system a few days ago? Without detailed logs, can you find the source of the problem?

If you log the input and output (arguments and return values) of each method according to a few simple rules, you can basically throw away the debugger. Of course, logging should be added to methods with good judgment: places that access external resources (such as databases), block, wait, and so on are the ones to consider. For example:

public String printDocument(Document doc, Mode mode) {
    log.debug("Entering printDocument(doc={}, mode={})", doc, mode);
    String id = //Lengthy printing operation
    log.debug("Leaving printDocument(): {}", id);
    return id;
}



Because logs are written before and after the method call, we can easily spot performance problems in the code, and even find serious problems such as deadlocks and thread starvation: in those cases there is an entering log but no leaving log. If method and class names are well chosen, the output also reads very nicely. And because the logs let you fully understand how the system is operating, analyzing problems becomes easier. To reduce logging code, you can also use simple AOP for this kind of output. But be careful: this approach may generate a huge amount of logs.

This kind of log generally uses the DEBUG or TRACE level. When a method is called very frequently, the volume of log output will affect system performance; in that case we can raise the log level of the relevant classes or simply remove the log statement. In general, however, it is better to log more than less. Treat logging like unit testing: just as unit tests should cover the whole execution of a method, so should the logs. A system without logs is unimaginable, because observing the log output is how we know whether the system is running correctly or has hung.
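
The AOP idea mentioned above can be approximated with a JDK dynamic proxy. This is a rough sketch under assumptions, not a replacement for a real AOP framework: the Printer interface and withLogging() helper are hypothetical, lines are collected in a list instead of being sent to a logger, and exceptions thrown by the target are not unwrapped.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: entering/leaving logs added around every interface method via a JDK proxy.
class EntryExitLoggingDemo {

    // Hypothetical service interface; JDK proxies only work on interfaces.
    interface Printer {
        String printDocument(String doc, String mode);
    }

    // Collected "log lines"; a real application would call its logger at DEBUG/TRACE instead.
    static final List<String> LINES = new ArrayList<>();

    // Wraps the target so each call records its arguments and its return value.
    @SuppressWarnings("unchecked")
    static <T> T withLogging(Class<T> type, T target) {
        InvocationHandler handler = (proxy, method, args) -> {
            LINES.add("Entering " + method.getName() + Arrays.toString(args));
            Object result = method.invoke(target, args);
            LINES.add("Leaving " + method.getName() + "(): " + result);
            return result;
        };
        return (T) Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] {type}, handler);
    }

    public static void main(String[] args) {
        Printer printer = withLogging(Printer.class, (doc, mode) -> "job-1");
        printer.printDocument("report.pdf", "DRAFT");
        LINES.forEach(System.out::println);
    }
}
```

An entering line without a matching leaving line is exactly the symptom of a hang or deadlock described above.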

8) Checking External Systems with Logs

When a system has to integrate with several other systems, diagnosing and analyzing problems can be very difficult. For example, in a recent project we logged the complete message data (including SOAP and HTTP headers) of the Apache CXF web services, which made us very happy during the integration and system testing phases.

If multiple systems are integrated through an ESB, you can log requests and responses on the bus. See, for example, Mule's ( http://www.mulesoft.org/ ) <log-component/> ( http://www.mulesoft.org/documentation/display/MULE2USER/Configuring+Components ).

Sometimes the volume of logs produced by communicating with external systems is unacceptable. On the other hand, during testing, or for a short time when the system has a problem, we want to turn full logging on temporarily. To balance log output against system performance, we need to use log levels carefully. For example:

Collection<Integer> requestIds = //...
 
if(log.isDebugEnabled())
    log.debug("Processing ids: {}", requestIds);
else
    log.info("Processing ids size: {}", requestIds.size());




In the code above, if the log level is configured as DEBUG, the full contents of requestIds are output; if it is configured as INFO, only the number of requestIds is output. But, like the side effects of logging mentioned earlier, if requestIds is null a NullPointerException is thrown at the INFO level.
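
One way to avoid that NullPointerException is to build the message in a small null-safe helper. This is a sketch: describe() is a hypothetical helper, the debugEnabled flag stands in for log.isDebugEnabled(), and the wording used for the null case is an arbitrary choice.

```java
import java.util.Arrays;
import java.util.Collection;

// Sketch: level-dependent detail without dereferencing a null collection.
class RequestIdsLogDemo {

    // Builds the message for either level; a null collection is reported, not dereferenced.
    static String describe(Collection<Integer> requestIds, boolean debugEnabled) {
        if (requestIds == null) {
            return "Processing ids: none (collection was null)";
        }
        return debugEnabled
                ? "Processing ids: " + requestIds
                : "Processing ids size: " + requestIds.size();
    }

    public static void main(String[] args) {
        System.out.println(describe(Arrays.asList(1, 2, 3), true));
        System.out.println(describe(Arrays.asList(1, 2, 3), false));
        System.out.println(describe(null, false));
    }
}
```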

9) Log exceptions correctly

The first principle is not to log exceptions yourself, but to let the framework or container handle them. Logging exceptions should not be the job of application code. Many programmers log the exception and then either return a default value (null, 0, or an empty string) or wrap the exception and rethrow it. For example:

log.error("IO exception", e);
throw new MyCustomException(e);



As a result, the exception may be printed twice (once where it is thrown, and again where it is caught and handled).

But sometimes we really do want to log an exception. How should we do it? Consider the following handling of an NPE:

try {
    Integer x = null;
    ++x;
} catch (Exception e) {
    log.error(e);        //A
    log.error(e, e);        //B
    log.error("" + e);        //C
    log.error(e.toString());        //D
    log.error(e.getMessage());        //E
    log.error(null, e);        //F
    log.error("", e);        //G
    log.error("{}", e);        //H
    log.error("{}", e.getMessage());        //I
    log.error("Error reading configuration file: " + e);        //J
    log.error("Error reading configuration file: " + e.getMessage());        //K
    log.error("Error reading configuration file", e);        //L
}



In the code above, only G and L output the exception correctly. A and B do not even compile with SLF4J, and the others either lose the stack trace or print inappropriate information. Just remember one rule: when logging an exception, the first argument must be a string, normally a description of the problem rather than the exception message (which already appears in the stack trace), and the second argument is the exception instance itself.

Note: exceptions thrown by remotely invoked services must be serializable; otherwise a NoClassDefFoundError is thrown on the client side and the real exception is hidden.
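
The difference between concatenating the exception into the message and passing it as a separate argument can be observed with the JDK's java.util.logging, used here only as a stand-in for SLF4J. CapturingHandler and logBothWays() are hypothetical names for this sketch; Logger.log(Level, String, Throwable) is the standard JDK method.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Sketch: what the logging framework receives in the "text only" and "message + exception" styles.
class ExceptionLoggingDemo {

    // Captures records in memory so we can inspect what the logger actually received.
    static final class CapturingHandler extends Handler {
        final List<LogRecord> records = new ArrayList<>();

        @Override
        public void publish(LogRecord record) {
            records.add(record);
        }

        @Override
        public void flush() {
        }

        @Override
        public void close() {
        }
    }

    static List<LogRecord> logBothWays(Exception e) {
        Logger log = Logger.getLogger("exception.demo");
        log.setUseParentHandlers(false); // keep the demo quiet on the console
        CapturingHandler handler = new CapturingHandler();
        log.addHandler(handler);
        try {
            // Like variant J: the exception becomes plain text, the stack trace is lost.
            log.severe("Error reading configuration file: " + e);
            // Like variant L: description first, exception instance as a separate argument.
            log.log(Level.SEVERE, "Error reading configuration file", e);
        } finally {
            log.removeHandler(handler);
        }
        return handler.records;
    }

    public static void main(String[] args) {
        for (LogRecord record : logBothWays(new NullPointerException())) {
            System.out.println(record.getMessage() + " | has exception: " + (record.getThrown() != null));
        }
    }
}
```

Only the second record carries the Throwable, so only it can be rendered with a full stack trace by an appender.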

10) Make the log easy to read and parse

Those interested in logs fall into two categories:

  • People (such as programmers)
  • Machines (shell scripts written by system administrators)



Log content must serve both groups. To quote Uncle Bob's book "Clean Code" ( http://www.amazon.com/Clean-Code-Handbook-Software-Craftsmanship/dp/0132350882 ): logs should be as readable and easy to understand as code.

On the other hand, if a system outputs hundreds of MB or even gigabytes of logs per hour, we need grep, sed, and awk to analyze them. Where possible, logs should be understandable by both humans and machines. For example, avoid formatting numbers, and use patterns that are easy to match with regular expressions. If that is not possible, output the data in two formats, as follows:

log.debug("Request TTL set to: {} ({})", new Date(ttl), ttl);
// Request TTL set to: Wed Apr 28 20:14:12 CEST 2010 (1272478452437)
 
final String duration = DurationFormatUtils.formatDurationWords(durationMillis, true, true);
log.info("Importing took: {}ms ({})", durationMillis, duration);
//Importing took: 123456789ms (1 day 10 hours 17 minutes 36 seconds)



The logs above serve both the computer ("ms after 1970 epoch") and people ("1 day 10 hours 17 minutes 36 seconds"). As an aside, here is a plug for DurationFormatUtils ( http://commons.apache.org/lang/api-release/org/apache/commons/lang/time/DateFormatUtils.html ), a very nice tool :)
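
If pulling in Commons Lang is not an option, the dual format can be approximated with plain arithmetic. This is a sketch only: DualFormatDemo and its helpers are hypothetical names, and unlike DurationFormatUtils.formatDurationWords with its suppression flags, this version always prints all four units.

```java
// Sketch: log a duration both as raw milliseconds (for machines) and in words (for people).
class DualFormatDemo {

    // Adds a plural "s" unless the value is exactly 1.
    static String unit(long value, String name) {
        return value + " " + name + (value == 1 ? "" : "s");
    }

    static String importTook(long millis) {
        long totalSeconds = millis / 1000;
        long days = totalSeconds / 86_400;
        long hours = (totalSeconds % 86_400) / 3_600;
        long minutes = (totalSeconds % 3_600) / 60;
        long seconds = totalSeconds % 60;
        String words = unit(days, "day") + " " + unit(hours, "hour") + " "
                + unit(minutes, "minute") + " " + unit(seconds, "second");
        // The raw number stays first and unformatted so grep/awk can pick it up easily.
        return "Importing took: " + millis + "ms (" + words + ")";
    }

    public static void main(String[] args) {
        System.out.println(importTook(123456789L));
    }
}
```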

