Java Logging Series (2): Conventions and Precautions for Using Java Logs

In the previous article, "Java Logging Series (1): The Mainstream Logging Frameworks Log4j, Log4j 2, JUL, Commons Logging, Slf4j, and Logback Explained", the author introduced the commonly used logging frameworks. Continuing the topic, this article uses concrete examples to introduce how logs should be used.

1. Log format and level

When using a logging framework, you can customize the log output format, log level, and other settings in the logging configuration file according to the needs of the application. A sample logback.xml configuration is shown below, with the key entries explained in comments. Usage examples for other logging frameworks are introduced in the next section.

<?xml version="1.0" encoding="UTF-8"?>
<configuration debug="false">
<!-- Define the log file storage location; do not use relative paths in the Logback configuration -->
<property name="LOG_HOME" value="/home" />
<!-- Console output -->
<appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
<encoder class="ch.qos.logback.classic.encoder.PatternLayoutEncoder">
<!-- Output format: %d is the date, %thread the thread name, %-5level the level padded to 5 characters, %msg the log message, %n a line separator -->
<pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] %-5level %logger{50} - %msg%n</pattern>
</encoder>
</appender>
<!-- Roll over to a new log file every day -->
<appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
<rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
<!-- Name pattern of the rolled log files; %i distinguishes files split within one day -->
<fileNamePattern>${LOG_HOME}/TestWeb.%d{yyyy-MM-dd}.%i.log</fileNamePattern>
<!-- Number of days to keep log files -->
<maxHistory>30</maxHistory>
<!-- Maximum size of a single log file -->
<maxFileSize>10MB</maxFileSize>
</rollingPolicy>
<encoder class="ch.qos.logback.classic.encoder.PatternLayoutEncoder">
<!-- Output format: %d is the date/time, %thread the thread name, %-5level the level padded to 5 characters, %logger{50} the logger name abbreviated to at most 50 characters (split on dots), %msg the log message, %n a line separator -->
<pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] %-5level %logger{50} - %msg%n</pattern>
</encoder>
</appender>

<!-- Log output level -->
<root level="INFO">
<appender-ref ref="STDOUT" />
<appender-ref ref="FILE" />
</root>
</configuration>

1.1 Log naming and retention period

Logging conventions are also covered in the publicly available Alibaba Java Development Manual. Here the author introduces two of them: the log naming convention and the log retention period.

  • Log naming convention:
appName_logType_logName.log

Here, appName is the application name; logType is the log type, with recommended categories such as stats, monitor, and visit; logName is a description of the log. The advantage of this naming is that the type and purpose of a log file can be understood quickly and clearly, which makes classification and searching easy.

  • Log retention period:

Determining the log retention period is a tricky problem. If logs are stored for too long, they consume a lot of storage and may even put excessive pressure on the disk and affect system stability; if logs are stored for too short a time, log data may be "lost" and problems become impossible to trace back. The Alibaba Java Development Manual recommends keeping log files for at least 15 days. In practice, you can adjust the period according to the importance of the log files, their size, and the available disk space. In addition, logs can be monitored and archived periodically.
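The naming convention above can be captured in a small helper; the application and type names in the example are made up for illustration:

```java
public class LogFileNames {
    /** Composes a log file name following the appName_logType_logName.log convention. */
    static String logFileName(String appName, String logType, String logName) {
        return appName + "_" + logType + "_" + logName + ".log";
    }

    public static void main(String[] args) {
        // e.g. a monitoring log that records timeouts in a hypothetical "pay" application
        System.out.println(logFileName("pay", "monitor", "timeout")); // pay_monitor_timeout.log
    }
}
```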

1.2 Log level

Logging priorities are usually divided into OFF, FATAL, ERROR, WARN, INFO, DEBUG, and ALL, plus any custom levels. Log4j recommends using only four of them; from high to low priority they are ERROR, WARN, INFO, and DEBUG. By setting the log level in the logging framework's configuration file, you can switch the corresponding levels of log output on or off in the application. For example, when the level is configured as INFO, only logs at or above that level are processed, and all DEBUG-level log statements in the application are not printed. Note that the log level is not only about the "level of detail"; it is also related to the applicable scenario, the intended audience, and so on. Common log levels are described as follows:

  • ALL : Print all logs.
  • OFF : Turn off all log output.
  • ERROR : error information, including the error type, content, location and scenario, whether it is recoverable, and so on. Output at this level only when the error affects the normal operation of the system.
  • WARN : a reminder that although the application is currently running normally, there are hidden risks.
  • INFO : Record the basic operating process, operating status, and key information of the system.
  • DEBUG : Detailed information about the system's running process and status, which can be used for debugging.
  • TRACE : Detailed information about the system structure and content, such as the contents of some key objects, function call parameters, results, etc.
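The threshold behavior described above can be demonstrated with the JDK's built-in java.util.logging, whose FINE level roughly corresponds to DEBUG; this is only an illustration of level filtering, not part of the Logback configuration shown earlier:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class LevelDemo {
    /** Returns true if a logger whose threshold is INFO would process a record at the given level. */
    static boolean processedAtInfo(Level level) {
        Logger logger = Logger.getLogger("level-demo");
        logger.setLevel(Level.INFO);
        return logger.isLoggable(level);
    }

    public static void main(String[] args) {
        System.out.println("ERROR-like (SEVERE) processed: " + processedAtInfo(Level.SEVERE)); // true
        System.out.println("INFO processed:                " + processedAtInfo(Level.INFO));   // true
        System.out.println("DEBUG-like (FINE) processed:   " + processedAtInfo(Level.FINE));   // false
    }
}
```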

1.3 Log format

You can control the content and style of the log output by configuring the log output format. Taking the logback.xml configuration above as an example, the pattern tag defines the output format of the log. The default conversion words are described below.

  • %d : the date/time of the log event, e.g. %d{yyyy-MM-dd HH:mm:ss.SSS}
  • %thread : the name of the thread that generated the log event
  • %-5level : the log level, left-aligned and padded to 5 characters
  • %msg : the log message, i.e. the message specified in the logging call
  • %n : a platform-dependent line separator, "\r\n" on Windows and "\n" on Unix

In addition to the default format parameters above, Logback also supports custom conversion words. For example, to print the IP address in every log line, proceed as follows:

  • Step 1: Create a class IPLogConfig that extends ClassicConverter and override the convert method to obtain the IP address.
import java.net.InetAddress;
import java.net.UnknownHostException;

import ch.qos.logback.classic.pattern.ClassicConverter;
import ch.qos.logback.classic.spi.ILoggingEvent;

public class IPLogConfig extends ClassicConverter {

    @Override
    public String convert(ILoggingEvent event) {
        try {
            return InetAddress.getLocalHost().getHostAddress();
        } catch (UnknownHostException e) {
            e.printStackTrace();
        }
        return null;
    }
}
  • Step 2: Register the converter in the logback.xml configuration.
<!-- Configure the location of the rule class -->
<conversionRule conversionWord="ip" converterClass="com.test.conf.IPLogConfig" />
<appender name="Console" class="ch.qos.logback.core.ConsoleAppender">
   <layout>
        <pattern>[ip=%ip] %d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] %-5level %logger{50} - %msg%n</pattern>
   </layout>
</appender>

The output format configuration of Log4j differs from that of Logback. Its main conversion characters are as follows:

  • %m : the message specified in the logging call
  • %p : the priority, i.e. DEBUG, INFO, WARN, ERROR, or FATAL
  • %r : the number of milliseconds elapsed from application startup until the log event
  • %c : the category of the log event, usually the fully qualified class name
  • %t : the name of the thread that generated the log event
  • %n : a platform-dependent line separator, "\r\n" on Windows and "\n" on Unix
  • %d : the date/time of the log event; the default format is ISO 8601, and a custom format can be appended, e.g. %d{yy MMMM dd HH:mm:ss,SSS}
  • %l : the location where the log event occurred, including the category name, the thread, and the line number in the code
  • %F : the name of the source file in which the log message was issued
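Putting these conversion characters together, a minimal Log4j 1.x properties configuration might look like the following sketch (the appender name "console" is illustrative):

```properties
# Root logger at INFO, writing to a console appender
log4j.rootLogger=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
# %d date, %t thread, %-5p level, %c category, %m message, %n line separator
log4j.appender.console.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} [%t] %-5p %c - %m%n
```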

2. Precautions for using logs

This section uses examples to introduce the basic points to pay attention to when using logs, such as recording exceptions, recording object instances, log monitoring, and log classification.

2.1 How to record exception logs

When logging an exception, be sure to output the exception stack trace. Without the complete stack trace, once an exception occurs in the application it is difficult for maintainers to locate the problem. Example: an application splits its processing chain into function points, each of which throws an exception on failure; the exception logs are then classified and recorded at the service entry point.

try {
    this.startOrderProcess(request, result, processName);
} catch (ProductBizException e) {
    if (CollectionUtils.isNotEmpty(e.getErrorMessages())) {
        e.getErrorMessages().forEach(errorMessage ->
            log.error("biz process error: {}", errorMessage, e));
    }
} catch (ProductSystemException ex) {
    log.error("system error: " + ex.getMessage(), ex);
} catch (TMFRuntimeException e) {
    ErrorMessage errorMessage = e.getErrorMessage();
    if (null != errorMessage) {
        log.error("tmf runtime error: " + e.getMessage(), e);
    }
}
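The difference the stack trace makes can be seen with plain JDK classes: recording only e.getMessage() keeps a one-line description, while the full trace (what log.error("...", e) ultimately writes) pinpoints the failure. A minimal sketch, with illustrative names:

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class StackTraceDemo {
    /** Renders the full stack trace of a throwable as a string. */
    static String fullStack(Throwable t) {
        StringWriter sw = new StringWriter();
        t.printStackTrace(new PrintWriter(sw));
        return sw.toString();
    }

    public static void main(String[] args) {
        Exception e = new IllegalStateException("order not found");
        // Message only: no clue where the error came from
        System.out.println(e.getMessage());
        // Full stack: exception class, message, and the call chain line by line
        System.out.println(fullStack(e));
    }
}
```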

2.2 How to record object instances

If you output an object instance in a log, make sure the instance's class overrides the toString method; otherwise only the object's hashCode is printed, which is meaningless. Alternatively, the object's properties can be obtained via Java reflection; the main benefit is that the toString method does not have to be updated when attributes are added or modified. Fastjson is often used to serialize the object to JSON for the same purpose.

Example: in a certain project, a debug-level log needs to record the request parameters of the service caller, so the toString method is overridden on the ProductQueryRequest object to obtain the complete object information.

// Use Slf4j placeholder syntax to record the debug information
logger.debug("query request: {}", productQueryRequest);

// The ProductQueryRequest object overrides the toString method
@Override
public String toString() {
    return "ProductQueryRequest{" +
        "queryOption=" + queryOption +
        ", productIds=" + productIds +
        ", MemberId=" + MemberId +
        '}';
}
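The reflection-based alternative mentioned above can be sketched as follows; the helper walks the declared fields, so new attributes show up in logs without touching toString (the QueryRequest class is a made-up example):

```java
import java.lang.reflect.Field;
import java.util.StringJoiner;

public class ReflectiveToString {
    /** Builds a "ClassName{field=value, ...}" string from an object's declared fields. */
    static String describe(Object obj) {
        StringJoiner joiner = new StringJoiner(", ",
                obj.getClass().getSimpleName() + "{", "}");
        for (Field field : obj.getClass().getDeclaredFields()) {
            field.setAccessible(true);
            try {
                joiner.add(field.getName() + "=" + field.get(obj));
            } catch (IllegalAccessException e) {
                joiner.add(field.getName() + "=?");
            }
        }
        return joiner.toString();
    }

    // A hypothetical request object that does not override toString
    static class QueryRequest {
        int queryOption = 1;
        String productId = "P100";
    }

    public static void main(String[] args) {
        System.out.println(describe(new QueryRequest()));
    }
}
```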

2.3 How to classify log records

It is recommended that logs be classified by function into monitoring logs, statistics logs, and access logs.

  • Monitoring log : monitoring logs here do not include statistical information, although both statistical and access information can also be configured as monitoring input. Monitoring logs deserve the most attention from developers, because they directly affect the stability of the system and the difficulty of operations. Typical monitoring logs include: request entry and exit; external service calls and returns; resource-consuming operations, such as reading and writing files; fault-tolerance behaviors, such as cloud disk replica repair; program exceptions, such as failure to connect to the database; background operations, such as periodically executed cleanup threads; and startup, shutdown, configuration loading, and so on.
  • Statistics log : user access statistics, such as user IP, uploaded and downloaded data volume, request QPS, request RT, and so on. Statistics logs need a strict format to make aggregation easy. In practice, the log format should be designed around the specific log analysis platform to facilitate statistical analysis of the data.
  • Access log : this type of log is generally recorded directly at the nginx layer, such as the application's access.log, and its data format is largely standardized. A log analysis platform can also be used to run statistical analysis over access logs.

2.4 How to determine log level

In practice, the four commonly used log levels are debug, info, warn, and error. How, then, do you decide which level a log statement belongs to?

  • error : record important error information (note: a failed parameter check is not an error). In general, the occurrence of an exception can be treated as error level, for example when the user-information service fails to return user information, reading a file throws an IOException, or a function module fails during execution.
  • warn : record important prompt information, such as failed request parameter validation, unimportant exceptions, unexpected conditional branches (the business process does not meet expectations), an empty result returned by a requested service (a potential risk), and so on.
  • info : record information that can be used for business statistics, monitoring, and locating common problems, such as system state change logs, the core processing and key actions of a business process, and business process state changes.
  • debug : record debugging information, such as requests and responses. Usually the debug log is enabled during development and the early stage after launch; as the system stabilizes, the debug switch is turned off and only re-enabled on demand when hard problems arise.

2.5 How to monitor log data

By monitoring keywords in the logs, system faults can be discovered and alarmed on in time, which is crucial for operations. Service monitoring and alarming is a big topic; this section only introduces some points that deserve attention in log monitoring and alarming:

  • Do not alarm when it is not necessary. Only errors that require immediate operational action should trigger an alarm. The reason is to avoid long-term alarm harassment that desensitizes operations staff to alarm calls and eventually turns into a "crying wolf" story;
  • Use a clear alarm keyword, such as ERROR, instead of a set of complex rules. The reason is that log monitoring is essentially continuous string matching; too many complex rules may affect online services;
  • For early-warning situations, such as a service call that only succeeds after several retries or a user quota that is almost exhausted, feedback can be given through a daily alarm email;
  • Every time a system failure occurs, promptly check whether the log alarm was sensitive enough and whether its description was accurate, and keep optimizing the log alarms.
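Keyword-based log monitoring is, as noted, a plain substring match over log lines. A minimal sketch of such a scanner (names are illustrative; a real system would tail log files or a log stream):

```java
import java.util.List;
import java.util.stream.Collectors;

public class LogAlarmScanner {
    /** Returns the lines that contain the alarm keyword (a plain substring match, as recommended). */
    static List<String> matchAlarms(List<String> logLines, String keyword) {
        return logLines.stream()
                .filter(line -> line.contains(keyword))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "2024-01-01 10:00:00 INFO  order accepted",
                "2024-01-01 10:00:01 ERROR db connection refused",
                "2024-01-01 10:00:02 WARN  retrying call");
        // Only the ERROR line should trigger an alarm
        System.out.println(matchAlarms(lines, "ERROR"));
    }
}
```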

Monitoring configuration specifications:

  • Coverage : monitoring is usually required to cover 100% of major problems, failures, financial losses, and user complaints.
  • Hierarchical monitoring : monitoring should cover system monitoring, JVM monitoring, key middleware monitoring, cluster and link monitoring, monitoring of upstream and downstream dependencies, and monitoring of the business itself.
  • Multi-dimensional analysis : monitoring forms include automatically running full business monitoring in offline and pre-release environments, short-cycle scheduled single-machine monitoring of important online functions, periodic automated monitoring of all online functions, real-time monitoring of large-scale indicators such as online traffic error rates, and offline analysis of business indicator dashboards.
  • False alarm rate : after monitoring is configured, follow up on the data and set appropriate warning thresholds; after warnings are configured, keep optimizing until false alarms no longer occur.

2.6 Log file size

Log files should not be too large: an oversized log file reduces the efficiency of log monitoring and problem location, so log files need to be split. Whether to split by day or by hour can be decided based on the log volume; the principle is to make it easy for developers and operations staff to find logs quickly. The following configuration limits a single log file to 20 MB, splits files by day, and retains the last 15 days of data.

<rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
       <fileNamePattern>${LOG_FILE}.%d{yyyy-MM-dd}.%i.log</fileNamePattern>
       <maxHistory>15</maxHistory>
       <maxFileSize>20MB</maxFileSize>
       <totalSizeCap>20GB</totalSizeCap>
</rollingPolicy>

To prevent log files from filling the entire disk, they also need to be deleted regularly. For example, when a disk alarm is received, log files older than a week can be deleted or archived. In practice, log archiving/deletion should be automated: when system monitoring finds that disk usage exceeds a set threshold, files are archived or deleted based on the date carried in the log file name.
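An automated cleanup of this kind can be sketched with plain Java NIO; the method below merely selects .log files older than a cutoff, leaving the actual delete/archive action to the caller (the directory layout and .log suffix are assumptions for the example):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class LogCleaner {
    /** Returns the .log files directly under dir whose last-modified time is older than maxAgeDays. */
    static List<Path> expiredLogs(Path dir, int maxAgeDays) throws IOException {
        Instant cutoff = Instant.now().minus(maxAgeDays, ChronoUnit.DAYS);
        List<Path> expired = new ArrayList<>();
        try (Stream<Path> files = Files.list(dir)) {
            for (Path p : (Iterable<Path>) files::iterator) {
                if (Files.isRegularFile(p)
                        && p.getFileName().toString().endsWith(".log")
                        && Files.getLastModifiedTime(p).toInstant().isBefore(cutoff)) {
                    expired.add(p);
                }
            }
        }
        return expired;
    }
}
```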


Origin blog.csdn.net/Jin_Kwok/article/details/132796480