E-commerce summary: a log and monitoring system solution

    The monitoring system covers resource and performance monitoring of the server cluster, exception and performance monitoring of applications, log management, and other multi-dimensional performance monitoring and analysis. The importance of a complete monitoring and logging system needs little explanation: the stability of each system can only be guaranteed if its status is known in real time.

   

  As shown in the figure above, the monitoring platform covers a wide range of targets, from server performance and resources to the application systems themselves. Each company has its own platform requirements and solutions for unified monitoring, but the tasks and functions of a monitoring platform are basically the same.

 

  I. Logs

  Logs are an important way to monitor a running program. They serve two main purposes: 1. finding and locating bugs promptly, and 2. showing the running status of the program. Correct, detailed logs make it possible to locate problems quickly. Likewise, the log shows what the program is doing and whether it is executing as expected, so the running status of the program needs to be recorded as well. Accordingly, there are two types of logs: 1. exception logs and 2. operation logs.

  We mainly use log4net to persist the logs of each system to a database or to files, which supports later exception monitoring and performance analysis. How to integrate log4net is not covered here.

  Several principles of logging:

    1. Distinguish log levels clearly: decide whether each record is an error, a warning, info, and so on.

    2. Record where the error occurred. In a layered system, exceptions must be handled uniformly at one layer. For example, in our MVC architecture, exceptions are caught and handled in each action, while the business layer and database layer catch exceptions and throw them to the upper layer.

    3. Keep log information clear, accurate, and meaningful, and make it as detailed as possible: record the relevant system, module, time, operator, stack information, and so on, to facilitate subsequent processing.
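
    The three principles can be sketched in a few lines. The example below uses Python's standard logging module purely for illustration; it is not the log4net setup used on our platform, and the logger name, exception class, and function names are all hypothetical. The lower layers re-throw with added context, and only the top layer (the action) writes the log record, including the stack trace.

```python
import logging

# Timestamp, level, and logger name ("system.module") go into every record.
logging.basicConfig(format="%(asctime)s %(levelname)s %(name)s %(message)s")
logger = logging.getLogger("shop.order")  # hypothetical system.module name


class OrderError(Exception):
    """Hypothetical business-layer exception."""


def load_order(order_id):
    # Data layer: does not log, the exception simply propagates upward.
    raise KeyError(f"order {order_id} not found")


def get_order(order_id, operator):
    # Business layer: catch, add context (operator, id), re-throw to the upper layer.
    try:
        return load_order(order_id)
    except KeyError as exc:
        raise OrderError(f"operator={operator} order_id={order_id}") from exc


def order_action(order_id, operator):
    # Top layer (the action): the single place where exceptions are logged.
    try:
        return get_order(order_id, operator)
    except OrderError:
        # logger.exception logs at ERROR level and appends the stack trace.
        logger.exception("get_order failed")
        return None
```

    Calling order_action(42, "alice") returns None and writes one ERROR record that carries the time, the module, the operator, and the full chained stack trace.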

 

  II. Monitoring

  A monitoring system is a complex platform, and many open source products and platforms exist. Because our platform is small and has few monitoring tasks and requirements, we basically developed it ourselves. It covers five aspects: 1. system resources, 2. servers, 3. services, 4. application exceptions, 5. application performance.

  The specific architecture diagram is as follows:

  

  1. System resource monitoring

    Monitor the network parameters and related resources of each server (CPU, memory, disk reads and writes, network, access requests, etc.) to ensure that the server systems run safely, and provide an exception notification mechanism so that system administrators can quickly locate and resolve problems. At present, the more popular tool for this is probably Zabbix.
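
    For a sense of what the notification mechanism decides, here is a minimal Python sketch of a threshold check. It is not how Zabbix works internally; the threshold values and function names are assumptions made for illustration, and only disk usage is actually sampled (with the standard library's shutil.disk_usage).

```python
import shutil

# Hypothetical limits, expressed as the fraction of capacity in use.
THRESHOLDS = {"disk": 0.90, "memory": 0.85, "cpu": 0.80}


def disk_usage_fraction(path="."):
    """Fraction of the disk holding `path` that is currently in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def find_breaches(samples, thresholds=THRESHOLDS):
    """Return the sampled metrics whose value exceeds its threshold."""
    return {
        name: value
        for name, value in samples.items()
        if name in thresholds and value > thresholds[name]
    }
```

    A scheduled job could call find_breaches({"disk": disk_usage_fraction()}) and send a notification whenever the result is not empty.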

  2. Server monitoring

    Server monitoring mainly checks whether each server, network node, gateway, and other network device responds to requests normally.

    A scheduled service pings each network node regularly to confirm that the device is working; if any device is abnormal, a notification message is sent.
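
    A rough Python sketch of one round of such a check follows. Note the substitution: it tests reachability with a TCP connection attempt rather than an ICMP ping, because ICMP usually needs elevated privileges. The function names and the notify callback are hypothetical.

```python
import socket


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused connections, timeouts, and DNS failures all count as unreachable.
        return False


def check_nodes(nodes, notify):
    """Probe each (host, port) pair once and call notify for every failure."""
    down = [node for node in nodes if not is_reachable(*node)]
    for host, port in down:
        notify(f"node {host}:{port} is unreachable")
    return down
```

    The scheduled service would call check_nodes at a fixed interval, passing whatever sends the reminder (mail, SMS, chat message) as notify.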

  3. Service monitoring

    Service monitoring checks whether the platform's services, such as web services, image services, search engine services, and cache services, are running normally.

    The relevant services can be requested at regular intervals to confirm that they are operating normally.
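
    As an illustration, the Python sketch below requests each service once over HTTP and treats any 2xx answer as healthy. It assumes every service exposes an HTTP URL that can be probed; the function names and the idea of a dedicated health URL are assumptions, not a description of our actual services.

```python
import urllib.error
import urllib.request


def service_is_healthy(url, timeout=3.0):
    """Return True if a GET request to url answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # HTTP error statuses, refused connections, and timeouts count as failing.
        return False


def check_services(services, timeout=3.0):
    """Map each service name to True (healthy) or False (failing)."""
    return {name: service_is_healthy(url, timeout) for name, url in services.items()}
```

    Services that do not speak HTTP, such as a cache, would need their own probe, for example a connection attempt followed by a trivial command.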

  4. Application exception monitoring

    At present, the exception records of all systems on our platform are stored in the database. A scheduled service performs statistical analysis of the exception records within a given period. If it finds exceptions in important modules, such as payment or ordering, or exceptions that occur frequently, it immediately notifies the relevant personnel so that they can deal with them and keep the service running normally.
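
    The statistical step might look like the Python sketch below, which uses SQLite only to stay self-contained. The table name exception_log, its columns (module, message, created_at as an ISO 8601 string), the ten-minute window, and the per-module limits are all assumptions; our real schema is not shown in this article.

```python
import sqlite3
from datetime import datetime, timedelta

# Hypothetical limits: exceptions tolerated per watched module in one window.
LIMITS = {"payment": 3, "order": 5}


def modules_over_limit(conn, now, window=timedelta(minutes=10), limits=LIMITS):
    """Return {module: count} for watched modules that exceeded their limit."""
    since = (now - window).isoformat(timespec="seconds")
    rows = conn.execute(
        "SELECT module, COUNT(*) FROM exception_log "
        "WHERE created_at >= ? GROUP BY module",
        (since,),
    ).fetchall()
    # Keep only the important modules whose count is above the limit.
    return {module: n for module, n in rows if module in limits and n > limits[module]}
```

    The scheduled service would run this query every few minutes and notify the people responsible for each module that appears in the result.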

  5. Application performance monitoring

    Intercept and record program performance (SQL performance or program execution efficiency) at the API interfaces and other relevant points of each application. Important modules get performance warnings so that problems are detected early. The monitoring information is also aggregated and displayed to developers to support later performance analysis.
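
    One common way to intercept calls is a timing wrapper. The Python decorator below is a sketch of that idea, not our actual interceptor: the logger name "perf", the threshold parameter, and the slow_query function are hypothetical. Every call is recorded, and calls slower than the threshold are logged as warnings.

```python
import functools
import logging
import time

perf_log = logging.getLogger("perf")  # hypothetical logger name


def timed(threshold_ms):
    """Record how long the wrapped function takes; warn above threshold_ms."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs whether the call returned or raised, so failures are timed too.
                elapsed_ms = (time.perf_counter() - start) * 1000
                level = logging.WARNING if elapsed_ms > threshold_ms else logging.INFO
                perf_log.log(level, "%s took %.1f ms", func.__qualname__, elapsed_ms)
        return wrapper
    return decorator


@timed(threshold_ms=50)
def slow_query():
    time.sleep(0.1)  # stands in for a slow SQL call
    return "rows"
```

    Sending these records to the same store as the other logs makes it possible to aggregate them per interface and show the results to developers.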


Source: http://blog.csdn.net/haihongazar/article/details/52535750
