Real-time data analysis: a log monitoring and alerting system (Part 1)

1. Background

Good software should have the following characteristics:

1: Usefulness

Software exists to solve concrete problems. For example, the now-popular MVC frameworks arose because early, non-MVC development produced tightly coupled code that was costly and difficult to maintain and upgrade; OA systems were built to solve companies' office-workflow problems; and tools such as QQ and WeChat solve the problem of real-time communication between people who are far apart.

2: Stability

Once the software solves a real problem, the next pressing issue is its stability. An online system typically carries an enterprise's business, so the stability of the system directly determines whether that business can run properly.

3: Code standards

As the saying goes, the barracks are made of iron but the soldiers flow through: good software is more than a set of working features. The overall architecture, the division into functional modules, code comments, and extensibility all need attention, because over a software product's life cycle many people will touch the code and the original developers will move on. Well-standardized code is therefore especially valuable.

4: Compatibility across upgrades

If a piece of software works well in everyday use but becomes harder and harder to use with every upgrade, or its stability drops sharply after upgrading, it can hardly be considered good software.

5: Basic documentation

Documentation, even just a simple and effective user manual, is essential. It lets users clearly understand the software's functionality, architecture, design ideas, code, and so on.

 

2. Requirements analysis

As the company's business grows, more and more business systems need to be supported. To guarantee that the company's business runs normally, there is an urgent need to monitor the running state of these systems so that problems are discovered in time and their impact on the business is minimized.

The current systems fall into the following categories:

1) Tomcat-based web applications

2) Standalone Java applications

3) Scripts running on Linux

4) Large-scale cluster frameworks (ZooKeeper, Hadoop, Storm, SRP, ...)

5) Operating-system runtime logs (e.g. the output of top)

The main functional requirement is: monitor the content of each system's logs, filter the logs according to configured rules, and, when a problem is found, raise an alert via SMS and e-mail.

 

3. Functional analysis

Data input:

    Log data: a Flume client collects the log data of each system;

    Rule data: users enter the system name and its alert-trigger rules through a web page (a minimal sketch of such a rule record follows this list);

Data storage:

    The data collected by Flume is stored in a Kafka cluster.

Data computation:

    A Storm topology processes the logs; entries that match the filtering rules are sent out as e-mail and SMS alerts and saved to the database.

Data display:

    A management page shows the configured rules, along with details such as the person responsible for each system, contact information, and the triggered alerts.
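
To make the rule data concrete, here is a minimal sketch of what a trigger rule and its matching check might look like. The Rule class, its field names, and the keyword-based matches method are illustrative assumptions, not part of the original system; the real rule table and matching logic may differ.

// Hypothetical sketch of a trigger rule as entered on the management page
public class Rule {
    private String appId;        // which system the rule belongs to
    private String keyword;      // e.g. "Exception", "OutOfMemoryError"
    private String ownerPhone;   // SMS recipient
    private String ownerEmail;   // e-mail recipient

    public Rule(String appId, String keyword, String ownerPhone, String ownerEmail) {
        this.appId = appId;
        this.keyword = keyword;
        this.ownerPhone = ownerPhone;
        this.ownerEmail = ownerEmail;
    }

    /** A log line triggers this rule if it comes from the same system and contains the keyword. */
    public boolean matches(String logAppId, String logMessage) {
        return appId.equals(logAppId) && logMessage.contains(keyword);
    }

    public String getOwnerPhone() { return ownerPhone; }
    public String getOwnerEmail() { return ownerEmail; }
}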

 

4. Overall architecture design

The main architecture is Flume + Kafka + Storm + MySQL + Java Web. The data flow is as follows (a topology sketch follows this list):

1: Each application uses log4j to generate its logs.

2: A Flume client deployed alongside each application monitors the generated logs and sends them to the Kafka cluster.

3: A Storm spout pulls data from Kafka; a bolt checks every log line against the filtering rules one by one, and lines that match a rule produce alert messages.

4: Finally, the alert messages are saved to the MySQL database for management.
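
To make the data flow above concrete, here is a minimal sketch of how the Storm side could be wired together, assuming the storm-kafka connector (the classes under org.apache.storm.kafka) is available. LogMonitorTopology and LogFilterBolt are illustrative names, not code from the original project, and the bolt only hard-codes a single keyword instead of reading rules from MySQL:

package cn.itcast.storm;

import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.kafka.KafkaSpout;
import org.apache.storm.kafka.SpoutConfig;
import org.apache.storm.kafka.StringScheme;
import org.apache.storm.kafka.ZkHosts;
import org.apache.storm.spout.SchemeAsMultiScheme;
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Tuple;

public class LogMonitorTopology {

    // Hypothetical filter bolt: this sketch only looks for the word "Exception";
    // the real system would load the user-defined rules from MySQL and send SMS / e-mail alerts.
    public static class LogFilterBolt extends BaseBasicBolt {
        public void execute(Tuple input, BasicOutputCollector collector) {
            String line = input.getString(0);
            if (line != null && line.contains("Exception")) {
                System.err.println("ALERT: " + line);
            }
        }
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // no downstream bolts in this sketch
        }
    }

    public static void main(String[] args) throws Exception {
        // Spout: consume the topic that the Flume KafkaSink writes to
        ZkHosts zkHosts = new ZkHosts("hadoop01:2181");
        SpoutConfig spoutConfig = new SpoutConfig(zkHosts, "log_monitor", "/log_monitor", "log_monitor_spout");
        spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());

        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("kafka-spout", new KafkaSpout(spoutConfig), 3);
        builder.setBolt("filter-bolt", new LogFilterBolt(), 3).shuffleGrouping("kafka-spout");

        Config config = new Config();
        config.setNumWorkers(2);
        // Local test run; on a cluster StormSubmitter.submitTopology would be used instead
        new LocalCluster().submitTopology("log-monitor", config, builder.createTopology());
    }
}

In the real system the single bolt would likely be split further (rule loading, filtering, alert sending, persistence), but the wiring pattern stays the same.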

 

Flume design

Flume is a distributed, reliable, and highly available service for collecting, aggregating, and moving log data.

It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant, with failover and recovery mechanisms. It uses a simple, extensible data model that allows for online analytic applications.

a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /export/data/flume_source/click_log/info.log
a1.sources.r1.channels = c1
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = cn.itcast.flume.AppInterceptor$AppInterceptorBuilder
# Identifies which system the log belongs to
a1.sources.r1.interceptors.i1.appId = 1 

a1.channels.c1.type=memory
a1.channels.c1.capacity=10000
a1.channels.c1.transactionCapacity=100

a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.topic = log_monitor
a1.sinks.k1.brokerList = HADOOP01:9092
a1.sinks.k1.requiredAcks = 1
a1.sinks.k1.batchSize = 20
a1.sinks.k1.channel = c1

Interceptor

package cn.itcast.flume;

import org.apache.commons.lang.StringUtils;
import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.interceptor.Interceptor;
import java.io.UnsupportedEncodingException;
import java.util.ArrayList;
import java.util.List;

/**
 * 1. Implement the Interceptor.Builder interface.
 * 2. The Builder has a configure method; use it to read the relevant key from the configuration file.
 * 3. The Builder has a build method; use it to create the custom AppInterceptor.
 * 4. AppInterceptor has two intercept methods, one for batches and one for single events;
 *    the batch method delegates to the single-event method.
 * 5. The appId has to be added to each event; since appId is configurable, it is passed
 *    into AppInterceptor through its constructor.
 * 6. Give the custom AppInterceptor a parameterized constructor that receives the appId.
 */
public class AppInterceptor implements Interceptor {
    // Member variable appId, holding the value read from the configuration file
    private String appId;
    public AppInterceptor(String appId) {
        this.appId = appId;
    }
    /**
     * Process a single event
     * @param event
     * @return
     */
    public Event intercept(Event event) {
        String message = null;
        try {
            message = new String(event.getBody(), "utf-8");
        } catch (UnsupportedEncodingException e) {
            message = new String(event.getBody());
        }
        // Tag the log line with the id of the system it belongs to
        if (StringUtils.isNotBlank(message)) {
            message = "aid:" + appId + "||msg:" + message;
            event.setBody(message.getBytes());
            // The normal path ends here
            return event;
        }
        // Reaching this point means the event was blank and is passed through unchanged
        return event;
    }

    /**
     * Process a batch of events
     * @param list
     * @return
     */
    public List<Event> intercept(List<Event> list) {
        List<Event> resultList = new ArrayList<Event>();
        for (Event event : list) {
            Event r = intercept(event);
            if (r != null) {
                resultList.add(r);
            }
        }
        return resultList;
    }

    public void close() {
    }
    public void initialize() {
    }

    public static class AppInterceptorBuilder implements Interceptor.Builder {
        // 1. The appId read from the configuration file
        private String appId;

        public Interceptor build() {
            // 3. Construct the interceptor
            return new AppInterceptor(appId);
        }
        public void configure(Context context) {
            // 2. Fall back to "default" when no appId is configured for the source
            this.appId = context.getString("appId", "default");
            System.out.println("appId:" + appId);
        }
    }
}

Kafka design

Kafka is a distributed message queue that provides producer and consumer functionality.

Determining the number of partitions and replicas:

1) Create a topic all_app_log.

2) Choose the number of partitions and replicas (assuming an actual log volume of about 3 TB per day): 10 partitions with 3 replicas.

                 3 TB of log data per day

                 Average data rate: 3 TB / 24 h = 0.125 TB/h = 125 GB/h, i.e. roughly 20 GB per minute, or about 333 MB per second

                 Peak rate: 333 MB/s * 3

                 Peak during promotional activity: 333 MB/s * 3 * 3

                 Conclusion: about 3 GB of data must be processed per second; Kafka's theoretical throughput is about 600 MB/s per broker, roughly 300 MB/s in practice.

                 3 GB/s / 300 MB/s = 10 partitions

3) How to choose the number of replicas:

                 if there are more than 10 brokers, 2 replicas are enough;

                 if there are fewer than 10 brokers, use 3 replicas.

4) Kafka retention policy: data is kept for 168 hours (7 days), so the required storage is 3 replicas * 3 TB/day * 7 days = 63 TB.

 

kafka-topics.sh --create --zookeeper hadoop01:2181 --replication-factor 3 --partitions 10 --topic all_app_log
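
As a quick way to confirm that log events are actually arriving on the topic, a minimal console consumer can be written with the Kafka Java client. This is an illustrative sketch, not part of the original project; the broker address is taken from the Flume sink configuration above, and the class name and consumer group id are assumptions.

package cn.itcast.kafka;

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class LogTopicChecker {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "HADOOP01:9092");   // broker from the Flume sink config
        props.put("group.id", "log-monitor-check");        // hypothetical consumer group
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
        consumer.subscribe(Collections.singletonList("all_app_log"));
        while (true) {
            // Print each message as it arrives
            ConsumerRecords<String, String> records = consumer.poll(1000);
            for (ConsumerRecord<String, String> record : records) {
                System.out.println(record.value());
            }
        }
    }
}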

 

 

 
