Graylog source code analysis

The previous article described Graylog's features and architecture. In this article we take a look at the Graylog source code.

I. Project startup (CmdLineTool)

 Startup does a few basic things: it initializes the logger, loads plug-ins (using the Java SPI mechanism), initializes performance metrics (using Codahale Metrics, an open source library that is used quite widely; Kafka uses it too), and finally uses JMXReporter to expose the performance metrics over JMX.

  1. Plug-in loading (CmdLineTool class):

  Graylog uses a custom ClassLoader (ChainingClassLoader) to load plug-ins. After the plug-ins in the specified directory have been loaded into memory, a simple version check is performed on them.
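  The idea of chaining class loaders can be sketched roughly as follows. This is a minimal illustration, not Graylog's actual ChainingClassLoader: the class name ChainingLoaderSketch and the methods addClassLoader and canLoad are assumptions made for this example. Each plug-in gets its own loader, and the chain asks them in turn until one of them resolves the class.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical stand-in for the idea behind ChainingClassLoader:
// ask each registered loader in turn and return the first hit.
class ChainingLoaderSketch extends ClassLoader {
    private final List<ClassLoader> delegates = new CopyOnWriteArrayList<>();

    ChainingLoaderSketch(ClassLoader parent) {
        super(parent);
    }

    // Would be called once per plug-in, e.g. with a URLClassLoader for that jar.
    void addClassLoader(ClassLoader loader) {
        delegates.add(loader);
    }

    // loadClass() tries the parent first and only then falls back to this method.
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        for (ClassLoader delegate : delegates) {
            try {
                return delegate.loadClass(name);
            } catch (ClassNotFoundException ignored) {
                // Not provided by this plug-in; try the next loader.
            }
        }
        throw new ClassNotFoundException(name);
    }

    // Convenience for experiments: true if the chain can resolve the class name.
    boolean canLoad(String name) {
        try {
            loadClass(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }
}
```

  With a chain like this, code that only holds a reference to the chain can resolve classes from any plug-in that has been registered so far.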

         

  As mentioned earlier, Graylog's plug-ins use the Java SPI mechanism, which can be seen in the PluginLoader class:

 

  Here we finally see the familiar ServiceLoader class. Readers interested in the SPI mechanism can search for related articles.
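  For readers who have not used SPI before, here is a small self-contained sketch of the mechanism. The Plugin interface and HelloPlugin class are hypothetical and far simpler than Graylog's real plug-in types. A real plug-in jar ships its META-INF/services file inside the jar; this sketch writes one into a temporary directory so that it can run on its own.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

class SpiSketch {
    // Hypothetical service interface (not Graylog's real plug-in interface).
    public interface Plugin {
        String name();
    }

    // Hypothetical provider. SPI providers need a public no-arg constructor.
    public static class HelloPlugin implements Plugin {
        public HelloPlugin() {}

        @Override
        public String name() {
            return "hello";
        }
    }

    // Registers HelloPlugin in a throwaway META-INF/services file, then discovers it.
    static List<String> loadPluginNames() {
        try {
            Path root = Files.createTempDirectory("spi-sketch");
            Path services = Files.createDirectories(root.resolve("META-INF/services"));
            // File name = service interface name, file content = provider class name.
            Files.writeString(services.resolve(Plugin.class.getName()),
                    HelloPlugin.class.getName() + "\n");

            URL[] urls = { root.toUri().toURL() };
            try (URLClassLoader loader =
                         new URLClassLoader(urls, SpiSketch.class.getClassLoader())) {
                List<String> names = new ArrayList<>();
                for (Plugin plugin : ServiceLoader.load(Plugin.class, loader)) {
                    names.add(plugin.name());
                }
                return names;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(loadPluginNames());
    }
}
```

  The key point is that the caller only knows the interface. Which implementations exist is decided by whatever service files are visible to the class loader passed to ServiceLoader.load.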

    2. REST interface service (JerseyService class):

      For REST, Graylog uses Jersey to provide its web services. Jersey seems to have had a lukewarm reception in China, but it is used quite a lot in open source projects abroad.

  

 That covers project startup. For dependency injection, Graylog makes heavy use of Google's Guice framework. I have only heard of Guice by name and will study it when I get the chance :).

II. Graylog's journal mechanism

     In a typical project, if you face a large volume of log processing, you would probably choose Kafka as the message queue. For some customers with limited system resources, however, a message queue cluster is clearly a luxury. Graylog's approach is interesting: it does not implement a complete message queue mechanism of its own, but uses Kafka's low-level log handling API instead. You can think of it as Graylog taking part of the work Kafka does (on-disk log management, log buffering, periodic cleanup, and so on) and running it inside its own process.

Readers familiar with Kafka will recognize this: like Kafka, Graylog manages the on-disk log in segments.

 If we look at the files Graylog writes to disk, we find they are no different from Kafka's.
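 To make the segment idea concrete, here is a small in-memory sketch of a segmented, offset-addressed log. It is not the Kafka or Graylog implementation: the class SegmentedJournalSketch and its methods are invented for this example, and segments are capped by entry count rather than by bytes or age. The file naming follows Kafka's convention of using the segment's base offset, zero-padded to 20 digits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory sketch of a Kafka-style segmented log (hypothetical, simplified).
class SegmentedJournalSketch {
    private final int maxEntriesPerSegment;
    // Base offset of each segment -> entries stored in that segment.
    private final TreeMap<Long, List<String>> segments = new TreeMap<>();
    private long nextOffset = 0;

    SegmentedJournalSketch(int maxEntriesPerSegment) {
        this.maxEntriesPerSegment = maxEntriesPerSegment;
    }

    // Appends to the active (last) segment, rolling a new one when it is full.
    long append(String entry) {
        if (segments.isEmpty()
                || segments.lastEntry().getValue().size() >= maxEntriesPerSegment) {
            segments.put(nextOffset, new ArrayList<>());
        }
        segments.lastEntry().getValue().add(entry);
        return nextOffset++;
    }

    // Reads up to maxEntries entries whose offset is >= fromOffset.
    List<String> read(long fromOffset, int maxEntries) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<Long, List<String>> segment : segments.entrySet()) {
            List<String> entries = segment.getValue();
            for (int i = 0; i < entries.size() && result.size() < maxEntries; i++) {
                if (segment.getKey() + i >= fromOffset) {
                    result.add(entries.get(i));
                }
            }
        }
        return result;
    }

    // Retention: drops whole non-active segments that are fully processed,
    // i.e. whose last offset is <= committedOffset. Returns how many were removed.
    int cleanup(long committedOffset) {
        int removed = 0;
        while (segments.size() > 1) {
            long base = segments.firstKey();
            long lastOffset = base + segments.firstEntry().getValue().size() - 1;
            if (lastOffset > committedOffset) {
                break;
            }
            segments.remove(base);
            removed++;
        }
        return removed;
    }

    // Segment files are named after their base offset, zero-padded to 20 digits.
    List<String> segmentFileNames() {
        List<String> names = new ArrayList<>();
        for (long base : segments.keySet()) {
            names.add(String.format("%020d.log", base));
        }
        return names;
    }
}
```

 Because cleanup works on whole segments, deleting old data is just removing files, which is one reason this layout is cheap to maintain inside a single process.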

 

 

In addition, Graylog has PeriodicalsService, a timer service (for all of the system's scheduled tasks), ActivityWriter, a service that persists user operations, and so on. These are relatively simple and are not covered here.

 

III. Data flow in Graylog

   Since Graylog is log processing software, what path does a log entry from an external system take once it reaches the Graylog server?

   I divide Graylog's log processing into simple layers. The data flow is roughly:

  Raw data from the external system -> Transport (data transport layer) -> Input (data access layer) -> InputBuffer (access layer buffer, a RingBuffer) -> Encoder / Decoder (data codec layer) -> built-in Kafka journal (optional) -> ProcessBuffer (business processing buffer, a RingBuffer) -> ProcessBufferProcessor (log business processor) -> OutputBuffer (log output / storage / forwarding buffer, a RingBuffer) -> OutputBufferProcessor (output / storage / forwarding processor)
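  The chain of buffers and processors above can be illustrated with a toy pipeline. This sketch deliberately swaps the Disruptor RingBuffer for java.util.concurrent.ArrayBlockingQueue so that it runs with the JDK alone, and it leaves out the optional journal stage. The class name, stage names, and the transformations applied at each stage are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.UnaryOperator;

class PipelineSketch {
    private static final String POISON = "__stop__"; // shutdown marker passed downstream

    // One stage: take from the upstream buffer, transform, hand to the downstream buffer.
    // put() blocks when the downstream buffer is full, which gives back-pressure.
    static Thread stage(String name, BlockingQueue<String> in, BlockingQueue<String> out,
                        UnaryOperator<String> step) {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    String msg = in.take();
                    if (msg.equals(POISON)) {
                        out.put(POISON);
                        return;
                    }
                    out.put(step.apply(msg));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, name);
        t.start();
        return t;
    }

    static List<String> run(List<String> rawLines) {
        BlockingQueue<String> inputBuffer = new ArrayBlockingQueue<>(4);
        BlockingQueue<String> processBuffer = new ArrayBlockingQueue<>(4);
        BlockingQueue<String> outputBuffer = new ArrayBlockingQueue<>(4);
        // Final sink, large enough for every message plus the shutdown marker.
        BlockingQueue<String> stored = new ArrayBlockingQueue<>(rawLines.size() + 1);
        try {
            Thread codec = stage("codec", inputBuffer, processBuffer, String::trim);
            Thread process = stage("process", processBuffer, outputBuffer,
                    s -> s + " processed=true");
            Thread output = stage("output", outputBuffer, stored, s -> "stored: " + s);
            for (String line : rawLines) {
                inputBuffer.put(line); // the "Input" writing into the InputBuffer
            }
            inputBuffer.put(POISON);
            codec.join();
            process.join();
            output.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        List<String> result = new ArrayList<>(stored);
        result.remove(POISON);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("  first line ", "second line")));
    }
}
```

  Each stage only knows the buffer in front of it and the buffer behind it, which is what lets a stage such as the journal be inserted or skipped without touching the others.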

 

Below, taking logs ingested from Kafka as an example, we follow the data through Graylog's entire processing flow:

 

  1. Log access layer (KafkaTransport):

2. Data enters the access layer buffer (MessageInput)

   

InputBufferImpl

 

3. The log codec handler and the handler that writes to the built-in Kafka journal are attached as Disruptor handlers (InputBufferImpl)

 

 4. Writing logs directly to the business processing RingBuffer (bypassing Kafka)

 5. Writing logs to Kafka, to be consumed by a later stage (JournallingMessageHandler)

 6. A background thread continuously reads data from the built-in Kafka journal and writes it to the process buffer (JournalReader class)

   

7. Business processor ProcessBufferProcessor (all of Graylog's business processing of logs hangs off this class, such as log filtering, rules, threat intelligence enrichment, location enrichment, knowledge ...)

 

  The concrete processor implementations are more complex, so they are left until the end.

8. Data output / forwarding / storage buffer

9. Data output (OutputBufferProcessor)

   There may be multiple outputs. Writing to the outputs is asynchronous and subject to a time limit, so a slow output does not affect overall system throughput.
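   One way to get "asynchronous with a time limit" is to submit one task per output and wait on a latch with a timeout. The sketch below shows that pattern with JDK classes only. It is an assumption about how such a fan-out can be built, not a copy of OutputBufferProcessor, and it creates a thread pool per call purely to stay short (a real processor would keep one long-lived pool).

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

class OutputFanOutSketch {
    // Writes msg to every output in parallel.
    // Returns true only if all outputs finished within the time limit.
    static boolean writeToAll(String msg, List<Consumer<String>> outputs, long timeoutMillis) {
        ExecutorService executor = Executors.newCachedThreadPool();
        CountDownLatch done = new CountDownLatch(outputs.size());
        try {
            for (Consumer<String> output : outputs) {
                executor.submit(() -> {
                    try {
                        output.accept(msg);
                    } finally {
                        done.countDown();
                    }
                });
            }
            // Give up waiting after the timeout so one slow output cannot stall the caller.
            return done.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow(); // interrupt stragglers so the sketch exits cleanly
        }
    }

    // One fast output and one output that is far slower than the time limit.
    static String demo() {
        List<String> fastSink = new CopyOnWriteArrayList<>();
        Consumer<String> fast = fastSink::add;
        Consumer<String> slow = m -> {
            try {
                Thread.sleep(5_000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        boolean allDone = writeToAll("log line", List.of(fast, slow), 500);
        return "allDone=" + allDone + " fastSink=" + fastSink;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

   In the demo the caller gets control back after roughly the timeout, the fast output has already received the message, and the slow output is simply abandoned for this message.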

Taking writing to ES as an example (BlockingBatchedESOutput):

In addition, a thread periodically flushes the data held in memory to ES. That code is not posted here.
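The combination of "flush when the batch is full" and "flush on a timer" can be sketched as follows. The class BatchedOutputSketch and its bulkWriter callback are hypothetical. In a real output the callback would issue one bulk request to ES, and a timer thread would call flush() at a fixed interval.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical batching output: buffers messages and hands them on in bulk.
class BatchedOutputSketch {
    private final int maxBatchSize;
    private final Consumer<List<String>> bulkWriter; // stands in for one bulk index request
    private List<String> buffer = new ArrayList<>();

    BatchedOutputSketch(int maxBatchSize, Consumer<List<String>> bulkWriter) {
        this.maxBatchSize = maxBatchSize;
        this.bulkWriter = bulkWriter;
    }

    // Called for every message; triggers a flush as soon as the batch is full.
    synchronized void write(String message) {
        buffer.add(message);
        if (buffer.size() >= maxBatchSize) {
            flush();
        }
    }

    // Also meant to be called by a periodic timer, so that a partially filled
    // batch does not sit in memory indefinitely when traffic is low.
    synchronized void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        List<String> batch = buffer;
        buffer = new ArrayList<>();
        bulkWriter.accept(batch);
    }
}
```

A timer for the second path could be set up with ScheduledExecutorService.scheduleAtFixedRate, passing a task that calls flush(). Both paths are synchronized on the same object, so a timer flush and a size-triggered flush cannot interleave.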

 

10. MessageProcessor

The message processors that ship with the system include GeoIpProcessor, MessageFilterChainProcessor, and PipelineInterpreter.

 (1) GeoIpProcessor: enriches the data (adds a geographic location to the original log, for use in later visualization)

 

(2) MessageFilterChainProcessor (holds all of the MessageFilter log filters)

   

Logs pass through the filters one by one in sorted order. If a filter's condition is met, the message is marked as discarded and the Kafka offset is updated. The filters are analyzed one by one below.
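The behaviour just described (sorted filters, first match drops the message, the offset of a dropped message is committed) can be sketched like this. The Filter type, its priority field, and the dropLog list are assumptions made for the example and are not Graylog's MessageFilter API. Lower priority values are assumed to run first.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;

class FilterChainSketch {
    // Hypothetical filter: dropIf returning true means "discard this message".
    static final class Filter {
        final String name;
        final int priority;
        final Predicate<String> dropIf;

        Filter(String name, int priority, Predicate<String> dropIf) {
            this.name = name;
            this.priority = priority;
            this.dropIf = dropIf;
        }
    }

    private final List<Filter> sortedFilters;
    final List<String> dropLog = new ArrayList<>(); // "<filter name>@<offset>" per dropped message
    long committedOffset = -1;                      // last journal offset marked as done

    FilterChainSketch(List<Filter> filters) {
        sortedFilters = new ArrayList<>(filters);
        sortedFilters.sort(Comparator.comparingInt(f -> f.priority));
    }

    // Runs every message through the sorted filters and returns the survivors.
    // firstOffset is the journal offset of the first message in the list.
    List<String> process(List<String> messages, long firstOffset) {
        List<String> kept = new ArrayList<>();
        long offset = firstOffset;
        for (String message : messages) {
            Filter droppedBy = null;
            for (Filter filter : sortedFilters) {
                if (filter.dropIf.test(message)) {
                    droppedBy = filter;
                    break; // the first matching filter wins; later filters are skipped
                }
            }
            if (droppedBy == null) {
                kept.add(message);
            } else {
                // A dropped message never reaches an output, so its offset is
                // committed here to keep the journal from replaying it.
                dropLog.add(droppedBy.name + "@" + offset);
                committedOffset = offset;
            }
            offset++;
        }
        return kept;
    }
}
```

Committing the offset at drop time matters because surviving messages are normally committed only after they have been written out, and a discarded message would otherwise never be acknowledged.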

 

Origin www.cnblogs.com/showing/p/10318277.html