Logstash installation, deployment and configuration


In order to collect log information from each business platform into the big data platform HDFS, the previous fixed technology stack was flume -> kafka -> storm -> hdfs. It required writing custom code in Storm, which was neither stable, scalable, nor easy to maintain.
For the Kafka-to-HDFS step there is a dedicated log shipping tool, Logstash, that solves this problem. It has been running stably in our development environment for a week (http://192.168.23.31:50070/explorer.html#/data/logstash).
Now deploy it to production.

The installation and configuration of logstash are as follows:
1. Download and install logstash. If the download is slow, the RPM package can be copied to you directly.
wget -c https://download.elastic.co/logstash/logstash/packages/centos/logstash-2.3.4-1.noarch.rpm
rpm -ivh logstash-2.3.4-1.noarch.rpm
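
A quick sanity check after installation (a minimal sketch; /opt/logstash is the default install path of the RPM package):

# verify the installed package and the binary version
rpm -qa | grep logstash
/opt/logstash/bin/logstash --version   # should print the installed version, e.g. logstash 2.3.4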

2. Download and install the logstash webhdfs plugin
git clone https://github.com/heqin5136/logstash-output-webhdfs-discontinued.git
cd logstash-output-webhdfs-discontinued
/opt/logstash/bin/plugin install logstash-output-webhdfs
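
To confirm the plugin was registered, you can list the installed plugins (again assuming the default /opt/logstash install path):

# the webhdfs output should appear in the plugin list
/opt/logstash/bin/plugin list | grep webhdfs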

3. logstash configuration
vim /etc/logstash/conf.d/logstash.conf

input {
  kafka {
    zk_connect => "192.168.1.50:2181,192.168.1.51:2181,192.168.1.52:2181"  # Kafka's ZooKeeper cluster address; change for the production environment
    group_id => "hdfs"  # consumer group; do not use the same group as the ELK consumers
    topic_id => "flume_kafka_channel_topic"  # Kafka topic; change for the production environment
    consumer_id => "logstash-consumer-192.168.23.31"  # consumer id, customizable
    consumer_threads => 1
    queue_size => 200
    codec => plain { charset => "UTF-8" }
    auto_offset_reset => "smallest"
  }
}

filter {
     grok {
       match => { "message" =>
        #"%{TIMESTAMP_ISO8601:date} (?<thread_name>.+?\bhost\b.+?)(?<thread>.+?\bu001Cbi\b)(?<action>.+?\baction\b) (?<type>.+?\btype\b)(?<content>.*)"
        "(?<thread>.+?\bu001Cbi\b)(?<action>.+?\baction\b)( ?<type>.+?\btype\b)(?<content>.*)"
       }
    }
}

output {
  # If you have several kinds of logs in one topic, you can extract them and store them separately on HDFS.
  if [action] == "-action" and [type] == "-type" {
    webhdfs {
           workers => 2
           host => "192.168.23.31"  # HDFS namenode address; change for the production environment
           port => 50070  # webhdfs port
           user => "root"
           path => "/data/logstash/log-%{+YYYY}-%{+MM}/apiLog-%{+YYYY}-%{+MM}-%{+dd}.log"  # one directory per month, one log file per day
           flush_size => 500
           #compression => "snappy"  # compression format; can be left uncompressed
           idle_flush_time => 10
           retry_interval => 0.5
           codec => plain { charset => "UTF-8" }
    }
}
}
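
Once events start flowing, you can check that files are actually being written by listing the target directory, either with the hdfs client or over the WebHDFS REST API (a quick check against the development namenode above; the directory follows the path setting in the output):

# list the monthly directories created by the webhdfs output
hdfs dfs -ls /data/logstash/
curl "http://192.168.23.31:50070/webhdfs/v1/data/logstash?op=LISTSTATUS"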


4. logstash configuration check, start and stop
/etc/init.d/logstash configtest
/etc/init.d/logstash start
/etc/init.d/logstash stop
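
You can also test the configuration file directly against the logstash binary and watch the service log after starting (a minimal sketch, assuming the default paths laid down by the RPM install):

# validate the config file without starting the service
/opt/logstash/bin/logstash -f /etc/logstash/conf.d/logstash.conf --configtest
# follow the service log after start-up
tail -f /var/log/logstash/logstash.log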
