ELK: Analyzing Nginx Logs (2)

1. Basic ES Operations

Elasticsearch concepts

  1. Index -> similar to a database in MySQL

  2. Type -> similar to a table in MySQL

  3. Document -> a row of stored data
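
As a rough illustration of the analogy, the REST paths used in the examples below break down like this:

// index (database) / type (table) / document id (row)
GET /zhang/users/1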

2. ES & Kibana

Testing the web interface

  1. Access the ES web interface from a browser

  2. In Kibana run GET /. If cluster information is returned, Kibana is connected to ES successfully.
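
The same check can be done from the shell with curl; the host and port below are the ones used for the elasticsearch output later in this article, so adjust them to your environment. A healthy node answers with JSON containing its name, cluster name and version.

curl http://192.168.80.20:9200/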

Index operations

// Create an index
PUT /zhang

// Delete an index
DELETE /zhang

// List all indices
GET /_cat/indices?v

CRUD

Insert data into ES

PUT /zhang/users/1
{
 "name":"zhanghe", 
 "age": 23
}

Query data from ES

GET /zhang/users/1
GET /zhang/_search?q=*
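
Besides the URI search above, queries can also be written with the request-body query DSL; a minimal sketch that matches on the name field indexed earlier:

GET /zhang/_search
{
 "query": {
  "match": {
   "name": "zhanghe"
  }
 }
}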

Modify data (full overwrite)

PUT /zhang/users/1
{
 "name": "justdoit",
 "age": 21
}

Delete data from ES

DELETE /zhang/users/1

Modify a single field (partial update, the other fields are not overwritten)

POST /zhang/users/1/_update
{
 "doc": {
  "age": 22
 }
}

Modify all documents

POST /zhang/_update_by_query
{
 "script": {
  "source": "ctx._source['age']=30" 
 },
 "query": {
  "match_all": {}
 }
}

Add a field to all documents

POST /zhang/_update_by_query
{
 "script":{
  "source": "ctx._source['city']='hangzhou'"
 },
 "query":{
  "match_all": {}
 }
}

Ops staff do not use most of these APIs very often in day-to-day work; it is enough just to know them.

3. Nginx Log Parsing

Extracting custom fields

By default the whole Nginx log line shows up in Kibana as a single message field; it is not split into separate fields. To split it we have to parse the line with regular expressions, which requires familiarity with both regular expressions and the Nginx log format. Those are prerequisite topics and are not repeated here.

Grok basics for the Nginx log

  1. Grok uses (?<xxx>pattern) to extract the matched content into a field named xxx

  2. Extract the client IP: (?<clientip>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})

  3. Extract the request time: \[(?<requesttime>[^ ]+ \+[0-9]+)\] (a sample log line follows this list)
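
For reference, a made-up access log line in the format that the full pattern below is written against:

192.168.80.1 - - [24/Feb/2019:21:08:34 +0800] "GET /index.html HTTP/1.1" 200 612 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"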

Full Grok pattern for the Nginx access log

(?<clientip>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}) - - \[(?<requesttime>[^ ]+ \+[0-9]+)\] "(?<requesttype>[A-Z]+) (?<requesturl>[^ ]+) HTTP/\d.\d" (?<status>[0-9]+) (?<bodysize>[0-9]+) "[^"]+" "(?<ua>[^"]+)"

Tomcat and other logs can be extracted with a similar approach.

Logstash config using the regex to extract Nginx logs

input {
 file {
  path => "/var/log/nginx/access.log"
 }
}
filter {
  grok {
    match => {
     "message" => '(?<clientip>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}) - - \[(?<requesttime>[^ ]+ \+[0-9]+)\] "(?<requesttype>[A-Z]+) (?<requesturl>[^ ]+) HTTP/\d.\d" (?<status>[0-9]+) (?<bodysize>[0-9]+) "[^"]+" "(?<ua>[^"]+)"'
    } 
  }
}
output {
 elasticsearch {
  hosts => ["http://192.168.80.20:9200"]
 }
}
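
To try the config out, check the syntax first and then run Logstash in the foreground so parse errors show up on the console; the install path and config file name below are assumptions, adjust them to your environment.

# validate the config file without starting the pipeline
/usr/local/logstash/bin/logstash -f /usr/local/logstash/config/nginx.conf --config.test_and_exit

# run in the foreground
/usr/local/logstash/bin/logstash -f /usr/local/logstash/config/nginx.conf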

Do not send events that fail parsing to ES

Append a line that does not match the grok pattern, to simulate a parsing failure:

echo "shijiange" >> /usr/local/nginx/logs/access.log

Then wrap the elasticsearch output in a conditional so that failed events are dropped:

output {
  if "_grokparsefailure" not in [tags] and "_dateparsefailure" not in [tags] {
    elasticsearch {
      hosts => ["http://192.168.237.50:9200"]
    }
  }
}
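
If dropping failed events entirely is too aggressive, they can instead be routed to a local file for later inspection; a minimal sketch using the file output plugin with a hypothetical path:

output {
  if "_grokparsefailure" in [tags] or "_dateparsefailure" in [tags] {
    # keep unparsed events on disk instead of discarding them
    file { path => "/tmp/logstash_failures.log" }
  } else {
    elasticsearch {
      hosts => ["http://192.168.237.50:9200"]
    }
  }
}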

With the grok extraction in place, the documents in Kibana now contain many more custom fields.

Removing fields

The Logstash config already splits the message into separate fields, so ES no longer needs to store the complete original message as well; it can be removed.

Notes on removing fields

  1. Only fields inside _source can be removed

  2. Fields outside _source cannot be removed

Logstash config to remove unwanted fields

filter {
  grok {
    match => {
      "message" => '(?<clientip>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}) - - \[(?<requesttime>[^ ]+ \+[0-9]+)\] "(?<requesttype>[A-Z]+) (?<requesturl>[^ ]+) HTTP/\d.\d" (?<status>[0-9]+) (?<bodysize>[0-9]+) "[^"]+" "(?<ua>[^"]+)"'
    }
    remove_field => ["message","@version","path"]
  }
}

Benefits of removing fields:

Less data stored in ES

Better search efficiency

Timestamps

Default ELK timestamp behaviour

  1. By default the timestamp is the time the log event was sent to ES

  2. Nginx itself records the time of each user's access in the log line

  3. Nginx log analysis should be based on the user's access time, not on the time the log was sent

By default Logstash only picks up newly appended log lines; older lines are not sent to ES. If we need to do a full analysis of the existing logs, that is possible too:

input {
 file {
  path => "/usr/local/nginx/logs/access.log"
  start_position => "beginning"   # read the file from the start instead of only new lines
  sincedb_path => "/dev/null"     # do not persist the read position, so the whole file is read again on every start
 }
}

The time a log line was written and the time it is sent to ES are not the same, and Kibana draws its charts based on the send time, which is not what we want to look at. So we overwrite the send time with the time contained in the log line itself; only then do the charts reflect the real traffic.

Add a date filter to the Logstash filter block (the Nginx request time looks like 24/Feb/2019:21:08:34 +0800):

filter {
  grok {
    match => {
      "message" => '(?<clientip>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}) - - \[(?<requesttime>[^ ]+ \+[0-9]+)\] "(?<requesttype>[A-Z]+) (?<requesturl>[^ ]+) HTTP/\d.\d" (?<status>[0-9]+) (?<bodysize>[0-9]+) "[^"]+" "(?<ua>[^"]+)"'
    }
    remove_field => ["message","@version","path"]
  }
  date {
    match => ["requesttime", "dd/MMM/yyyy:HH:mm:ss Z"]
    target => "@timestamp"
  }
}

Count the requests per second in the Nginx log and compare the result against what the Kibana charts show:

cat /usr/local/nginx/logs/access.log |awk '{print $4}'|cut -b 1-19|sort |uniq -c

Different log time formats need a matching date pattern when overriding the timestamp:

20/Feb/2019:14:50:06 -> dd/MMM/yyyy:HH:mm:ss
2016-08-24 18:05:39,830 -> yyyy-MM-dd HH:mm:ss,SSS
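
For the second format the date filter would look like this; the field name logtime is just a placeholder for whichever field holds that timestamp:

date {
  match => ["logtime", "yyyy-MM-dd HH:mm:ss,SSS"]
  target => "@timestamp"
}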
