Real-time web server log analysis with GoAccess

1. What is GoAccess

GoAccess is an open source real-time web log analyzer and interactive viewer that runs in a terminal on *nix systems or through your web browser.
It provides fast and valuable HTTP statistics for system administrators who need a visual server report on the fly.

2. Why GoAccess

GoAccess is designed to be a fast, terminal-based log analyzer. Its core idea is to analyze and view web server statistics quickly and in real time without needing a web browser (handy for a quick look at the access log over SSH, or for anyone who simply likes working in a terminal).
Terminal output is only the default. GoAccess can also generate a complete real-time HTML report (well suited to analytics, monitoring, and data visualization), as well as JSON and CSV reports.

3. GoAccess functions

GoAccess parses the specified web log file and outputs the statistics to the X terminal. It provides the following panels:

  • General statistics: a summary of several key metrics, such as the number of valid and invalid requests, the time taken to analyze the data, unique visitors, requested files, static files (CSS, ICO, JPG, etc.), referrer URLs, 404 errors, the size of the parsed log file, and bandwidth consumption.
  • Unique visitors: shows hits, unique visitors, and cumulative bandwidth per date. HTTP requests with the same IP, the same date, and the same user agent count as one unique visitor. Web crawlers are included by default. Optionally, --date-spec=hr sets the date granularity to the hour, giving dates such as 05/Jun/2016:16. This is helpful for tracking daily traffic at the hour level.
  • Requested files: shows the most requested files on the server, with hits, unique visitors, percentage, cumulative bandwidth, protocol, and request method.
  • Static file requests: lists the most frequently requested static file types, such as JPG, CSS, SWF, JS, GIF, and PNG, with the same metrics as the previous panel. Additional static file types can be added in the configuration file.
  • 404 or file not found: shows the same metrics as the previous panels, but for all requests for pages that were not found, commonly known as the 404 status code.
  • Hosts: shows details about the hosts themselves. It is a good way to spot aggressive crawlers and identify who is eating your bandwidth. Expanding the panel shows more information, such as the host's reverse DNS lookup result, country, and city. If the -a option is enabled, selecting an IP address and pressing Enter displays its list of user agents.
  • Operating systems: reports the operating system each host used. GoAccess tries to provide the most specific version of each operating system.
  • Browsers: reports the browser each visiting host used. GoAccess tries to provide the most specific version of each browser.
  • Visit times: an hourly report with 24 data points, one for each hour of the day. Optionally, --hour-spec=min sets the granularity to tenths of an hour, displaying times such as 16:4. This is helpful for finding the server's peak traffic times.
  • Virtual hosts: shows the different virtual hosts parsed from the access log. This panel is displayed only when %v is used in the log format.
  • Referrer URLs: if a host reached your site through another resource, or was linked or redirected to you from another host, the URL it came from is shown here. This panel is off by default; see --ignore-panel in the configuration file to turn it on.
  • Referring sites: shows only the host part of the referrer rather than the full URL.
  • Keyphrases: reports keyphrases used on Google search, Google cache, and Google Translate that led to your site. At present only Google search queries via HTTP are supported. This panel is off by default; see --ignore-panel in the configuration file to turn it on.
  • Geo location: determines where an IP address is located. Statistics are grouped by continent and country. This requires GoAccess to be built with geolocation support.
  • HTTP status codes: the numeric status codes returned for HTTP requests.
  • Remote user (HTTP authentication): the user ID of the person requesting the document, as determined by HTTP authentication. If the document is not password protected, this field is "-". This panel is displayed only when %e is used in the log format.
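
To make the unique-visitor rule above concrete, here is a rough sketch that approximates it with awk by counting distinct (IP, date, user agent) triples per day. It only illustrates the definition and is not how GoAccess is implemented. The function name count_unique_visitors is hypothetical, and the sketch assumes a combined-format log in which the user agent is the last double-quoted field on each line.

```shell
# Approximate the "unique visitor" rule for a combined-format access log:
# one visitor = one distinct (client IP, date, User-Agent) triple.
# Reads the files given as arguments, or stdin when there are none.
count_unique_visitors() {
    awk -F'"' '
    {
        split($1, head, " ")            # head[1] = client IP, head[4] = "[16/Dec/2019:08:30:05"
        day = substr(head[4], 2, 11)    # strip "[" and keep dd/Mon/yyyy
        ua  = $(NF - 1)                 # last quoted field, assumed to be the User-Agent
        key = head[1] SUBSEP day SUBSEP ua
        if (!(key in seen)) { seen[key] = 1; visitors[day]++ }
    }
    END { for (d in visitors) print d, visitors[d] }
    ' "$@" | sort
}
```

Calling `count_unique_visitors nginx_access.log-20191216` would print one line per date with the number of distinct triples seen on that date.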

4. GoAccess features

  • Fully real-time: all panels and metrics are refreshed at a fixed interval, every 200 ms in the terminal and every second in the HTML output.
  • Supports nearly all web log formats: GoAccess accepts any custom log format string. Predefined formats include Apache, Nginx, Amazon S3, Elastic Load Balancing, CloudFront, and others.
  • Tracks application response time: records the time taken to serve each request, which is useful for tracking down pages that slow your site.
  • Incremental log processing: need data persistence? GoAccess can process logs incrementally through an on-disk B+Tree database.
  • Only one dependency: GoAccess is written in C and needs only ncurses to run. It even has its own RFC 6455 compliant WebSocket server.
  • Easy to use: run GoAccess directly against your access log, pick the log format, and let it parse the log and display the statistics.
  • Visitor profile: determine hits, visitors, bandwidth, and metrics for the slowest requests, by the hour or by date.
  • Virtual host support: have multiple virtual hosts? A panel shows which virtual host consumes the most resources on the web server.
  • Customizable color scheme: GoAccess colors are easy to customize, either in the terminal or by editing the stylesheet of the HTML output.

5. Installing GoAccess

There are two ways to install GoAccess: compiling from source, or installing with yum.
The local environment is:

# cat /etc/redhat-release 
CentOS Linux release 7.6.1810 (Core) 

5.1 Installing from source

Download the package and run configure

# wget -c https://tar.goaccess.io/goaccess-1.3.tar.gz
# tar -xzvf goaccess-1.3.tar.gz
# cd goaccess-1.3/
# ./configure --prefix=/usr/local/goaccess --enable-utf8 --enable-geoip=legacy --with-openssl

Resolve the dependency errors that configure reports, one at a time.
Error one:

configure: error: 
    *** Missing development files for the GeoIP library
# yum install -y GeoIP-devel

Error two:

configure: error: *** Missing development libraries for ncursesw
# yum install -y ncurses-devel

Run configure again, then compile and install

# ./configure --prefix=/usr/local/goaccess --enable-utf8 --enable-geoip=legacy --with-openssl
...
Your build configuration:

  Prefix         : /usr/local/goaccess
  Package        : goaccess
  Version        : 1.3
  Compiler flags :  -pthread
  Linker flags   : -lnsl -lncursesw -lGeoIP -lcrypto -lssl -lpthread  
  Dynamic buffer : no
  Geolocation    : GeoIP Legacy
  Storage method : In-memory Hash Database (Default)
  TLS/SSL        : yes
  Bugs           : [email protected]
# make && make install
...
make[3]: Entering directory `/root/goaccess-1.3'
 /usr/bin/mkdir -p '/usr/local/goaccess/bin'
  /usr/bin/install -c goaccess '/usr/local/goaccess/bin'
 /usr/bin/mkdir -p '/usr/local/goaccess/etc/goaccess'
 /usr/bin/install -c -m 644 config/goaccess.conf config/browsers.list '/usr/local/goaccess/etc/goaccess'
 /usr/bin/mkdir -p '/usr/local/goaccess/share/man/man1'
 /usr/bin/install -c -m 644 goaccess.1 '/usr/local/goaccess/share/man/man1'
make[3]: Leaving directory `/root/goaccess-1.3'
make[2]: Leaving directory `/root/goaccess-1.3'
make[1]: Leaving directory `/root/goaccess-1.3'

5.2 Installing with yum

yum resolves and installs the dependencies automatically

# yum install goaccess -y

6. Configuration

  • Configure the environment variable
# echo 'export PATH=/usr/local/goaccess/bin:$PATH' >>/etc/profile
# source /etc/profile
# goaccess -V
GoAccess - 1.3.
For more details visit: http://goaccess.io
Copyright (C) 2009-2016 by Gerardo Orellana

Build configure arguments:
  --enable-utf8
  --enable-geoip=legacy
  --with-openssl
  • Custom log / date format

GoAccess can parse virtually any web log format.
Predefined options include the Common Log Format, the Combined Log Format (with or without virtual host), the W3C format, and Amazon CloudFront (download distribution).
GoAccess also accepts any custom format string.

There are two ways to configure the log format. The easiest is to run GoAccess with -c, which displays a configuration window. That choice is not permanent, however, so for lasting use set the format in the configuration file.

The configuration file is located at %sysconfdir%/goaccess.conf or ~/.goaccessrc
Note: %sysconfdir% may be /etc/, /usr/etc/, or /usr/local/etc/. With the source build above, the file was installed as /usr/local/goaccess/etc/goaccess/goaccess.conf.
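
Since the file can live in several places, a small helper can report which candidate actually exists on a given machine. This is a minimal sketch: the function name find_goaccess_conf is hypothetical, and the order of the candidates is chosen by the caller and is not claimed to be the order GoAccess itself uses.

```shell
# Print the first readable file among the candidate paths given as arguments.
# Returns non-zero when none of them exists.
find_goaccess_conf() {
    for candidate in "$@"; do
        if [ -r "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Example call with the locations mentioned in this article:
# find_goaccess_conf "$HOME/.goaccessrc" /etc/goaccess.conf \
#     /usr/etc/goaccess.conf /usr/local/etc/goaccess.conf \
#     /usr/local/goaccess/etc/goaccess/goaccess.conf
```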

time-format: the time-format parameter, followed by a space, specifies the time format of the log. It may contain any combination of regular characters and special format specifiers, each of which begins with a percent sign (%). See man strftime. Examples: %T or %H:%M:%S.
Note: if the timestamp is given in microseconds, time-format must be set to %f.

date-format: the date-format parameter, followed by a space, specifies the date format of the log. It may contain any combination of regular characters and special format specifiers, each of which begins with a percent sign (%). See man strftime.
Note: if the timestamp is given in microseconds, date-format must be set to %f.
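
Because both parameters use strftime specifiers, one quick way to check a format is to render a known timestamp with the date command, which understands the same specifiers. The sketch below assumes GNU date (for the -d option); the function name render_log_timestamp is hypothetical, and the format is the date-format and time-format pair used in the configuration file later in this section, joined by a colon as in an nginx log.

```shell
# Render a timestamp as dd/Mon/yyyy:HH:MM:SS using strftime specifiers.
# Requires GNU date (-d). LC_ALL=C keeps the month abbreviation in English,
# and -u makes both parsing and output use UTC so the result is reproducible.
render_log_timestamp() {
    LC_ALL=C date -u -d "$1" '+%d/%b/%Y:%H:%M:%S'
}

render_log_timestamp '2019-12-16 08:30:05'   # prints 16/Dec/2019:08:30:05
```

If the rendered string has the same shape as the timestamps in your log, the two format values are a plausible match for it.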

log-format: the log-format parameter, followed by a space or a tab (\t), specifies the log format string.

  • Special format specifiers

  • %x: a date and time field matching the time-format and date-format variables. Use it when the log gives a single timestamp instead of separate date and time fields.
  • %t: the time field, matching the time-format variable.
  • %d: the date field, matching the date-format variable.
  • %v: the server name according to the canonical name setting (server block or virtual host).
  • %e: the user ID of the person requesting the document, as determined by HTTP authentication.
  • %h: the host (client IP address, IPv4 or IPv6).
  • %r: the request line from the client. The request must be surrounded by delimiters (single or double quotes) to be parsable. Otherwise, use a combination of specifiers such as %m, %U, %q, and %H to parse the individual fields.
  • Note: use either %r for the complete request, or a combination of %m, %U, %q, and %H, but not both.
  • %m: the request method.
  • %U: the requested URL path.
  • Note: if the query string is already part of %U, there is no need to use %q. If the URL path does not include a query string, you may use %q and the query string will be appended to the request.
  • %q: the query string.
  • %H: the request protocol.
  • %s: the status code the server returns to the client.
  • %b: the size of the object returned to the client.
  • %R: the "Referer" HTTP request header.
  • %u: the "User-Agent" HTTP request header.
  • %D: the time taken to serve the request, in microseconds.
  • %T: the time taken to serve the request, in seconds with millisecond resolution.
  • %L: the time taken to serve the request, in milliseconds as a decimal number.
  • %^: ignore this field.
  • %~: move forward through the log string until a non-space character (!isspace) is found.
  • ~h: the host (client IP address, IPv4 or IPv6) in an X-Forwarded-For (XFF) field.

  • Modify the configuration file

# vim /usr/local/goaccess/etc/goaccess/goaccess.conf
time-format %H:%M:%S
date-format %d/%b/%Y
log-format %h %^[%d:%t %^] "%r" %s %b "%R" "%u"

The examples here analyze nginx logs. For accurate analysis, nginx's log_format was set to the following common format:

    log_format  main '$remote_addr - $remote_user [$time_local] "$request" '
                     '$status $body_bytes_sent "$http_cookie" "$http_referer" '
                     '"$http_user_agent" "$http_x_forwarded_for"';
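
One thing worth checking is that the log-format string names every field nginx writes. The nginx format above contains two quoted fields, $http_cookie and $http_x_forwarded_for, that the log-format line in goaccess.conf does not mention, so "%R" would appear to line up with the cookie and "%u" with the referer. The sketch below only splits a made-up sample line into its fields to make the order visible; the function name split_main_format and the sample values are hypothetical. If the extra fields need to be skipped, a format along the lines of `%h %^[%d:%t %^] "%r" %s %b "%^" "%R" "%u" "%^"` may fit, but that is an assumption to verify against your GoAccess version.

```shell
# Split one line written in the nginx "main" format above into named fields.
# Splitting on double quotes is a simplification: it assumes that no field
# contains a literal '"'.
split_main_format() {
    awk -F'"' '
    {
        split($1, head, " ")     # IP, "-", remote user, "[date:time", "zone]"
        split($3, mid, " ")      # status, body bytes
        print "host="     head[1]
        print "request="  $2
        print "status="   mid[1]
        print "bytes="    mid[2]
        print "cookie="   $4     # extra field, not named in the log-format line
        print "referer="  $6
        print "agent="    $8
        print "xff="      $10    # extra field, not named in the log-format line
    }'
}

# Made-up sample line, for illustration only.
printf '%s\n' '203.0.113.7 - - [16/Dec/2019:08:30:05 +0800] "GET /index.html HTTP/1.1" 200 612 "sid=abc" "https://example.com/" "Mozilla/5.0" "-"' | split_main_format
```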

7. Usage

Common options

  • -a --agent-list: enable the list of user agents per host. For faster parsing, leave this off.
  • -d --with-output-resolver: enable the IP resolver in the HTML/JSON output.
  • -f --log-file: path of the log file to analyze.
  • -p --config-file: path of the configuration file.
  • -o --output: output format; html, json, and csv are supported.
  • -m --with-mouse: enable mouse support in the dashboard.
  • -q --no-query-string: ignore the query string of requests.
  • --real-time-html: generate the HTML report in real time.
  • --daemonize: run as a daemon; used together with --real-time-html.

7.1 Console mode

# pwd
/var/log/nginx
# goaccess -a -d -f nginx_access.log-20191216 -p /usr/local/goaccess/etc/goaccess/goaccess.conf

Key bindings in console mode

  • F1: main help page
  • F5: redraw the main window
  • q: quit
  • 1-15: jump to the module with the corresponding number
  • o: open the detailed view of the current module
  • j: scroll down within the current module
  • k: scroll up within the current module
  • s: sort the current module
  • /: search for matches across all modules
  • n: find the next occurrence
  • g: move to the top of the first module
  • G: move to the bottom of the last module

7.2 HTML mode

# pwd
/var/log/nginx
# goaccess -a -d -f nginx_access.log-20191216 -p /usr/local/goaccess/etc/goaccess/goaccess.conf -o /usr/local/nginx/html/www.ssgeek.com/go-access.html
Parsing... [9,692] [692/s]s]

Here the generated HTML file is written straight into the website's directory, so the report can be opened in a browser as soon as the analysis finishes.

7.3 Daemon mode

With --daemonize, generating a real-time HTML report works much like creating a static report: just add the --real-time-html and --daemonize options to the command

# goaccess -a -d -f nginx_access.log-20191216 -p /usr/local/goaccess/etc/goaccess/goaccess.conf -o /usr/local/nginx/html/www.ssgeek.com/go-access.html --real-time-html --daemonize
Daemonized GoAccess: 13308

The default listening port is 7890; use --port to specify a different one. Once started, the WebSocket server is ready to accept connections from clients

# netstat -lntup|grep 7890
tcp        0      0 0.0.0.0:7890            0.0.0.0:*               LISTEN      13333/goaccess

If the site has HTTPS enabled, GoAccess needs OpenSSL support: set ssl-cert and ssl-key in the configuration file goaccess.conf, and point ws-url at the site's HTTPS domain
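
As a sketch of what those three settings might look like in practice, the helper below appends them to a configuration file. The function name append_tls_settings is hypothetical, the certificate paths and host are placeholders supplied by the caller, and writing the ws-url value with the wss:// scheme is my reading of "the HTTPS domain" rather than something the original states.

```shell
# Append WebSocket TLS settings to a GoAccess configuration file.
# ssl-cert, ssl-key and ws-url are the option names mentioned above;
# every value is a placeholder supplied by the caller.
append_tls_settings() {
    conf=$1
    cert=$2
    key=$3
    host=$4
    cat >> "$conf" <<EOF
ssl-cert $cert
ssl-key $key
ws-url wss://$host
EOF
}

# Hypothetical usage:
# append_tls_settings /usr/local/goaccess/etc/goaccess/goaccess.conf \
#     /path/to/site.crt /path/to/site.key www.ssgeek.com
```

Keeping the settings in goaccess.conf rather than on the command line means the cron job and the daemon invocation pick them up without any change.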

7.4 Scheduled updates

The HTML report can also be regenerated on a schedule with a cron job, here refreshing every 30 minutes

# crontab -e
*/30 * * * * /usr/local/goaccess/bin/goaccess -a -d -f /var/log/nginx/nginx_access.log-20191216 -p /usr/local/goaccess/etc/goaccess/goaccess.conf -o /usr/local/nginx/html/www.ssgeek.com/go-access.html

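7.5 Processing growing logs

Processing growing logs refers to the case where logs are split according to some rule, for example nginx logs rotated once a day. GoAccess can handle this through its on-disk B+Tree database. Note that in GoAccess 1.3 on-disk storage is a build-time option (--enable-tcb=btree); the build in section 5.1 uses the default in-memory storage. It works as follows:

  • The data set must first be saved with --keep-db-files; the same data set can then be loaded with --load-from-disk.
  • When new data arrives (from a pipe or a file), it is appended to the original data set.
  • To keep the data at all times, --keep-db-files must always be used.
  • If --load-from-disk is used without --keep-db-files, the database files are deleted when the program exits.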

Example:

// last month's access log
goaccess access.log.1 --keep-db-files

Then load it:

// add this month's new log and save the result as new data
goaccess access.log --load-from-disk --keep-db-files

Read the saved data only (without parsing new data):

goaccess --load-from-disk --keep-db-files

7.6 Other uses

  • Generate a JSON report
# goaccess -a -d -f nginx_access.log-20191216 -p /usr/local/goaccess/etc/goaccess/goaccess.conf -o json > report.json
  • Generate a CSV file
# goaccess -a -d -f nginx_access.log-20191216 -p /usr/local/goaccess/etc/goaccess/goaccess.conf -o csv > report.csv
  • Real-time filtering and parsing
# tail -f nginx_access.log | goaccess -p /usr/local/goaccess/etc/goaccess/goaccess.conf -
  • Analyze multiple files
# goaccess -p /usr/local/goaccess/etc/goaccess/goaccess.conf access.log.1 access.log.2
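
Rotated logs are often a mix of plain and gzip-compressed files. Building on the pipe form shown above (the trailing "-" reads from standard input), the following sketch concatenates such files in the order given so they can be fed to GoAccess in one pass. The function name cat_rotated_logs and the file names in the usage comment are hypothetical, and compression is assumed to be recognizable from the .gz suffix.

```shell
# Write the given log files to stdout in order, decompressing the ones
# whose name ends in .gz, so the stream can be piped into GoAccess.
cat_rotated_logs() {
    for f in "$@"; do
        case $f in
            *.gz) gzip -dc "$f" ;;
            *)    cat "$f" ;;
        esac
    done
}

# Hypothetical usage (file names are placeholders):
# cat_rotated_logs access.log.2.gz access.log.1 access.log \
#     | goaccess -p /usr/local/goaccess/etc/goaccess/goaccess.conf -
```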

References:
https://goaccess.cc/
https://github.com/allinurl/goaccess

Origin www.cnblogs.com/ssgeek/p/12114667.html