Building a web cluster with HAProxy and Nginx, plus HAProxy log management (for high concurrency)

Preface

Common web cluster scheduler

Common web cluster schedulers fall into two groups: software and hardware. On the software side, the usual choices are the open-source LVS, HAProxy, and Nginx. On the hardware side, F5 is the most common, and many people also use products from vendors such as Barracuda and NSFOCUS.

LVS handles heavy load well in enterprise deployments, but it has two shortcomings:

LVS does not support regular-expression matching, so it cannot separate dynamic and static content.
For large websites, LVS is complicated to deploy and configure, and its maintenance cost is relatively high.

HAProxy is software that provides high availability, load balancing, and proxying for TCP- and HTTP-based applications.

It is especially suitable for heavily loaded web sites.
On current hardware, it can support tens of thousands of concurrent connections.

One: Haproxy scheduling algorithm

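HAProxy supports a variety of scheduling algorithms. The three most commonly used are RR (Round Robin), LC (Least Connections), and SH (Source Hashing). In the configuration file they are selected with balance roundrobin, balance leastconn, and balance source respectively.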

1.1:RR(Round Robin)

RR is the simplest and most commonly used algorithm: round-robin scheduling.
For example, suppose there are three nodes A, B, and C. The first user request is assigned to node A, the second to node B, and the third to node C. The fourth request goes back to node A, and so on, so that requests are handed out in turn to balance the load.
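
The rotation described above can be sketched in a few lines of Python. This is only an illustration of the idea, not HAProxy's implementation; the function name round_robin is made up for the example.

```python
from itertools import cycle


def round_robin(nodes):
    """Return an iterator that hands out nodes in a fixed, repeating order."""
    return cycle(nodes)


scheduler = round_robin(["A", "B", "C"])

# Four consecutive requests: the fourth wraps around to the first node.
assignments = [next(scheduler) for _ in range(4)]
print(assignments)  # ['A', 'B', 'C', 'A']
```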

1.2:LC(Least Connections)

LC is the least-connections algorithm: it dynamically assigns front-end requests according to the number of connections on each back-end node.
For example, suppose there are three nodes A, B, and C with 4, 5, and 6 connections respectively. The first new request is assigned to A, giving A: 5, B: 5, C: 6.
The second request is also assigned to A (A and B are tied at this point), giving A: 6, B: 5, C: 6. The next request is then assigned to B. Each new request goes to the node with the fewest connections.
In practice, connections on A, B, and C are released all the time, so the nodes rarely hold the same number of connections. For this reason LC is a considerable improvement over RR, and it is widely used.
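
The worked example above can be replayed with a short Python sketch. It assumes that ties go to the first node listed, which matches the example in the text but is not necessarily how HAProxy breaks ties; pick_least_connections is a hypothetical helper name.

```python
def pick_least_connections(connections):
    """Return the node with the fewest connections; ties go to the first node listed."""
    return min(connections, key=connections.get)


# Starting state from the example: A has 4 connections, B has 5, C has 6.
connections = {"A": 4, "B": 5, "C": 6}

chosen = []
for _ in range(3):
    node = pick_least_connections(connections)
    connections[node] += 1  # the new request opens one more connection on that node
    chosen.append(node)

print(chosen)       # ['A', 'A', 'B']
print(connections)  # {'A': 6, 'B': 6, 'C': 6}
```

The sketch never releases connections. In a real cluster the counts also go down as requests finish, which is why the choice has to be made again for every request.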

1.3:SH(Source Hashing)

SH schedules requests according to where they come from. It is used where session state is recorded on the server side, and scheduling can be based on the source IP, a cookie, and so on.
For example, suppose there are three nodes A, B, and C. The first user's first visit is assigned to A, and the second user's first visit is assigned to B.
When the first user visits a second time, the request is again assigned to A, and the second user's second visit is again assigned to B. As long as the load-balancing scheduler is not restarted, the first user always reaches A and the second user always reaches B.
The advantage of this algorithm is session persistence. The drawback is that when a few source IPs generate very heavy traffic, the load becomes unbalanced and some nodes receive too many requests, which affects the service.
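
A minimal Python sketch of source-based scheduling follows. It hashes the client IP with CRC32 and takes the result modulo the number of nodes. The hash function, the helper name pick_by_source, and the client addresses are assumptions chosen for illustration, and HAProxy's own hashing differs in detail.

```python
import zlib
from collections import Counter


def pick_by_source(client_ip, nodes):
    """Map a client IP to a node; the same IP always maps to the same node."""
    digest = zlib.crc32(client_ip.encode("ascii"))
    return nodes[digest % len(nodes)]


nodes = ["A", "B", "C"]

# Hypothetical client addresses, used only for this illustration.
first_user = "192.168.100.50"
second_user = "192.168.100.51"

# Each user lands on the same node on every visit (session persistence).
for visit in (1, 2):
    print(visit, pick_by_source(first_user, nodes), pick_by_source(second_user, nodes))

# One very busy source IP sends all of its requests to a single node.
load = Counter(pick_by_source(ip, nodes) for ip in [first_user] * 100 + [second_user])
print(load)
```

The last lines show the drawback described above: 100 requests from one address all pile onto one node, however idle the others are.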

Two: Haproxy cluster construction

2.1: Environmental preparation

VMware software
Two CentOS 7 virtual machines as the Nginx web servers
IP address: 192.168.100.3    pc-3
IP address: 192.168.100.4    pc-4
One CentOS 7 virtual machine as the HAProxy scheduler (IP address: 192.168.100.20)    pc-2

2.2: Installation and startup of Nginx

Install Nginx on the two web servers and start the service

Install environment software

yum install -y  gcc  gcc-c++   make  pcre-devel  expat-devel  perl  bzip2 zlib-devel  pcre
useradd -M -s /sbin/nologin nginx     # create the service account
[root@pc-3 opt]# tar zxvf nginx-1.12.2.tar.gz
[root@pc-3 opt]# cd nginx-1.12.2/
[root@pc-4 nginx-1.12.2]# ./configure \
> --prefix=/usr/local/nginx \
> --user=nginx \
> --group=nginx


[root@pc-4 nginx-1.12.2]# make && make install

Create the test page

[root@pc-4 nginx-1.12.2]# cd /usr/local/nginx/html/
[root@pc-4 html]# echo "this is  monkey" > test.html
[root@pc-4 html]# ln -s /usr/local/nginx/sbin/nginx /usr/local/sbin/


The steps on pc-3 are the same as on pc-4.
Only this step differs:
[root@pc-3 html]# echo "this is  yellowdog" > test.html

2.3 Configuration on pc-2 (the HAProxy server)

2.3.1 Install haproxy

[root@pc-2 opt]# yum install -y \
> pcre-devel \
> bzip2-devel \
> gcc \
> gcc-c++ \
> make
# bzip2-devel is installed for compression support
[root@pc-2 opt]# tar zxvf haproxy-1.5.19.tar.gz
[root@pc-2 opt]# cd haproxy-1.5.19/
[root@pc-2 haproxy-1.5.19]# make TARGET=linux26
[root@pc-2 haproxy-1.5.19]# make install

2.3.2 Configure haproxy

[root@pc-2 haproxy-1.5.19]# mkdir -p /etc/haproxy
[root@pc-2 haproxy-1.5.19]# cd examples
[root@pc-2 examples]# cp haproxy.cfg /etc/haproxy/
[root@pc-2 examples]# cd /etc/haproxy
[root@pc-2 haproxy]# vim haproxy.cfg

global
        log 127.0.0.1   local0
        log 127.0.0.1   local1 notice
        #log loghost    local0 info
        maxconn 4096
        #chroot /usr/share/haproxy        # chroot jail disabled
        uid 99
        gid 99
        daemon
        #debug
        #quiet

defaults
        log     global
        mode    http
        option  httplog
        option  dontlognull
        retries 3
        #redispatch     # redispatching of requests from a failed server is disabled
        maxconn 2000
        contimeout      5000
        clitimeout      50000
        srvtimeout      50000

listen   webcluster 0.0.0.0:80
           option httpchk GET /test.html
           balance roundrobin
           server inst1 192.168.100.3:80 check inter 2000 fall 3
           server inst2 192.168.100.4:80 check inter 2000 fall 3

2.4: Detailed explanation of Haproxy configuration file

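The HAProxy configuration file is usually divided into three parts:
global: global settings
defaults: default settings
listen: application component settings

global parameters
log 127.0.0.1 local0: configures logging; local0 is the syslog facility, and by default the messages end up in the system log
log 127.0.0.1 local1 notice: notice is the log level; syslog defines 8 levels, from emerg down to debug
maxconn 4096: maximum number of connections
uid 99: UID the process runs as
gid 99: GID the process runs as

defaults parameters
These set default values that application components inherit. Anything not explicitly declared in an application component is set according to these defaults.
log global: use the log definition from the global section
mode http: work in HTTP mode
option httplog: record logs in the HTTP log format
retries 3: number of connection attempts to a server after a connection failure (the limit of three failed health checks that marks a node unavailable is set separately, by fall 3 on the server line)
maxconn 2000: maximum number of connections
contimeout 5000: connection timeout, in milliseconds
clitimeout 50000: client timeout, in milliseconds
srvtimeout 50000: server timeout, in milliseconds

listen parameters
These generally configure an application module. Some of the items below (appli4-backup, option persist, the backup keyword) do not appear in the configuration shown above.
listen appli4-backup 0.0.0.0:10004: defines an application named appli4-backup
option httpchk /index.html: checks the server's index.html file
option persist: forces requests to be sent to a server even if it is already down
balance roundrobin: use the round-robin load-balancing algorithm
server inst1 192.168.100.3:80 check inter 2000 fall 3: defines node 1, health-checked every 2000 ms and considered down after 3 consecutive failed checks
server inst2 192.168.100.4:80 check inter 2000 fall 3 backup: defines node 2 as a backup node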
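
The effect of check ... fall 3 can be illustrated with a small Python model. It is a simplified sketch, not HAProxy's code: a node that is up goes down after fall consecutive failed checks, and a node that is down comes back after rise consecutive successful checks. rise is not set in the configuration above; the value 2 used here is assumed to be the default. track_health is a hypothetical name.

```python
def track_health(check_results, fall=3, rise=2):
    """Replay health-check results (True = passed) and return the node state after each check."""
    up = True    # the node is assumed healthy at the start
    streak = 0   # consecutive results that contradict the current state
    states = []
    for passed in check_results:
        if passed == up:
            streak = 0  # result agrees with the current state, reset the counter
        else:
            streak += 1
            threshold = fall if up else rise
            if streak >= threshold:
                up = not up
                streak = 0
        states.append("UP" if up else "DOWN")
    return states


# One check every 2000 ms: three failures in a row take the node down,
# two successes in a row bring it back.
print(track_health([True, False, False, False, True, True]))
# ['UP', 'UP', 'UP', 'DOWN', 'DOWN', 'UP']
```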

2.5 Start the Nginx service on both nodes

[root@pc-3 nginx-1.12.2]# nginx
[root@pc-3 nginx-1.12.2]# netstat -ntap | grep 80
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      22646/nginx: master

2.6 Test by accessing 192.168.100.20/test.html from a client

Three: Haproxy log management

By default HAProxy sends its log to the system's syslog. In a production environment it is generally split out into separate files.

The steps are as follows.

3.1 Modify the log options in the HAProxy configuration file, adding:

log /dev/log local0 info
log /dev/log local0 notice

Then modify the rsyslog configuration: define the HAProxy-related rules separately in haproxy.conf and place the file under /etc/rsyslog.d/.
Save the configuration file and restart the rsyslog service to complete the rsyslog configuration.

vim /etc/haproxy/haproxy.cfg

global
        log /dev/log   local0 info
        log /dev/log   local0 notice
        #log loghost    local0 info
        maxconn 4096
        #chroot /usr/share/haproxy

3.2 Restart the service and add configuration files

[root@pc-2 /]# service haproxy  restart
Restarting haproxy (via systemctl):                        [  OK  ]
[root@pc-2 /]# touch /etc/rsyslog.d/haproxy.conf
[root@pc-2 /]# vim /etc/rsyslog.d/haproxy.conf
if ($programname == 'haproxy' and $syslogseverity-text == 'info')
then -/var/log/haproxy/haproxy-info.log
&~
if ($programname == 'haproxy' and $syslogseverity-text == 'notice')
then -/var/log/haproxy/haproxy-notice.log
&~

[root@pc-2 /]# systemctl restart rsyslog.service
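
The two rsyslog rules above amount to a small routing table. The following Python sketch models that logic only; it is not how rsyslog is implemented. The names RULES and route are made up for the example, and sshd is just a sample of a program other than haproxy.

```python
RULES = [
    # (program name, severity, destination file), in the order they are written
    ("haproxy", "info", "/var/log/haproxy/haproxy-info.log"),
    ("haproxy", "notice", "/var/log/haproxy/haproxy-notice.log"),
]


def route(programname, severity):
    """Return the file a message is written to, or None if no HAProxy rule matches."""
    for rule_program, rule_severity, path in RULES:
        if programname == rule_program and severity == rule_severity:
            # "&~" discards the message after it is written,
            # so no later rule sees it again.
            return path
    return None  # left to whatever other rules rsyslog has


print(route("haproxy", "info"))    # /var/log/haproxy/haproxy-info.log
print(route("haproxy", "notice"))  # /var/log/haproxy/haproxy-notice.log
print(route("sshd", "info"))       # None
```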

3.3 Visit the site in a browser, then view the generated log files

[root@pc-2 log]# systemctl restart rsyslog.service
[root@pc-2 log]# cd /var/log

Four: Detailed explanation of the parameters that can be optimized by Haproxy

As the load on a company's website grows, tuning HAProxy's parameters becomes very important.
maxconn: the maximum number of connections. Adjust it to the application's actual situation; 10240 is the recommended value.
daemon: daemon mode. HAProxy can also be started in non-daemon mode, but starting it in daemon mode is recommended.
nbproc: the number of concurrent load-balancing processes. It is recommended to set it equal to, or twice, the number of CPU cores of the server.
retries: the number of retries after a failed connection to a cluster node. If there are many nodes and concurrency is high, set it to 2 or 3.
option http-server-close: actively closes the HTTP request on the server side. Using this option in production is recommended.
timeout http-keep-alive: the keep-alive (long connection) timeout. It can be set to 10s.
timeout http-request: the HTTP request timeout. Setting it to 5~10s is recommended, to release HTTP connections faster.
timeout client: the client timeout. If traffic is heavy and nodes respond slowly, this can be set shorter; about 1 min is recommended.
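
To see how these recommendations would look in a configuration file, the Python sketch below prints a global and a defaults section using the values listed above. The helper name tuned_settings and the core count of 4 are assumptions for the example, and the output is a starting point to adapt rather than a tested production configuration.

```python
def tuned_settings(cpu_cores):
    """Build global/defaults sections using the values recommended above."""
    global_lines = ["maxconn 10240", "daemon", f"nbproc {cpu_cores}"]
    defaults_lines = [
        "retries 3",
        "option http-server-close",
        "timeout http-keep-alive 10s",
        "timeout http-request 10s",
        "timeout client 1m",
    ]
    sections = [("global", global_lines), ("defaults", defaults_lines)]
    # Indent each setting under its section name, as in haproxy.cfg.
    return "\n\n".join(
        f"{name}\n" + "\n".join(f"        {line}" for line in lines)
        for name, lines in sections
    )


# Hypothetical server with 4 CPU cores.
print(tuned_settings(cpu_cores=4))
```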

Origin blog.csdn.net/BIGmustang/article/details/108369418