Building a Web cluster with Haproxy (theory + actual deployment)

One, Haproxy application analysis

1.1, Common Web cluster schedulers

At present, common Web cluster schedulers are divided into software and hardware.
The software schedulers are usually the open source LVS, Haproxy, and Nginx.
The most commonly used hardware scheduler is F5; many people also use products from other vendors, such as Barracuda and NSFOCUS.

1.2, Haproxy application analysis

(1) LVS handles heavy load well in enterprise applications, but it has shortcomings:

• LVS does not support regular-expression processing, so it cannot separate dynamic and static content.
• For large websites, LVS is complicated to deploy and configure, and its maintenance cost is relatively high.

(2) Haproxy is software that provides high availability, load balancing, and proxying for TCP- and HTTP-based applications:

• It is especially suitable for heavily loaded Web sites.
• Running on current hardware, it can support tens of thousands of concurrent connections.

Two, Haproxy scheduling algorithms

Haproxy supports a variety of scheduling algorithms. The three most commonly used are RR (Round Robin), LC (Least Connections), and SH (Source Hashing). In the configuration file they are selected with balance roundrobin, balance leastconn, and balance source respectively.

2.1, RR (Round Robin)

The RR algorithm is the simplest and most commonly used algorithm: requests are handed to the nodes in turn.

Example:
• There are three nodes A, B, C
• The first user access will be assigned to node A
• The second user access will be assigned to node B
• The third user access will be assigned to node C
• The fourth user access wraps around to node A again, and so on; cycling through the nodes this way spreads the requests evenly and achieves load balancing
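
The rotation in this example can be sketched in a few lines of shell. This is only an illustration of the idea, not Haproxy's implementation; the function name pick_rr is hypothetical, and the node names A, B, C are taken from the example above.

```shell
# Round-robin sketch: request number n (counting from 1) goes to
# node number ((n - 1) mod N), where N is the number of nodes.
# pick_rr is a hypothetical helper, not part of Haproxy.

# usage: pick_rr REQUEST_NUMBER NODE...
pick_rr() {
    n=$1
    shift                          # "$@" now holds only the node names
    shift $(( (n - 1) % $# ))      # skip to the node whose turn it is
    printf '%s\n' "$1"
}

for req in 1 2 3 4; do
    echo "request $req -> node $(pick_rr "$req" A B C)"
done
```

The fourth request lands on A again, which is the wrap-around described in the last bullet.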

2.2, LC (Least Connections)

The LC algorithm is the least-connections algorithm: it dynamically assigns each front-end request to the back-end node that currently has the fewest connections.

Example:
• There are three nodes A, B, and C, and their current connection counts are A: 4, B: 5, C: 6
• The first user connection request is assigned to A, and the counts become A: 5, B: 5, C: 6
• The second user request is again assigned to A, and the counts become A: 6, B: 5, C: 6; the next new request is then assigned to B. Every new request goes to the node with the smallest number of connections
• In practice the connections on A, B, and C are released dynamically, so the nodes rarely hold the same number of connections. This algorithm is therefore a considerable improvement over the RR algorithm and is one of the most widely used
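
The example above can be replayed with a small shell sketch. It is an illustration under assumptions, not Haproxy's code: the function name pick_lc is hypothetical, and the tie-breaking rule (the first listed node wins when counts are equal) is chosen only so that the output matches the example.

```shell
# Least-connections sketch: pick the node with the fewest connections.
# pick_lc is a hypothetical helper; ties go to the first node listed,
# which is an assumption made to reproduce the example in the text.

# usage: pick_lc NAME:COUNT...
pick_lc() {
    best_name=
    best_count=
    for pair in "$@"; do
        name=${pair%%:*}           # text before the colon
        count=${pair##*:}          # text after the colon
        if [ -z "$best_name" ] || [ "$count" -lt "$best_count" ]; then
            best_name=$name
            best_count=$count
        fi
    done
    printf '%s\n' "$best_name"
}

# Starting counts from the example: A:4, B:5, C:6
a=4 b=5 c=6
for req in 1 2 3; do
    node=$(pick_lc "A:$a" "B:$b" "C:$c")
    case $node in
        A) a=$((a + 1)) ;;
        B) b=$((b + 1)) ;;
        C) c=$((c + 1)) ;;
    esac
    echo "request $req -> node $node (A:$a B:$b C:$c)"
done
```

The sketch never releases connections, so it only shows the selection step; in a real cluster the counts also fall as clients disconnect.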

2.3, SH (Source Hashing)

SH is a scheduling algorithm based on the access source. It is used in scenarios where Session data is recorded on the server side; cluster scheduling can be based on the source IP, a Cookie, and so on.

Example:
• There are three nodes A, B, and C. On their first visits, the first user is assigned to A and the second user is assigned to B
• On their second visits, the first user is again assigned to A and the second user is again assigned to B. As long as the load-balancing scheduler is not restarted, the first user's requests always go to A and the second user's requests always go to B
• The advantage of this scheduling algorithm is session persistence. However, when a few source IPs generate very heavy traffic, the load becomes unbalanced: some nodes receive too many requests, which affects the business
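
A minimal sketch of the idea, assuming the source IP is the hash key: the same address always maps to the same node because the mapping depends only on the address and the node list. The function name pick_sh and the client addresses are hypothetical, and cksum is used purely as a convenient hash; Haproxy's own hash function is different.

```shell
# Source-hashing sketch: hash the client IP and take it modulo the
# number of nodes. pick_sh is a hypothetical helper; cksum (CRC) is
# only a stand-in hash, not what Haproxy uses internally.

# usage: pick_sh CLIENT_IP NODE...
pick_sh() {
    ip=$1
    shift                                      # "$@" now holds only the node names
    hash=$(printf '%s' "$ip" | cksum | cut -d ' ' -f 1)
    shift $(( hash % $# ))                     # skip to the selected node
    printf '%s\n' "$1"
}

# The first and third lines use the same address, so they name the same node.
for ip in 192.168.100.50 192.168.100.51 192.168.100.50; do
    echo "client $ip -> node $(pick_sh "$ip" A B C)"
done
```

Because nothing but the address decides the node, one very busy address cannot be spread across nodes, which is the imbalance described in the last bullet.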

Three, Haproxy cluster configuration

3.1, Experimental environment

VMware software
Two CentOS 7.6 virtual machines as Nginx servers (IP addresses: 192.168.100.22 and 192.168.100.23)
One CentOS 7.6 virtual machine as the Haproxy server (IP address: 192.168.100.21)
One CentOS 7.6 virtual machine as the storage server (IP address: 192.168.100.24)

3.2, Configure the storage server

First, check whether nfs-utils and rpcbind are installed; if they are not, install them with yum.
Start the two services after installation.

[root@localhost ~]# systemctl start nfs
[root@localhost ~]# systemctl start rpcbind
[root@localhost ~]# mkdir /opt/51xit /opt/52xit
[root@localhost ~]# vi /etc/exports
/opt/51xit 192.168.100.0/24(rw,sync)
/opt/52xit 192.168.100.0/24(rw,sync)
[root@localhost ~]# systemctl restart rpcbind
[root@localhost ~]# systemctl restart nfs
[root@localhost ~]# systemctl enable nfs
[root@localhost ~]# systemctl enable rpcbind
[root@localhost ~]# echo "this is www.51xit.top" > /opt/51xit/index.html
[root@localhost ~]# echo "this is www.52xit.top" > /opt/52xit/index.html

3.3, Configure the Nginx servers

3.3.1, Compile and install Nginx

The Nginx installation package can be downloaded from the official website http://www.nginx.org/.
The following uses the stable version Nginx 1.12.2 as an example, uploaded to /opt.

[root@localhost ~]# yum -y install pcre-devel zlib-devel gcc-c++
[root@localhost ~]# useradd -M -s /sbin/nologin nginx
[root@localhost ~]# cd /opt
[root@localhost opt]# tar zxvf nginx-1.12.2.tar.gz
[root@localhost opt]# cd nginx-1.12.2
[root@localhost nginx-1.12.2]# ./configure \
--prefix=/usr/local/nginx \
--user=nginx \
--group=nginx

[root@localhost nginx-1.12.2]# make && make install
[root@localhost nginx-1.12.2]# ln -s /usr/local/nginx/sbin/nginx /usr/local/sbin/
[root@localhost nginx-1.12.2]# nginx -t
nginx: the configuration file /usr/local/nginx/conf/nginx.conf syntax is ok
nginx: configuration file /usr/local/nginx/conf/nginx.conf test is successful
[root@localhost nginx-1.12.2]# nginx
[root@localhost nginx-1.12.2]# netstat -anpt | grep nginx
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      25182/nginx: master 

Compile and install Nginx on both Nginx servers using the same steps.

3.3.2, Mount the test page over NFS

192.168.100.22

[root@localhost ~]# showmount -e 192.168.100.24
Export list for 192.168.100.24:
/opt/52xit 192.168.100.0/24
/opt/51xit 192.168.100.0/24
[root@localhost ~]# mount 192.168.100.24:/opt/51xit /usr/local/nginx/html/
[root@localhost ~]# vi /etc/fstab 

#
# /etc/fstab
# Created by anaconda on Thu Aug  6 12:23:03 2020
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
/dev/mapper/centos-root /                       xfs     defaults        0 0
UUID=a1c935eb-f211-43a5-be35-2a9fef1f6a89 /boot                   xfs     defaults        0 0
/dev/mapper/centos-swap swap                    swap    defaults        0 0
/dev/cdrom /mnt iso9660 defaults 0 0
192.168.100.24:/opt/51xit/ /usr/local/nginx/html/ nfs defaults,_netdev 0 0
[root@localhost ~]# killall -1 nginx

Test whether the page can be accessed normally.

192.168.100.23

[root@localhost ~]# showmount -e 192.168.100.24
Export list for 192.168.100.24:
/opt/52xit 192.168.100.0/24
/opt/51xit 192.168.100.0/24
[root@localhost ~]# mount 192.168.100.24:/opt/52xit /usr/local/nginx/html/
[root@localhost ~]# vi /etc/fstab 

#
# /etc/fstab
# Created by anaconda on Thu Aug  6 12:23:03 2020
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
/dev/mapper/centos-root /                       xfs     defaults        0 0
UUID=a1c935eb-f211-43a5-be35-2a9fef1f6a89 /boot                   xfs     defaults        0 0
/dev/mapper/centos-swap swap                    swap    defaults        0 0
/dev/cdrom /mnt iso9660 defaults 0 0
192.168.100.24:/opt/52xit/ /usr/local/nginx/html/ nfs defaults,_netdev 0 0
[root@localhost ~]# killall -1 nginx

Test whether the page can be accessed normally.

3.4, Configure the Haproxy server

3.4.1, Compile and install Haproxy

Upload haproxy-1.4.24.tar.gz to the /opt directory
[root@localhost ~]# yum -y install pcre-devel bzip2-devel gcc gcc-c++
[root@localhost ~]# cd /opt
[root@localhost opt]# tar xzvf haproxy-1.4.24.tar.gz 
[root@localhost opt]# cd haproxy-1.4.24/
[root@localhost haproxy-1.4.24]# make TARGET=linux26
[root@localhost haproxy-1.4.24]# make install

3.4.2, Configure the Haproxy service

[root@localhost haproxy-1.4.24]# mkdir /etc/haproxy
[root@localhost haproxy-1.4.24]# cp examples/haproxy.cfg /etc/haproxy/
[root@localhost haproxy-1.4.24]# vi /etc/haproxy/haproxy.cfg 
global
        log 127.0.0.1   local0
        log 127.0.0.1   local1 notice
        #log loghost    local0 info
        maxconn 4096
        #chroot /usr/share/haproxy
        uid 99
        gid 99
        daemon
        #debug
        #quiet

defaults
        log     global
        mode    http
        option  httplog
        option  dontlognull
        retries 3
        #redispatch
        maxconn 2000
        contimeout      5000
        clitimeout      50000
        srvtimeout      50000

listen  webcluster 0.0.0.0:80
        option httpchk GET /index.html
        balance roundrobin
        server inst1 192.168.100.22:80 check inter 2000 fall 3
        server inst2 192.168.100.23:80 check inter 2000 fall 3
[root@localhost haproxy-1.4.24]# cp examples/haproxy.init /etc/init.d/haproxy
[root@localhost haproxy-1.4.24]# chmod 755 /etc/init.d/haproxy
[root@localhost haproxy-1.4.24]# chkconfig --add haproxy
[root@localhost haproxy-1.4.24]# ln -s /usr/local/sbin/haproxy /usr/sbin/haproxy
[root@localhost haproxy-1.4.24]# systemctl start haproxy.service
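
Before starting the service it can be worth checking the edited file for syntax errors. The sketch below assumes the haproxy binary is on the PATH; the function name validate_cfg and the HAPROXY_BIN override are hypothetical conveniences, while -c (check mode) and -f (configuration file) are standard haproxy command-line options.

```shell
# validate_cfg is a hypothetical wrapper around "haproxy -c -f FILE".
# HAPROXY_BIN is an assumed override so a different binary can be used.
validate_cfg() {
    "${HAPROXY_BIN:-haproxy}" -c -f "$1"
}

# Possible usage on the Haproxy server (not run here):
# validate_cfg /etc/haproxy/haproxy.cfg && systemctl start haproxy.service
```

Chaining with && means the service is only started when the check succeeds.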

3.5, Verification

Enter 192.168.100.21 in the browser on the host machine.

Enter it again after a minute. A different website page is displayed, which indicates that load balancing has been achieved.
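
The same check can be made from a command line instead of a browser. The sketch below is an assumption-laden illustration: count_pages is a hypothetical helper that tallies identical response lines, and the curl loop in the comment assumes curl is installed on the client. The sample input reuses the two test pages created on the storage server.

```shell
# count_pages is a hypothetical helper: it reads one response body per
# line on stdin and prints how often each distinct page was returned.
count_pages() {
    sort | uniq -c | sort -rn
}

# Offline demonstration with the two test pages from section 3.2:
printf '%s\n' \
    'this is www.51xit.top' \
    'this is www.52xit.top' \
    'this is www.51xit.top' \
    'this is www.52xit.top' | count_pages

# Possible usage against the scheduler (assumes curl is available):
# for i in 1 2 3 4; do curl -s http://192.168.100.21/; done | count_pages
```

With balance roundrobin and both nodes healthy, the two pages should appear about equally often.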

3.6, Detailed explanation of the Haproxy configuration file

1. The Haproxy configuration file is usually divided into three parts:
global: global configuration
defaults: default configuration
listen: application component configuration

global configuration parameters:
log 127.0.0.1 local0: configure logging; local0 is the log facility, and by default the log is stored in the system log
log 127.0.0.1 local1 notice: notice is the log level; syslog defines 8 severity levels (24 is the number of syslog facilities)
maxconn 4096: the maximum number of connections
uid 99: user uid
gid 99: user gid

2. The defaults section sets default parameters, which are generally inherited by the application components. If an application component does not declare a parameter itself, the default value is applied.
log global: use the log definition from the global section
mode http: the mode is http
option httplog: record the log in http log format
option dontlognull: do not log connections that carry no data, such as the heartbeat probes an upper-level load balancer sends to check state
retries 3: the number of times a failed connection to a node server is retried before giving up
maxconn 2000: the maximum number of connections
contimeout 5000: connection timeout, in milliseconds
clitimeout 50000: client timeout, in milliseconds
srvtimeout 50000: server timeout, in milliseconds
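
The three keywords contimeout, clitimeout, and srvtimeout are the older spellings; Haproxy documents timeout connect, timeout client, and timeout server as their replacements. A minimal sketch of a conversion filter follows, assuming the old keywords appear at the start of a line after optional indentation. The function name modernize_timeouts is hypothetical.

```shell
# modernize_timeouts is a hypothetical filter: it rewrites the legacy
# timeout keywords on stdin to the "timeout ..." form on stdout and
# leaves every other line unchanged.
modernize_timeouts() {
    sed -e 's/^\([[:space:]]*\)contimeout\([[:space:]]\)/\1timeout connect\2/' \
        -e 's/^\([[:space:]]*\)clitimeout\([[:space:]]\)/\1timeout client\2/' \
        -e 's/^\([[:space:]]*\)srvtimeout\([[:space:]]\)/\1timeout server\2/'
}

# Demonstration on the three lines from the defaults section:
printf '%s\n' \
    '        contimeout      5000' \
    '        clitimeout      50000' \
    '        srvtimeout      50000' | modernize_timeouts
```

The filter only prints the converted text; writing it back to /etc/haproxy/haproxy.cfg is left as a deliberate manual step.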

3. The listen section generally configures the parameters of an application module.
listen appli4-backup 0.0.0.0:10004: define an application named appli4-backup
option httpchk /index.html: check the index.html file of the server
option persist: force requests to be sent to servers that are already down
balance roundrobin: use the round-robin algorithm for load-balancing scheduling
server inst1 192.168.114.56:80 check inter 2000 fall 3: define an online node (health check every 2000 ms, marked down after 3 failed checks)
server inst2 192.168.114.56:81 check inter 2000 fall 3 backup: define a backup node

Four, Haproxy log management

By default Haproxy's log is output to the system syslog; in a production environment it is generally defined separately.

Steps:
• Modify the log options in the Haproxy configuration file, adding:
log /dev/log local0 info
log /dev/log local1 notice
• Modify the rsyslog configuration: put the Haproxy-related rules in a separate file, haproxy.conf, under /etc/rsyslog.d/
• Save the configuration file and restart the rsyslog service to complete the rsyslog configuration

[root@localhost haproxy-1.4.24]# vi /etc/haproxy/haproxy.cfg	'//edit the haproxy configuration file'
# this config needs haproxy-1.1.28 or haproxy-1.2.1

global
        log /dev/log    local0 info	
        log /dev/log    local1 notice
    ...(content omitted)
[root@localhost haproxy-1.4.24]# systemctl restart haproxy.service	'//restart the haproxy service'
[root@localhost haproxy-1.4.24]# touch /etc/rsyslog.d/haproxy.conf	'//create a new rsyslog configuration file for haproxy'
[root@localhost haproxy-1.4.24]# vi /etc/rsyslog.d/haproxy.conf 	'//write the rsyslog rules for haproxy'
if ($programname ==  'haproxy' and $syslogseverity-text == 'info')
then -/var/log/haproxy/haproxy-info.log
&~
if ($programname ==  'haproxy' and $syslogseverity-text == 'notice')
then -/var/log/haproxy/haproxy-notice.log
&~
[root@localhost haproxy-1.4.24]# systemctl restart rsyslog.service 	'//restart the log service'

Visit the Haproxy cluster test webpage and test the log information

'//before visiting the web page, check /var/log'
[root@localhost ~]# cd /var/log
[root@localhost log]# ls
No haproxy directory is found
'//after visiting the web page, check /var/log again'
[root@localhost log]# ls
The haproxy directory has now been generated and can be inspected
[root@localhost log]# cd haproxy/
[root@localhost haproxy]# ls
haproxy-info.log
[root@localhost haproxy]# cat haproxy-info.log 

Five, Haproxy parameter optimization

As the load on a corporate website grows, tuning Haproxy's parameters becomes very important.

• maxconn: the maximum number of connections; adjust it according to the actual situation of the application. 10240 is recommended

• daemon: daemon process mode. Haproxy can also be started in non-daemon mode, but starting it in daemon mode is recommended

• nbproc: the number of concurrent load-balancing processes; it is recommended to be equal to the number of CPU cores of the current server, or twice that number

• retries: the number of retries, mainly used for checking cluster nodes. If there are many nodes and concurrency is high, set it to 2 or 3

• option http-server-close: actively close http requests; it is recommended to use this option in a production environment

• timeout http-keep-alive: the long-connection (keep-alive) timeout; it can be set to 10s

• timeout http-request: the http request timeout; it is recommended to set it to 5~10s to speed up the release of http connections

• timeout client: the client timeout. If traffic is heavy and the nodes respond slowly, this time can be shortened; about 1min is recommended
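
Put together, the recommendations above might look like the fragment below. This is a sketch, not a drop-in configuration: it assumes a Haproxy version that accepts all of these keywords, it writes to a scratch file rather than the live configuration, and the function name write_tuned_fragment and the file path are hypothetical. nbproc is left out because its value depends on the CPU count of the server.

```shell
# write_tuned_fragment is a hypothetical helper that writes a sample
# tuning fragment to the path given as its first argument.
write_tuned_fragment() {
    cat > "$1" <<'EOF'
global
        maxconn 10240
        daemon

defaults
        mode http
        retries 3
        option http-server-close
        timeout http-keep-alive 10s
        timeout http-request 10s
        timeout client 1m
EOF
}

# Write the sample to a scratch location (an assumed path) and show it.
write_tuned_fragment "${TMPDIR:-/tmp}/haproxy-tuning.cfg"
cat "${TMPDIR:-/tmp}/haproxy-tuning.cfg"
```

The values are the ones suggested in the list; they would still need to be merged into /etc/haproxy/haproxy.cfg by hand and adjusted to the real workload.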

Origin blog.csdn.net/weixin_48191211/article/details/108776372