Building a load-balancing cluster with HAProxy and nginx

Table of contents

1. Common Web cluster schedulers

2. Introduction to HAProxy cluster

  1. The characteristics of Haproxy
  2. Scheduling algorithm commonly used by Haproxy
     ① Round Robin
     ② Minimum number of connections (Least Connections)
     ③ Source-based access scheduling algorithm (Source Hashing)
     ④ uri
     ⑤ url_param
     ⑥ rdp-cookie(name)
     ⑦ source
     ⑧ static-rr
  3. What is the difference between nginx, LVS and Haproxy

3. Experiment

  1. Haproxy builds a Web cluster
  2. Haproxy server deployment
  3. Haproxy server configuration
  4. Add haproxy system service
  5. Node server deployment
  6. Create a symbolic link (same command on both nodes)
  7. Start nginx
  8. Test Web Cluster
  9. Log definition (configured on the server where HAProxy is installed)


1. Common Web cluster schedulers

Web cluster schedulers come in software and hardware forms.

  • Software: LVS, HAProxy and Nginx are commonly used.

  • LVS has the best performance but is relatively complicated to set up. Nginx's upstream module supports clustering, but its health checking of cluster nodes is weak, and its high-concurrency performance is not as good as HAProxy's.

  • Hardware: F5 is the usual choice, and products such as Barracuda and NSFOCUS are also widely used.

2. Introduction to HAProxy cluster 

LVS handles heavy load well in enterprise use, but it has shortcomings.

LVS does not support regular-expression processing and cannot separate dynamic from static content. For large websites, LVS is complex to implement and configure, and its maintenance cost is relatively high.

HAProxy is software that provides high availability, load balancing, and proxying for TCP- and HTTP-based applications. It is suitable for heavily loaded web sites and can support tens of thousands of concurrent connections on current hardware.

1. The characteristics of Haproxy:

1. Reliability and stability are very good, comparable to hardware load balancers such as F5.

2. It can maintain 40,000-50,000 concurrent connections at the same time, process up to 20,000 requests per unit time, and reach a maximum throughput of 10 Gbit/s

3. Support up to 8 load balancing algorithms, and also support session persistence

4. Support unique functions such as connection rejection and fully transparent proxy

5. Support virtual host function, so as to achieve more flexible web load balancing

6. Strong ACL support for access control

7. Its elastic binary tree (ebtree) data structure keeps operations cheap (removal is O(1), insertion and lookup are O(log n)), so query speed degrades very little as the number of entries grows

8. Supports client keep-alive, which avoids the resources wasted on repeated three-way handshakes between the client and HAProxy and allows multiple requests to be completed over one TCP connection

9. Support TCP acceleration, zero copy function, similar to mmap mechanism

10. Support response buffering

11. Support RDP protocol

12. Source-based stickiness, similar to nginx's ip_hash: requests from the same client are always dispatched to the same upstream server within a certain period of time

13. A better statistics interface: its web interface shows statistics on data received, sent, rejected, in error and so on for each server in the back-end cluster

14. Detailed health-status checking: the web interface shows the health-check status of the upstream servers and provides some management functions

15. Traffic-based health assessment mechanism

16. HTTP-based authentication

17. Command-line management interface

18. A log analyzer for analyzing logs
 

2. Scheduling algorithm commonly used by Haproxy 

HAProxy supports a variety of scheduling algorithms; the following eight are the most commonly used.

① Round Robin

  • Round robin (RR) is the simplest and most commonly used algorithm: it hands incoming requests to the nodes in turn to balance the load
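As a rough illustration of the rotation described above, here is a minimal Python sketch. It is not HAProxy's implementation; the function name is hypothetical and the two addresses simply reuse the node IPs from the experiment later in this article.

```python
from itertools import cycle

# Hypothetical back-end list; the addresses reuse the node IPs from the experiment.
servers = ["192.168.77.27:80", "192.168.77.28:80"]

def round_robin(nodes):
    """Return an endless iterator that hands out nodes in a fixed rotating order."""
    return cycle(nodes)

picker = round_robin(servers)
# Each incoming request takes the next node in the rotation.
first_four = [next(picker) for _ in range(4)]
print(first_four)
```

With two nodes, consecutive requests simply alternate between them, which is the behaviour the browser test in the experiment relies on.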

② Minimum number of connections (Least Connections)

  • The least-connections algorithm dynamically assigns front-end requests according to the number of connections on each back-end node. It is a considerable improvement over round robin and is also widely used
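The selection rule can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions (no weights, connection counts kept in a plain dictionary); the node names and counts are made up.

```python
def pick_least_connections(active):
    """Return the node with the fewest active connections (ties go to the first listed)."""
    return min(active, key=active.get)

# Hypothetical snapshot of active connections per back-end node.
active = {"nodeA": 12, "nodeB": 7, "nodeC": 9}

chosen = pick_least_connections(active)
active[chosen] += 1  # the new request now counts against the chosen node
print(chosen, active)
```

Because the counts change with every request, the choice adapts to long-lived connections in a way a fixed rotation cannot.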

③ Source-based access scheduling algorithm (Source Hashing)

  • Used in scenarios where session state is recorded on the server side; cluster scheduling can be based on the source IP, a cookie, and so on.
  • Example: with three server nodes, suppose the first user is sent to node A and the second user to node B. If the first user goes offline and then visits again, they are still assigned to node A, and the second user is still assigned to node B. As long as the load balancer is not restarted, requests keep being assigned this way.
  • The advantage of this scheduling algorithm is session persistence. However, when some IP addresses generate very heavy traffic, the load becomes unbalanced: some nodes receive far more requests than others, which affects the service.
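The idea of pinning a client to a node by its source address can be sketched as follows. This is an assumption-laden toy: it maps the integer value of the IP onto the node list with a modulo, whereas HAProxy uses its own hash function. The node names and client addresses are hypothetical.

```python
import ipaddress

def pick_by_source(client_ip, nodes):
    """Map a client IP to a node so the same IP always lands on the same node."""
    ip_as_int = int(ipaddress.ip_address(client_ip))
    return nodes[ip_as_int % len(nodes)]

nodes = ["nodeA", "nodeB", "nodeC"]  # hypothetical three-node cluster

print(pick_by_source("10.0.0.1", nodes))
print(pick_by_source("10.0.0.2", nodes))
print(pick_by_source("10.0.0.1", nodes))  # same client, same node as the first call
```

The sketch also makes the drawback visible: one very busy source address always maps to a single node, so that node carries all of its traffic.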

④ uri

  • Schedules according to a hash of the requested URI, so the same URI always reaches the same server; used when building a CDN or cache tier

⑤ url_param

  • Schedules according to a named parameter in the request URL, so requests carrying the same parameter value go to the same server. (Locking each HTTP request by an HTTP request header is the job of the separate hdr(name) algorithm.)

⑥ rdp-cookie(name)

  • Locks and hashes each TCP request according to the RDP cookie called name

⑦ source

  • Schedules according to the source IP of the request, similar to Nginx's ip_hash mechanism

⑧ static-rr

  • Weighted round-robin allocation according to each server's weight
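To make the effect of weights concrete, here is a deliberately naive Python sketch of weighted round robin: each server is repeated in the rotation as many times as its weight. HAProxy interleaves servers more evenly than this, so treat it only as an illustration. The server names come from the configuration used in the experiment; the weights are hypothetical.

```python
from itertools import cycle, islice

def weighted_rotation(weights):
    """Return an endless rotation in which each server appears `weight` times per round."""
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return cycle(expanded)

# Hypothetical weights: inst1 should receive twice as many requests as inst2.
weights = {"inst1": 2, "inst2": 1}

sequence = list(islice(weighted_rotation(weights), 6))
print(sequence)
```

Over any whole number of rounds, each server's share of requests matches its share of the total weight.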

3. What is the difference between nginx, LVS and Haproxy

LVS

  • Soft load balancing implemented in the Linux operating system kernel
  • Provides only layer-4 IP load balancing, and its status monitoring is limited, but its overall load-balancing performance is the strongest

nginx

  • Soft load balancing based on third-party applications

  • Can implement 4-layer and 7-layer technology

  • It is mainly used for web servers or cache servers. Although the upstream module of nginx also supports cluster functions, it does not have a strong health check function for cluster nodes, and its performance is not as good as that of Haproxy.

Haproxy

  • Soft load balancing based on a third-party application
  • Provides a comprehensive load-balancing solution for TCP and HTTP applications
  • Its status monitoring is richer and more powerful, supporting checks by port, by URL, and by script

Summary:

The comparison covers three aspects: ① whether the balancer is based on the system kernel or on a third-party application, ② whether it works at layer 4 or layer 7, and ③ status monitoring.

  • LVS implements soft load balancing in the operating system kernel, while nginx and HAProxy are third-party applications.
  • LVS provides layer-4 IP load balancing only and is the lightest option at layer 4; nginx and HAProxy can both work at layer 4 and layer 7.
  • LVS has limited status monitoring. HAProxy has powerful status monitoring and supports checks by port, URL, and script. Nginx is mainly a web server or cache server; although its upstream module supports clustering, its node health checking is weak.

 3. Experiment:

1. Haproxy builds a Web cluster

Server            IP address
Haproxy server    192.168.77.26
nginx1 server     192.168.77.27
nginx2 server     192.168.77.28
client            accessed locally

2. Haproxy server deployment:

1. Turn off the firewall, and transfer the software package required to install Haproxy to the /opt directory

systemctl stop firewalld
setenforce 0

 2. Compile and install Haproxy

yum install -y pcre-devel bzip2-devel gcc gcc-c++ make    #install the build dependencies
cd /opt
tar zxvf haproxy-1.5.19.tar.gz    #unpack the source
cd haproxy-1.5.19/
#TARGET depends on the kernel version (check it with uname -r):
#use TARGET=linux26 for kernels such as 2.6.18-371.el5, and TARGET=linux2628 for kernel 2.6.28 or later
make TARGET=linux2628 ARCH=x86_64
make install    #install

 

3. Haproxy server configuration

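mkdir /etc/haproxy
cp examples/haproxy.cfg /etc/haproxy/

cd /etc/haproxy/
vim haproxy.cfg
global		#global settings: process-level parameters, usually tied to the operating system configuration
#--lines 4~5-- modify: define the HAProxy log output and log level; local0 is the syslog facility, stored in the system log by default
        log /dev/log   local0 info		#modify
        log /dev/log   local0 notice	#modify
        #log loghost    local0 info
        maxconn 4096			#maximum number of connections; mind the ulimit -n limit, 10240 is recommended
#--line 8-- comment out: chroot is the root directory the service confines itself to; this line usually needs to be commented out
        #chroot /usr/share/haproxy
        uid 99					#user UID
        gid 99					#user GID
        daemon					#run as a daemon
        nbproc 1				#add: number of processes; recommended to equal the number of CPU cores on the server, or twice that

defaults   	#default parameters, which are inherited by the listen, frontend and backend sections
        log     global			#use the log settings defined in global
        mode    http			#http mode (http for layer-7 proxying, tcp for layer-4 proxying)
        option  httplog			#log in HTTP log format
        option  dontlognull		#do not log connections that carry no data (such as health-check probes)
        retries 3				#number of times to retry a failed connection to a node server
        redispatch				#allow a session to be redispatched to another server when its connection fails
        maxconn 2000			#maximum number of connections; the value in "defaults" must not exceed the one in "global"
        #contimeout 5000        #connection timeout, in milliseconds by default
        #clitimeout 50000       #client timeout, in milliseconds by default
        #srvtimeout 50000       #server timeout, in milliseconds by default
        timeout http-request 10s 	#default HTTP request timeout
        timeout queue 1m   		#default queue timeout
        timeout connect 10s		#default connection timeout; replaces contimeout in newer versions, backward compatible
        timeout client 1m		#default client timeout; replaces clitimeout in newer versions, backward compatible
        timeout server 1m		#default server timeout; replaces srvtimeout in newer versions, backward compatible
        timeout http-keep-alive 10s		#default keep-alive timeout
        timeout check 10s		#health-check timeout


#--delete all listen sections below--, then add
listen  webcluster 0.0.0.0:80	#define an application named webcluster that listens on port 80
        option httpchk GET /test.html	#health check: request test.html from each server
        balance roundrobin				#use the roundrobin scheduling algorithm
        server inst1 192.168.77.27:80 check inter 2000 fall 3		#online node: check every 2000 ms, mark it down after 3 failed checks
        server inst2 192.168.77.28:80 check inter 2000 fall 3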
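The `check inter 2000 fall 3` options on the server lines mean that a node is probed every 2000 ms and marked down after three consecutive failed checks. The Python sketch below models only that counting logic, not HAProxy's code. The `rise=2` value is an assumption taken from HAProxy's documented default, since the configuration above does not set `rise`; the function name and the sample results are hypothetical.

```python
def track_health(results, fall=3, rise=2):
    """Replay a list of check results (True = check passed) and return the node state after each."""
    up = True      # a node starts out as available
    streak = 0     # consecutive results that contradict the current state
    states = []
    for ok in results:
        if ok == up:
            streak = 0  # result agrees with the current state, reset the counter
        else:
            streak += 1
            threshold = fall if up else rise
            if streak >= threshold:
                up = not up  # enough consecutive contradictions: flip the state
                streak = 0
        states.append("UP" if up else "DOWN")
    return states

# One check every 2000 ms: one success, three failures, then two successes.
history = track_health([True, False, False, False, True, True])
print(history)
```

With these settings a failed node is taken out of rotation roughly six seconds after its first failed check, and a single successful check is not enough to bring it back.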

 


4. Add haproxy system service

cp /opt/haproxy-1.5.19/examples/haproxy.init /etc/init.d/haproxy
chmod +x /etc/init.d/haproxy
chkconfig --add haproxy

ln -s /usr/local/sbin/haproxy /usr/sbin/haproxy
service haproxy start	#or: /etc/init.d/haproxy start

 

 5. Node server deployment:

Run the following on both node servers (192.168.77.27 and 192.168.77.28):

systemctl stop firewalld
setenforce 0

yum install -y pcre-devel zlib-devel gcc gcc-c++ make

useradd -M -s /sbin/nologin nginx

cd /opt
tar zxvf nginx-1.22.0.tar.gz -C /opt/

cd nginx-1.22.0/
./configure --prefix=/usr/local/nginx --user=nginx --group=nginx && make && make install

Then create a different test page on each node so that the two back ends can be told apart:

--192.168.77.27---
echo "this is kgc web" > /usr/local/nginx/html/test.html

--192.168.77.28---
echo "this is benet web" > /usr/local/nginx/html/test.html

 6. Create a symbolic link (same command on both nodes)

ln -s /usr/local/nginx/sbin/nginx /usr/local/sbin/

 7. Start nginx

nginx      #start the nginx service

 8. Test Web Cluster

Open http://192.168.77.26/test.html with a browser on the client, and constantly refresh the browser to test the load balancing effect

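After each refresh, the response should alternate between the two nodes' pages ("this is kgc web" and "this is benet web").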
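Instead of refreshing by hand, the same check can be scripted. The sketch below is one possible way to do it with Python's standard library, assuming the HAProxy address from the experiment table and that each request opens a new connection. The function names `fetch` and `tally` are hypothetical.

```python
from collections import Counter
from urllib.request import urlopen

URL = "http://192.168.77.26/test.html"  # HAProxy address from the experiment table

def fetch(url):
    """Fetch the page body as stripped text, opening a new connection for the call."""
    with urlopen(url, timeout=5) as response:
        return response.read().decode().strip()

def tally(url, requests=10, fetcher=fetch):
    """Request the page several times and count how often each distinct body comes back."""
    return Counter(fetcher(url) for _ in range(requests))
```

Calling `tally(URL)` on the client and printing the result should show both test pages with roughly equal counts if round-robin scheduling is working. The `fetcher` parameter exists only so the counting logic can be tried without a running cluster.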

 

 9. Log definition (configured on the server where HAProxy is installed)

vim /etc/haproxy/haproxy.cfg
global
	log /dev/log local0 info
	log /dev/log local0 notice

service haproxy restart   #restart the service

#The rsyslog configuration needs to be modified. For easier management, define the HAProxy-related settings separately in haproxy.conf and place it under /etc/rsyslog.d/; rsyslog automatically loads all configuration files in this directory when it starts.
vim /etc/rsyslog.d/haproxy.conf
if ($programname == 'haproxy' and $syslogseverity-text == 'info')
then -/var/log/haproxy/haproxy-info.log
&~
if ($programname == 'haproxy' and $syslogseverity-text == 'notice')
then -/var/log/haproxy/haproxy-notice.log
&~

#Note:
This configuration writes HAProxy's info logs to /var/log/haproxy/haproxy-info.log and its notice logs to /var/log/haproxy/haproxy-notice.log. "&~" means that once the message has been written to the log file, rsyslog stops processing it.

systemctl restart rsyslog.service

tail -f /var/log/haproxy/haproxy-info.log		#view HAProxy's access request logs
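The two rsyslog rules above amount to a small routing decision on the program name and severity of each message. The Python sketch below restates that decision for illustration only; it is not how rsyslog is implemented, and the function name `route` is hypothetical.

```python
def route(programname, severity):
    """Return the log file a message would be written to, or None if neither rule matches."""
    if programname == "haproxy" and severity == "info":
        return "/var/log/haproxy/haproxy-info.log"
    if programname == "haproxy" and severity == "notice":
        return "/var/log/haproxy/haproxy-notice.log"
    return None  # not handled here; any other rsyslog rules still apply

print(route("haproxy", "info"))
print(route("haproxy", "notice"))
print(route("sshd", "info"))
```

A matching message goes to exactly one file and is then discarded, which is why HAProxy entries written by these rules do not also show up in the general system log.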

 View the split logs:

 If an error is reported here, refresh 192.168.77.26/test.html from the client so that requests are generated and logged:


Origin blog.csdn.net/ZWH9991/article/details/132422461