Consul Service Discovery System Design


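Preface: As a system's business logic grows more complex, the number of services in it keeps increasing. Updating service addresses through configuration files becomes unreliable and is never real-time. That is why dedicated service discovery systems such as etcd, Consul, and ZooKeeper exist: services register themselves with the cluster, and clients look services up from the cluster (as ip:port:router or a URI). Clients then no longer need to track changes in the server-side nodes, and the service discovery system can also provide load balancing and seamless failover when a service goes down, which makes the services highly available.

Like etcd, Consul uses the Raft algorithm (a data consistency algorithm). Consul also uses a gossip algorithm for membership and failure detection (I will cover it in a separate article; it is also one of the data propagation algorithms used by libp2p and bitcoin).

Application examples: a MongoDB replica set plus Consul gives a highly available, high-performance NoSQL database; etcd + redis or Consul + filecoin gives a highly available, high-performance chain cluster.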
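
To make the register/lookup idea concrete, here is a minimal in-memory sketch in Go. It is not how Consul is implemented; the Registry type and its methods are hypothetical names used only for illustration, and there is no replication or health checking.

```go
package main

import (
	"fmt"
	"sync"
)

// Registry is a toy in-memory service registry (hypothetical, illustration only).
type Registry struct {
	mu       sync.RWMutex
	services map[string][]string // service name -> "ip:port" addresses
}

func NewRegistry() *Registry {
	return &Registry{services: make(map[string][]string)}
}

// Register adds an address for a service.
func (r *Registry) Register(name, addr string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.services[name] = append(r.services[name], addr)
}

// Deregister removes an address, for example after a node goes down.
func (r *Registry) Deregister(name, addr string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	kept := r.services[name][:0]
	for _, a := range r.services[name] {
		if a != addr {
			kept = append(kept, a)
		}
	}
	r.services[name] = kept
}

// Lookup returns a copy of the addresses currently registered for a service.
func (r *Registry) Lookup(name string) []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return append([]string(nil), r.services[name]...)
}

func main() {
	reg := NewRegistry()
	reg.Register("web", "192.168.1.100:80")
	reg.Register("web", "192.168.1.101:80")
	reg.Deregister("web", "192.168.1.100:80") // this node went down
	fmt.Println(reg.Lookup("web"))
}
```

The client only ever calls Lookup, so it does not need to know which nodes came or went; that is the property a real discovery system provides across machines.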

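Architecture diagram:

3 servers and 3 clients, with one web service deployed on each client.
Consul servers take part in Raft and persist the cluster data to disk; clients do not persist that data and forward requests to the servers. In day-to-day use (registering services, running checks, answering queries) the two kinds of agent behave the same.
[Image: architecture diagram]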

1. Environment preparation:

OS: Ubuntu 18.04

Nodes:

# Consul servers
192.168.1.47
192.168.1.48
192.168.1.49

# Consul clients and MongoDB
192.168.1.100
192.168.1.101
192.168.1.102

2. Install the services:

Run the following commands on all 6 nodes:

1. Add the Consul apt source:

On Ubuntu/Debian, add the apt repository that provides Consul:

curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo apt-key add -
sudo apt-add-repository "deb [arch=amd64] https://apt.releases.hashicorp.com $(lsb_release -cs) main"

Install Consul:

apt-get update 
apt-get install consul -y

2. Install MongoDB:

apt install mongodb-server -y

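3. Build the Consul cluster:

1. On the server and client nodes, run the following command (it creates the Consul data directory that the agents use for -data-dir and for their log file):

mkdir -p /data/consul0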

2. Start the server nodes:

On the consul server1 node (192.168.1.47), run:

nohup consul agent -bootstrap-expect 2 -server -data-dir /data/consul0 -node=server1 -bind=192.168.1.47 -config-dir /etc/consul.d -enable-script-checks=true -datacenter=dc1 > /data/consul0/consul.log 2>&1 &

On the consul server2 node (192.168.1.48), run:

nohup consul agent -server -data-dir /data/consul0 -node=server2 -bind=192.168.1.48 -config-dir /etc/consul.d -enable-script-checks=true -datacenter=dc1 -join 192.168.1.47 > /data/consul0/consul.log 2>&1 &

On the consul server3 node (192.168.1.49), run:

nohup consul agent -server -data-dir /data/consul0 -node=server3 -bind=192.168.1.49 -config-dir /etc/consul.d -enable-script-checks=true -datacenter=dc1 -join 192.168.1.47 > /data/consul0/consul.log 2>&1 &
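
Why three servers? Raft commits a write only after a majority (quorum) of servers has accepted it. The arithmetic is sketched below in Go; the function names are mine and exist only for illustration.

```go
package main

import "fmt"

// quorum returns how many servers must agree: a strict majority.
func quorum(servers int) int {
	return servers/2 + 1
}

// faultTolerance returns how many servers can fail while a quorum remains.
func faultTolerance(servers int) int {
	return servers - quorum(servers)
}

func main() {
	for _, n := range []int{1, 2, 3, 5} {
		fmt.Printf("servers=%d quorum=%d tolerated failures=%d\n",
			n, quorum(n), faultTolerance(n))
	}
}
```

With the 3 servers used here the quorum is 2, so the cluster keeps working if one server is lost; a 2-server cluster tolerates no failures at all.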

3. Start the client nodes:

Start the web service on the consul server3 node and on the 3 consul client nodes:

root@jacky-VirtualBox:~# cat /root/test/web/main.go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"sync/atomic"
)

// reqCount counts handled requests. Handlers run concurrently,
// so the counter is updated atomically.
var reqCount int64

// helloHandler answers every request with a greeting and the request number.
func helloHandler(w http.ResponseWriter, r *http.Request) {
	n := atomic.AddInt64(&reqCount, 1)
	str := fmt.Sprintf("Hello world ! friend(%d)", n)
	io.WriteString(w, str)
	fmt.Println(str)
}

func main() {
	http.HandleFunc("/hello", helloHandler)
	// Port 80 is the port registered with Consul below.
	if err := http.ListenAndServe(":80", nil); err != nil {
		log.Fatal("ListenAndServe: ", err)
	}
}
root@jacky-VirtualBox:~#
root@jacky-VirtualBox:~# cd /root/test/web/
root@jacky-VirtualBox:~# go build -o web main.go
root@jacky-VirtualBox:~# nohup /root/test/web/web > /tmp/web.log 2>&1 &
root@jacky-VirtualBox:~# 
root@jacky-VirtualBox:~# ps -aux | grep web
root        1804  0.1  0.1 1003376 5244 pts/0    Sl   11:01   0:00 /root/test/web/web
root        1812  0.0  0.0  17672   724 pts/0    S+   11:01   0:00 grep --color=auto web
root@jacky-VirtualBox:~# 
root@jacky-VirtualBox:~# lsof -i:80
COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
web     1804 root    3u  IPv6  31934      0t0  TCP *:http (LISTEN)
root@jacky-VirtualBox:~# 

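Register the web service with the Consul cluster (there are two ways, HTTP registration and configuration-file registration; the configuration file is recommended):

#mkdir /etc/consul.d/
# Every *.json file under this path describes a service. Consul parses the JSON files at startup, registers them as services, and runs their health checks (which can be thought of as heartbeats).

Edit the configuration file /etc/consul.d/web.json with vim: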

{
    "service":{
        "name":"web",
        "tags":[
            "rails"
        ],
        "port":80,
        "check":{
            "name":"ping",
            "args":["curl", "-s", "localhost:80"],
            "interval":"3s"
        }
    }
}
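
For comparison, the HTTP registration path mentioned above can be sketched in Go. This is only a sketch: it assumes the local agent's HTTP API is reachable at 127.0.0.1:8500 and uses the agent endpoint PUT /v1/agent/service/register. The struct and function names are my own, and the payload mirrors the web.json definition with the check command given as an argument list.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
)

// check and registration hold only the fields used in web.json.
type check struct {
	Name     string
	Args     []string
	Interval string
}

type registration struct {
	Name  string
	Tags  []string
	Port  int
	Check check
}

// webRegistration mirrors the "web" service from the configuration file.
func webRegistration() registration {
	return registration{
		Name: "web",
		Tags: []string{"rails"},
		Port: 80,
		Check: check{
			Name:     "ping",
			Args:     []string{"curl", "-s", "localhost:80"},
			Interval: "3s",
		},
	}
}

// payload returns the JSON body of the registration request.
func payload(reg registration) (string, error) {
	b, err := json.Marshal(reg)
	return string(b), err
}

// register sends the registration to the agent at agentURL (an assumption:
// the agent must be running and listening there).
func register(agentURL string, reg registration) error {
	body, err := payload(reg)
	if err != nil {
		return err
	}
	req, err := http.NewRequest(http.MethodPut,
		agentURL+"/v1/agent/service/register", strings.NewReader(body))
	if err != nil {
		return err
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status: %s", resp.Status)
	}
	return nil
}

func main() {
	if err := register("http://127.0.0.1:8500", webRegistration()); err != nil {
		fmt.Println("register failed:", err)
	}
}
```

A service registered this way lives in the agent's state rather than in /etc/consul.d, which is one reason the file-based approach is easier to keep under version control.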

On the consul client1 node (192.168.1.100), run:

nohup consul agent -data-dir /data/consul0 -node=client1 -bind=192.168.1.100 -config-dir /etc/consul.d -enable-script-checks=true -datacenter=dc1 -join 192.168.1.47 > /data/consul0/consul.log 2>&1 &

On the consul client2 node (192.168.1.101), run:

nohup consul agent -data-dir /data/consul0 -node=client2 -bind=192.168.1.101 -config-dir /etc/consul.d -enable-script-checks=true -datacenter=dc1 -join 192.168.1.47 > /data/consul0/consul.log 2>&1 &

On the consul client3 node (192.168.1.102), run:

nohup consul agent -data-dir /data/consul0 -node=client3 -bind=192.168.1.102 -config-dir /etc/consul.d -enable-script-checks=true -datacenter=dc1 -join 192.168.1.47 > /data/consul0/consul.log 2>&1 &

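Notes:

nohup: "no hang up". The command keeps running even after the current terminal has logged out.
&: runs the command in the background.

2>&1:

In the bash shell,

0 is standard input, usually the keyboard;

1 is standard output, usually the screen;

2 is standard error, which also goes to the screen by default.

"> /data/consul0/consul.log" redirects standard output (1) to the log file. On its own, that would leave standard error (2) pointing at the terminal, and the errors of a background process would be lost once the terminal is closed.

2>&1 therefore redirects standard error to wherever standard output currently points, which here is the log file, so both the normal output and the errors of the background process end up in consul.log.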
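
The same idea, sending both streams to one destination, can be sketched in Go with os/exec. The helper name runCombined is mine, and a buffer stands in for the log file so the result is easy to inspect.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
)

// runCombined runs a shell command with stdout and stderr written to the
// same buffer, which is what "> file 2>&1" does with a file.
func runCombined(command string) (string, error) {
	var buf bytes.Buffer
	cmd := exec.Command("sh", "-c", command)
	cmd.Stdout = &buf
	cmd.Stderr = &buf // same destination as stdout, like 2>&1
	err := cmd.Run()
	return buf.String(), err
}

func main() {
	out, err := runCombined("echo to-stdout; echo to-stderr 1>&2")
	if err != nil {
		fmt.Println("command failed:", err)
	}
	fmt.Print(out)
}
```

If cmd.Stderr were left unset, the second line would be discarded instead of captured, which mirrors what happens to a background process whose standard error is not redirected.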

View the log:

tail -f /data/consul0/consul.log

4. View Consul cluster information:

Run on the consul client3 node (192.168.1.102):

root@jacky-VirtualBox:~# consul info
agent:
	check_monitors = 1
	check_ttls = 0
	checks = 1
	services = 1
build:
	prerelease = 
	revision = 27de64da
	version = 1.10.0
consul:
	acl = disabled
	known_servers = 3
	server = false
runtime:
	arch = amd64
	cpu_count = 1
	goroutines = 43
	max_procs = 1
	os = linux
	version = go1.16.5
serf_lan:
	coordinate_resets = 0
	encrypted = false
	event_queue = 0
	event_time = 2
	failed = 0
	health_score = 0
	intent_queue = 0
	left = 0
	member_time = 6
	members = 4
	query_queue = 0
	query_time = 3
root@jacky-VirtualBox:~#
root@jacky-VirtualBox:~# consul members
Node     Address             Status  Type    Build   Protocol  DC   Segment
server1  192.168.1.47:8301   alive   server  1.10.0  2         dc1  <all>
server2  192.168.1.48:8301   alive   server  1.10.0  2         dc1  <all>
server3  192.168.1.49:8301   alive   server  1.10.0  2         dc1  <all>
client1  192.168.1.100:8301  alive   client  1.10.0  2         dc1  <default>
client2  192.168.1.101:8301  alive   client  1.10.0  2         dc1  <default>
client3  192.168.1.102:8301  alive   client  1.10.0  2         dc1  <default>
root@jacky-VirtualBox:~#
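
The member list is also available over the agent's HTTP API, which is handy for scripts. Below is a small Go sketch, assuming the agent listens on 127.0.0.1:8500 and that GET /v1/agent/members returns a JSON array whose entries carry Name, Addr, Port and a numeric Status (1 meaning alive in the underlying gossip library). Only those fields are decoded; the type and function names are mine.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// member holds the fields of interest from one entry of the members list.
type member struct {
	Name   string
	Addr   string
	Port   int
	Status int // assumed: 1 means alive
}

// parseMembers decodes the JSON array returned by the members endpoint.
func parseMembers(text string) ([]member, error) {
	var members []member
	err := json.Unmarshal([]byte(text), &members)
	return members, err
}

// aliveCount counts the members whose status is "alive".
func aliveCount(members []member) int {
	n := 0
	for _, m := range members {
		if m.Status == 1 {
			n++
		}
	}
	return n
}

func main() {
	resp, err := http.Get("http://127.0.0.1:8500/v1/agent/members")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	members, err := parseMembers(string(body))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("%d of %d members alive\n", aliveCount(members), len(members))
}
```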

5. View service information:

root@jacky-VirtualBox:~# dig @127.0.0.1 -p 8600 web.service.consul SRV

; <<>> DiG 9.16.1-Ubuntu <<>> @127.0.0.1 -p 8600 web.service.consul SRV
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46717
;; flags: qr aa rd; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 9
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;web.service.consul.		IN	SRV

;; ANSWER SECTION:
web.service.consul.	0	IN	SRV	1 1 80 client2.node.dc1.consul.
web.service.consul.	0	IN	SRV	1 1 80 client3.node.dc1.consul.
web.service.consul.	0	IN	SRV	1 1 80 client1.node.dc1.consul.
web.service.consul.	0	IN	SRV	1 1 80 server3.node.dc1.consul.

;; ADDITIONAL SECTION:
client2.node.dc1.consul.	0	IN	A	192.168.1.101
client2.node.dc1.consul.	0	IN	TXT	"consul-network-segment="
client3.node.dc1.consul.	0	IN	A	192.168.1.102
client3.node.dc1.consul.	0	IN	TXT	"consul-network-segment="
client1.node.dc1.consul.	0	IN	A	192.168.1.100
client1.node.dc1.consul.	0	IN	TXT	"consul-network-segment="
server3.node.dc1.consul.	0	IN	A	192.168.1.49
server3.node.dc1.consul.	0	IN	TXT	"consul-network-segment="

;; Query time: 3 msec
;; SERVER: 127.0.0.1#8600(127.0.0.1)
;; WHEN: 二 7月 1 21:6:41 CST 2021
;; MSG SIZE  rcvd: 411

root@jacky-VirtualBox:~#
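
A client program can run the same SRV query itself instead of shelling out to dig. The Go sketch below assumes the local agent serves DNS on 127.0.0.1:8600, as in the dig command above; consulResolver and endpoint are hypothetical helper names.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"strconv"
	"strings"
	"time"
)

// consulResolver returns a resolver that sends every DNS query to dnsAddr
// (assumed here to be the Consul agent's DNS port).
func consulResolver(dnsAddr string) *net.Resolver {
	return &net.Resolver{
		PreferGo: true,
		Dial: func(ctx context.Context, network, _ string) (net.Conn, error) {
			d := net.Dialer{Timeout: 2 * time.Second}
			return d.DialContext(ctx, network, dnsAddr)
		},
	}
}

// endpoint formats one SRV answer as "host:port" without the trailing dot.
func endpoint(target string, port uint16) string {
	host := strings.TrimSuffix(target, ".")
	return net.JoinHostPort(host, strconv.Itoa(int(port)))
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
	defer cancel()

	r := consulResolver("127.0.0.1:8600")
	// Empty service and proto make LookupSRV query the name as given.
	_, records, err := r.LookupSRV(ctx, "", "", "web.service.consul")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	for _, srv := range records {
		fmt.Println(endpoint(srv.Target, srv.Port))
	}
}
```

The targets are node names such as client1.node.dc1.consul, so a caller still has to resolve them through the same resolver before connecting.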

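6. Reloading Consul services:
Consul loads service definitions at startup and does not actively scan the files while running. So after adding or removing a service definition by hand, you need to send Consul an event that makes it reload the services (it rescans all JSON files under /etc/consul.d/ and updates the in-memory service registry to match them). Triggering the reload with an explicit manual event is a very sensible design.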

root@jacky-VirtualBox:~# mv /etc/consul.d/web.json /tmp
root@jacky-VirtualBox:~# consul reload
Configuration reload triggered
root@jacky-VirtualBox:~# tail -f /data/consul0/consul.log
2021-07-14T11:16:04.954+0800 [WARN]  agent: Check is now critical: check=service:web
2021-07-14T11:16:14.993+0800 [WARN]  agent: Check is now critical: check=service:web
2021-07-14T11:16:25.042+0800 [WARN]  agent: Check is now critical: check=service:web
2021-07-14T11:16:35.065+0800 [WARN]  agent: Check is now critical: check=service:web
2021-07-14T11:16:45.106+0800 [WARN]  agent: Check is now critical: check=service:web
2021-07-14T11:16:55.147+0800 [WARN]  agent: Check is now critical: check=service:web
2021-07-14T11:16:56.290+0800 [WARN]  agent.auto_config: skipping file /etc/consul.d/consul.env, extension must be .hcl or .json, or config format must be set
2021-07-14T11:16:56.290+0800 [WARN]  agent.auto_config: using enable-script-checks without ACLs and without allow_write_http_from is DANGEROUS, use enable-local-script-checks instead, see https://www.hashicorp.com/blog/protecting-consul-from-rce-risk-in-specific-configurations/
2021-07-14T11:16:56.290+0800 [WARN]  agent: DEPRECATED Backwards compatibility with pre-1.9 metrics enabled. These metrics will be removed in a future version of Consul. Set `telemetry { disable_compat_1.9 = true }` to disable them.
2021-07-14T11:16:56.301+0800 [INFO]  agent: Deregistered service: service=web
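
The reload event can also be sent from a program through the agent's HTTP API (PUT /v1/agent/reload). A minimal Go sketch, assuming the agent listens on 127.0.0.1:8500 and ACLs are disabled as in this setup; the function names are mine.

```go
package main

import (
	"fmt"
	"net/http"
)

// reloadRequest builds the HTTP request that asks an agent to reload its
// configuration files.
func reloadRequest(agentURL string) (*http.Request, error) {
	return http.NewRequest(http.MethodPut, agentURL+"/v1/agent/reload", nil)
}

// reload sends the request and reports a non-200 answer as an error.
func reload(agentURL string) error {
	req, err := reloadRequest(agentURL)
	if err != nil {
		return err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status: %s", resp.Status)
	}
	return nil
}

func main() {
	if err := reload("http://127.0.0.1:8500"); err != nil {
		fmt.Println("reload failed:", err)
		return
	}
	fmt.Println("Configuration reload triggered")
}
```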

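7. Start the UI service (in production it is advisable to leave the UI off and operate through the CLI):

The consul servers and clients above were started without the UI. Here we pick one node (for example server3) and restart it with the UI enabled:

First kill the consul process: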

root@jacky-VirtualBox:~# ps -aux | grep consul
root        1941  1.9  2.9 781836 74308 pts/0    Sl   12:40   0:21 consul agent -server -ui -data-dir /data/consul0 -node=server3  -bind=192.168.1.49 -config-dir /etc/consul.d -enable-script-checks=true -datacenter=dc1 -join 192.168.1.47
root        2221  0.0  0.0  17676   728 pts/0    S+   12:58   0:00 grep --color=auto consul
root@jacky-VirtualBox:~# 
root@jacky-VirtualBox:~# 
root@jacky-VirtualBox:~# 
root@jacky-VirtualBox:~# kill 1941
root@jacky-VirtualBox:~# 
[3]+  Exit 1                  nohup consul agent -server -ui -data-dir /data/consul0 -node=server3  -bind=192.168.1.49 -config-dir /etc/consul.d -enable-script-checks=true -datacenter=dc1 -join 192.168.1.47 > /data/consul0/consul.log 2>&1
root@jacky-VirtualBox:~#

Start consul server3 again:

nohup consul agent -server -ui -data-dir /data/consul0 -node=server3 -bind=192.168.1.49 -config-dir /etc/consul.d -enable-script-checks=true -datacenter=dc1 -join 192.168.1.47  > /data/consul0/consul.log 2>&1 &

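Entering http://192.168.1.49:8500/ui in a browser on Windows shows nothing. Hmm...
When you hit a problem, don't panic: take out your phone and post it to your WeChat Moments...
Checking the listening port of the UI service on server3 shows that it is listening on 127.0.0.1:8500: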

root@jacky-VirtualBox:~# lsof -i:8500
COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
consul  1941 root   22u  IPv4  35022      0t0  TCP localhost:8500 (LISTEN)
root@jacky-VirtualBox:~#
root@jacky-VirtualBox:~# curl http://localhost:8500/ui
<a href="/ui/">Moved Permanently</a>.

root@jacky-VirtualBox:~#

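Testing from the CLI reveals two things. First, Consul's UI (HTTP) service listens only for local requests by default. Second, the request is redirected (from /ui to /ui/), which is harmless.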
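
The first point is a general property of TCP sockets rather than something specific to Consul: a listener bound to 127.0.0.1 cannot be reached from other machines, while one bound to 0.0.0.0 accepts connections on every interface. A small Go sketch (loopbackOnly is a name I made up; port 0 asks the OS for any free port):

```go
package main

import (
	"fmt"
	"net"
)

// loopbackOnly opens a TCP listener on listenAddr and reports whether it is
// bound to a loopback address, i.e. reachable only from the local machine.
func loopbackOnly(listenAddr string) (bool, error) {
	ln, err := net.Listen("tcp", listenAddr)
	if err != nil {
		return false, err
	}
	defer ln.Close()
	tcpAddr, ok := ln.Addr().(*net.TCPAddr)
	if !ok {
		return false, fmt.Errorf("unexpected address type %T", ln.Addr())
	}
	return tcpAddr.IP.IsLoopback(), nil
}

func main() {
	for _, addr := range []string{"127.0.0.1:0", "0.0.0.0:0"} {
		only, err := loopbackOnly(addr)
		if err != nil {
			fmt.Println("listen failed:", err)
			continue
		}
		fmt.Printf("%-12s loopback only: %v\n", addr, only)
	}
}
```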

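There are two ways to solve this: 1. Deploy a TCP proxy on server3, for example listening on 0.0.0.0:9009, that forwards requests from other IPs for 192.168.1.49:9009/ui to 127.0.0.1:8500/ui. 2. Specify the listening address for the UI on server3: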

nohup consul agent -server -ui -client=0.0.0.0 -data-dir /data/consul0 -node=server3 -bind=192.168.1.49 -config-dir /etc/consul.d -enable-script-checks=true -datacenter=dc1 -join 192.168.1.47 > /data/consul0/consul.log 2>&1 &

Check the listening address of the port; it is now 0.0.0.0:8500:

root@jacky-VirtualBox:~# lsof -i:8500
COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
consul  2228 root   22u  IPv6  37271      0t0  TCP *:8500 (LISTEN)
root@jacky-VirtualBox:~#

The result matches the analysis above. Viewing the web UI from Windows:
[Image: Consul web UI]

Accessing the UI through a proxy:
Proxy source code:

package main

import (
	"fmt"
	"io"
	"net"
	"os"
	"strings"
	"sync"

	"github.com/urfave/cli/v2"
	"golang.org/x/sys/unix"
)

// tcpproxy forwards connections accepted on src to one of dsts,
// chosen round-robin.
type tcpproxy struct {
	lock sync.Mutex
	dsts []string
	src  string
}

var TcpProxy = &cli.Command{
	Name:      "tcpproxy",
	Usage:     "tcp port proxy",
	UsageText: "tcpproxy [--src=0.0.0.0:7777] [--dst=136.19.188.100:9999,136.19.188.110:9999,136.19.188.120:9999]",
	Flags: []cli.Flag{
		&cli.StringFlag{
			Name:   "src",
			Hidden: true,
		},
		&cli.StringFlag{
			Name:   "dst",
			Hidden: true,
		},
	},

	Action: func(cctx *cli.Context) error {
		src := cctx.String("src")
		dst := cctx.String("dst")
		if src == "" || dst == "" {
			fmt.Println(cctx.Command.UsageText)
			return nil
		}

		tp := &tcpproxy{
			src:  src,
			dsts: strings.Split(dst, ","),
		}
		tp.server()

		return nil
	},
}

func main() {
	local := []*cli.Command{
		TcpProxy,
	}

	app := &cli.App{
		Name:                 "proxy",
		Usage:                "proxy tcpproxy",
		Version:              "v0.0.1",
		EnableBashCompletion: true,
		Flags: []cli.Flag{
			&cli.StringFlag{
				Name:   "configfile",
				Hidden: true,
				Value:  "cfg.toml",
			},
		},
		Commands: local,
	}

	if err := app.Run(os.Args); err != nil {
		fmt.Fprintf(os.Stderr, "ERROR: %s\n\n", err) // nolint:errcheck
		os.Exit(1)
	}
}

// unixSetLimit sets the soft and hard limits on open file descriptors.
func unixSetLimit(soft uint64, max uint64) error {
	rlimit := unix.Rlimit{
		Cur: soft,
		Max: max,
	}
	return unix.Setrlimit(unix.RLIMIT_NOFILE, &rlimit)
}

// server accepts client connections on p.src and proxies each one
// in its own goroutine.
func (p *tcpproxy) server() {
	if err := unixSetLimit(60000, 60000); err != nil {
		fmt.Println("failed to raise the open-file limit:", err)
	}
	listen, err := net.Listen("tcp", p.src)
	if err != nil {
		fmt.Println(err)
		return
	}
	defer listen.Close()
	fmt.Println("listen at:", p.src)
	for {
		conn, err := listen.Accept()
		if err != nil {
			fmt.Printf("failed to accept client connection: %v\n", err)
			continue
		}
		fmt.Println("build new proxy connect. ", "client address =", conn.RemoteAddr(), " local server address=", conn.LocalAddr())
		go p.handle(conn)
	}
}

// handle copies data in both directions between the client and the
// selected destination until either side finishes.
func (p *tcpproxy) handle(sconn net.Conn) {
	defer sconn.Close()
	dst, ok := p.select_dst()
	if !ok {
		return
	}
	dconn, err := net.Dial("tcp", dst)
	if err != nil {
		fmt.Printf("failed to connect to %v: %v\n", dst, err)
		return
	}
	defer dconn.Close()

	// Buffered for both copiers so neither blocks on send after handle returns.
	exitChan := make(chan struct{}, 2)

	// Forward client data to the destination server.
	go func() {
		if _, err := io.Copy(dconn, sconn); err != nil {
			fmt.Printf("failed to send data to %v: %v\n", dst, err)
		}
		exitChan <- struct{}{}
	}()

	// Return data from the destination server to the client.
	go func() {
		if _, err := io.Copy(sconn, dconn); err != nil {
			fmt.Printf("failed to receive data from %v: %v\n", dst, err)
		}
		exitChan <- struct{}{}
	}()

	// Wait until one direction ends (EOF or error). The deferred Close
	// calls then unblock the other copier, so nothing is leaked.
	<-exitChan
}

// select_dst picks the next destination in round-robin order.
func (p *tcpproxy) select_dst() (string, bool) {
	p.lock.Lock()
	defer p.lock.Unlock()

	if len(p.dsts) < 1 {
		fmt.Println("failed select_dst()")
		return "", false
	}
	dst := p.dsts[0]
	p.dsts = append(p.dsts[1:], dst)
	return dst, true
}

root@jacky-VirtualBox:~/test/proxy# tree
.
├── go.mod
├── go.sum
├── main.go
└── proxy

0 directories, 4 files
root@jacky-VirtualBox:~/test/proxy#
root@jacky-VirtualBox:~/test/proxy# go mod tidy
go: finding module for package github.com/urfave/cli/v2
go: downloading github.com/urfave/cli/v2 v2.3.0
go: downloading github.com/urfave/cli v1.22.5
go: finding module for package golang.org/x/sys/unix
go: downloading golang.org/x/sys v0.0.0-20210630005230-0f9fa26af87c
go: found github.com/urfave/cli/v2 in github.com/urfave/cli/v2 v2.3.0
go: found golang.org/x/sys/unix in golang.org/x/sys v0.0.0-20210630005230-0f9fa26af87c
go: downloading github.com/cpuguy83/go-md2man/v2 v2.0.0-20190314233015-f79a8a8ca69d
root@jacky-VirtualBox:~/test/proxy# go build
root@jacky-VirtualBox:~/test/proxy# ll
total 5176
drwxr-xr-x 2 root root    4096 7月  14 12:32 ./
drwxr-xr-x 9 root root    4096 7月  14 12:19 ../
-rw-r--r-- 1 root root     121 7月  14 12:32 go.mod
-rw-r--r-- 1 root root    1454 7月  14 12:32 go.sum
-rw-r--r-- 1 root root    3066 7月  14 12:20 main.go
-rwxr-xr-x 1 root root 5278425 7月  14 12:32 proxy*
root@jacky-VirtualBox:~/test/proxy# 
root@jacky-VirtualBox:~/test/proxy# nohup ./proxy tcpproxy --src=0.0.0.0:9009 --dst=localhost:8500 > /tmp/proxy.log 2>&1 &
root@jacky-VirtualBox:~/test/proxy# tail -f /tmp/proxy.log
listen at: 0.0.0.0:9009
build new proxy connect.  client address = 192.168.1.30:63053  local server address= 192.168.1.49:9009
build new proxy connect.  client address = 192.168.1.30:52724  local server address= 192.168.1.49:9009
build new proxy connect.  client address = 192.168.1.30:61155  local server address= 192.168.1.49:9009
build new proxy connect.  client address = 192.168.1.30:62671  local server address= 192.168.1.49:9009
build new proxy connect.  client address = 192.168.1.30:64189  local server address= 192.168.1.49:9009
build new proxy connect.  client address = 192.168.1.30:57534  local server address= 192.168.1.49:9009

Visit the web UI through the proxy port:
[Image: Consul web UI accessed through the proxy port]
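
Since the UI is plain HTTP, an HTTP-level reverse proxy from the Go standard library is a possible alternative to the raw TCP proxy. This is a different technique from the one used above, shown only as a sketch: the function name newUIProxy is mine, and the addresses simply reuse 127.0.0.1:8500 and port 9009 from this article.

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
)

// newUIProxy returns a handler that forwards every request to target.
func newUIProxy(target string) (http.Handler, error) {
	u, err := url.Parse(target)
	if err != nil {
		return nil, err
	}
	return httputil.NewSingleHostReverseProxy(u), nil
}

func main() {
	proxy, err := newUIProxy("http://127.0.0.1:8500")
	if err != nil {
		log.Fatal(err)
	}
	// Listen on all interfaces so other hosts can open http://<server3>:9009/ui/.
	log.Fatal(http.ListenAndServe("0.0.0.0:9009", proxy))
}
```

Compared with the TCP proxy, this version understands HTTP, so it could later add access logging or authentication in front of the UI, but it cannot forward non-HTTP traffic.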

8. MongoDB high-availability cluster design


Reposted from blog.csdn.net/jacky128256/article/details/118713265