Load balancers play a very important role in web architecture. They distribute traffic across multiple backends and improve the scalability of a service. Because several services sit behind the load balancer, when one of them fails the load balancer can quickly pick another available one, which improves overall availability.

How the load balancer works

The load balancer can use several strategies to distribute the traffic load across the backend services.

Round Robin - distributes the traffic load evenly, assuming all backends have the same processing capacity;
Weighted Round Robin - weights each backend according to its processing capacity;
Least Connections - gives priority to the backend with the fewest active connections.
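
To make the difference between plain and weighted round robin concrete, here is a minimal sketch of weighted selection. The weightedBackend type, the pickWeighted function and the weights are all hypothetical and exist only for this illustration; the load balancer built in this article does not use them.

```go
package main

import "fmt"

// weightedBackend is a hypothetical type used only in this sketch.
type weightedBackend struct {
	Name   string
	Weight int // relative processing capacity, assumed to be > 0
}

// pickWeighted maps a request counter onto a backend so that, over one full
// cycle of totalWeight requests, each backend is chosen Weight times.
// It assumes a non-empty slice with positive weights.
func pickWeighted(backends []weightedBackend, counter int) string {
	total := 0
	for _, b := range backends {
		total += b.Weight
	}
	slot := counter % total
	for _, b := range backends {
		if slot < b.Weight {
			return b.Name
		}
		slot -= b.Weight
	}
	return "" // not reached when all weights are positive
}

func main() {
	backends := []weightedBackend{{"a", 3}, {"b", 1}}
	for i := 0; i < 8; i++ {
		fmt.Print(pickWeighted(backends, i), " ")
	}
	fmt.Println() // backend "a" is picked three times for every pick of "b"
}
```

With equal weights this reduces to plain round robin, which is the strategy used in the rest of the article.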

I intend to implement the simplest strategy, round robin.

First, round-robin selection

The round-robin principle is simple: the backends take turns, so each gets an equal chance to handle requests.

Round robin is a continuous cycle through the backends, but we cannot use it as is.

What if one of the backends fails? We certainly do not want to send traffic to it. We should only route traffic to backends that are up and running.
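
The plain cycle, before any failure handling is added, can be sketched as follows. The nextRoundRobin helper and the backend names are hypothetical and only serve this illustration.

```go
package main

import "fmt"

// nextRoundRobin returns the index that follows current in a cycle of n items.
func nextRoundRobin(current, n int) int {
	return (current + 1) % n
}

func main() {
	backends := []string{"backend-0", "backend-1", "backend-2"} // hypothetical names
	idx := 0
	for i := 0; i < 5; i++ {
		fmt.Println(backends[idx]) // this plain cycle knows nothing about failed backends
		idx = nextRoundRobin(idx, len(backends))
	}
}
```

This version keeps handing requests to a backend even when it is down, which is exactly the gap the following sections close.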

Second, defining the structures

We need to know the state of every backend, such as whether it is dead or alive, and we also need to keep track of its URL.

We can define a struct to hold the information about a backend.

type Backend struct {
  URL          *url.URL
  Alive        bool
  mux          sync.RWMutex
  ReverseProxy *httputil.ReverseProxy
}

We also need a way to keep track of all the backends, plus a counter variable.

type ServerPool struct {
  backends []*Backend
  current  uint64
}

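Third, using ReverseProxy

As mentioned earlier, the job of a load balancer is to distribute traffic across the backend servers and return the results to the client.

According to the Go documentation, ReverseProxy is an HTTP handler that takes an incoming request and sends it to another server, proxying the response back to the client.

That is exactly what we want, so there is no need to reinvent the wheel. We can simply use ReverseProxy to relay the original request.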

u, _ := url.Parse("http://localhost:8080")
rp := httputil.NewSingleHostReverseProxy(u)

// Initialize the server and register this as its handler
http.HandlerFunc(rp.ServeHTTP)

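We call httputil.NewSingleHostReverseProxy(url) to initialize a reverse proxy that relays requests to the given url. In the example above, every request is relayed to localhost:8080 and the result is sent back to the original client.

If we look at the signature of the ServeHTTP method, we see that it matches the signature of an HTTP handler, func(http.ResponseWriter, *http.Request), which is why we can pass it to http.HandlerFunc.

In our case, we can initialize the ReverseProxy with the URL stored in the Backend, so that the reverse proxy routes requests to that URL.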
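
To see the relay in action without starting separate processes, the following sketch puts a reverse proxy in front of a throwaway backend created with net/http/httptest. The proxyOnce helper and the response text are assumptions made for this illustration only.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
)

// proxyOnce starts a throwaway backend, puts a reverse proxy in front of it,
// sends one request through the proxy and returns the response body.
func proxyOnce() (string, error) {
	// Stand-in backend for this sketch; a real backend would be a separate process.
	backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "hello from backend")
	}))
	defer backend.Close()

	u, err := url.Parse(backend.URL)
	if err != nil {
		return "", err
	}
	rp := httputil.NewSingleHostReverseProxy(u)

	// *httputil.ReverseProxy is an http.Handler, so it can serve as the front end.
	front := httptest.NewServer(rp)
	defer front.Close()

	resp, err := http.Get(front.URL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	body, err := proxyOnce()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(body) // the client talks to the proxy but sees the backend's answer
}
```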

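Fourth, the selection process

When picking the next server we need to skip the ones that are down, but either way we need a counter.

Because many clients connect to the load balancer at the same time, race conditions are inevitable unless we guard against them. One way to prevent them is to lock the ServerPool with a mutex. But that would hurt performance, and besides, we do not really want to lock the ServerPool; we only want to update the counter.

The ideal solution is an atomic operation, which Go's atomic package supports well.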

func (s *ServerPool) NextIndex() int {
  return int(atomic.AddUint64(&s.current, uint64(1)) % uint64(len(s.backends)))
}

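We atomically increment current and take it modulo the length of the slice to get the index. The return value therefore always lies between 0 and the slice length minus one; after all, what we want is an index, not the total count.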
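
The following sketch checks that claim under concurrency. The counterPool type and the countPicks helper are hypothetical stand-ins for ServerPool that keep only the counter; several goroutines call NextIndex at once and the picks are tallied per index.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// counterPool is a stripped-down stand-in for ServerPool, used only in this sketch.
type counterPool struct {
	size    int
	current uint64
}

// NextIndex atomically increments the counter and maps it onto [0, size).
func (p *counterPool) NextIndex() int {
	return int(atomic.AddUint64(&p.current, 1) % uint64(p.size))
}

// countPicks calls NextIndex from several goroutines and tallies how often
// each index was returned.
func countPicks(size, goroutines, perGoroutine int) []int64 {
	p := &counterPool{size: size}
	hits := make([]int64, size)
	var wg sync.WaitGroup
	for g := 0; g < goroutines; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perGoroutine; i++ {
				atomic.AddInt64(&hits[p.NextIndex()], 1)
			}
		}()
	}
	wg.Wait()
	return hits
}

func main() {
	// 10 goroutines x 300 calls = 3000 picks spread over 3 indexes.
	fmt.Println(countPicks(3, 10, 300)) // prints [1000 1000 1000]
}
```

Because every call receives a distinct counter value, the picks are spread perfectly evenly no matter how the goroutines interleave. Note that the very first call returns index 1 rather than 0, since AddUint64 returns the value after the increment.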

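Fifth, picking an available backend

We need to route requests to each backend in turn, but skip the ones that are down.

NextIndex() always returns a value between 0 and the slice length minus one. If the server at that index is unavailable, we have to walk through the slice once.

[Figure: traversing the slice once]

As the figure illustrates, we walk the whole list starting from next. When computing the index we must make sure it stays within the bounds of the slice, which the modulo operation guarantees.

Once we find an available server, we mark it as the current one.

The code for this looks as follows.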

// GetNextPeer returns the next available server
func (s *ServerPool) GetNextPeer() *Backend {
  // Loop over the backends to find one that is alive
  next := s.NextIndex()
  l := len(s.backends) + next // start from next and go one full cycle
  for i := next; i < l; i++ {
    idx := i % len(s.backends) // take the index modulo the slice length
    // If this backend is alive, use it; if it is not the original candidate, store its index
    if s.backends[idx].IsAlive() {
      if i != next {
        atomic.StoreUint64(&s.current, uint64(idx)) // mark it as the current backend
      }
      return s.backends[idx]
    }
  }
  return nil
}
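
To watch the skipping behaviour, the sketch below runs the same selection logic against three backends, one of which is marked as down. The Name field and the demoPool helper are hypothetical; Name stands in for the URL so that the example needs no network.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Backend is a reduced version of the article's struct; Name replaces URL here.
type Backend struct {
	Name  string
	Alive bool
	mux   sync.RWMutex
}

func (b *Backend) IsAlive() bool {
	b.mux.RLock()
	defer b.mux.RUnlock()
	return b.Alive
}

type ServerPool struct {
	backends []*Backend
	current  uint64
}

func (s *ServerPool) NextIndex() int {
	return int(atomic.AddUint64(&s.current, 1) % uint64(len(s.backends)))
}

// GetNextPeer walks one full cycle starting at the next index and returns
// the first backend that is alive, or nil if none is.
func (s *ServerPool) GetNextPeer() *Backend {
	next := s.NextIndex()
	l := len(s.backends) + next
	for i := next; i < l; i++ {
		idx := i % len(s.backends)
		if s.backends[idx].IsAlive() {
			if i != next {
				atomic.StoreUint64(&s.current, uint64(idx))
			}
			return s.backends[idx]
		}
	}
	return nil
}

// demoPool builds a pool in which the middle backend is down.
func demoPool() *ServerPool {
	return &ServerPool{backends: []*Backend{
		{Name: "b0", Alive: true},
		{Name: "b1", Alive: false}, // simulated outage
		{Name: "b2", Alive: true},
	}}
}

func main() {
	pool := demoPool()
	for i := 0; i < 4; i++ {
		fmt.Print(pool.GetNextPeer().Name, " ")
	}
	fmt.Println() // prints: b2 b0 b2 b0
}
```

The dead backend b1 is never returned, and the stored index makes the following request continue from the backend that was actually used.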

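Sixth, avoiding race conditions

We also need to consider cases where different goroutines access a field of the Backend struct at the same time.

We know that far more goroutines read this field than modify it, so we use an RWMutex to synchronize access to Alive.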

// SetAlive sets the alive status of this backend
func (b *Backend) SetAlive(alive bool) {
  b.mux.Lock()
  b.Alive = alive
  b.mux.Unlock()
}

// IsAlive returns true if the backend is alive
func (b *Backend) IsAlive() (alive bool) {
  b.mux.RLock()
  alive = b.Alive
  b.mux.RUnlock()
  return
}
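
As a small usage sketch of this pattern, the program below reads the flag from many goroutines while one goroutine flips it. The hammer helper is hypothetical, and the accessors use defer for unlocking, which is a stylistic variation on the code above, not a change in behaviour.

```go
package main

import (
	"fmt"
	"sync"
)

// Backend is reduced to the fields needed for this sketch.
type Backend struct {
	Alive bool
	mux   sync.RWMutex
}

func (b *Backend) SetAlive(alive bool) {
	b.mux.Lock()
	defer b.mux.Unlock()
	b.Alive = alive
}

func (b *Backend) IsAlive() bool {
	b.mux.RLock()
	defer b.mux.RUnlock()
	return b.Alive
}

// hammer reads the flag from many goroutines while one goroutine flips it,
// then returns the final value.
func hammer(readers int) bool {
	b := &Backend{Alive: true}
	var wg sync.WaitGroup
	for i := 0; i < readers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = b.IsAlive() // any number of readers may hold the read lock at once
		}()
	}
	wg.Add(1)
	go func() {
		defer wg.Done()
		b.SetAlive(false) // the writer gets exclusive access
	}()
	wg.Wait()
	return b.IsAlive()
}

func main() {
	fmt.Println(hammer(100)) // prints false
}
```

Running it with go run -race should report no data race, whereas reading and writing Alive directly from several goroutines would.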

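Seventh, load balancing the requests

With all of the above in place, we can load balance requests with the simple function below. It only fails when every backend is down.

// lb load balances incoming requests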
func lb(w http.ResponseWriter, r *http.Request) {
  peer := serverPool.GetNextPeer()
  if peer != nil {
    peer.ReverseProxy.ServeHTTP(w, r)
    return
  }
  http.Error(w, "Service not available", http.StatusServiceUnavailable)
}

This function can be passed to the HTTP server as a HandlerFunc.

server := http.Server{
  Addr:    fmt.Sprintf(":%d", port),
  Handler: http.HandlerFunc(lb),
}

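Eighth, routing traffic only to healthy servers

The current lb function has a serious problem: we do not know whether a backend is actually healthy. To find out, we have to try sending requests to it and check the result.

There are two ways to do this:

Active: while handling the current request, if the selected backend turns out to be unresponsive, mark it as down.
Passive: ping the backends at fixed intervals to check their status.

Ninth, active mode

When an error occurs, ReverseProxy invokes its ErrorHandler callback, which we can use to detect failures.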

proxy.ErrorHandler = func(writer http.ResponseWriter, request *http.Request, e error) {
  log.Printf("[%s] %s\n", serverURL.Host, e.Error())
  retries := GetRetryFromContext(request)
  if retries < 3 {
    // Wait briefly, then retry the same backend with an incremented retry count
    select {
    case <-time.After(10 * time.Millisecond):
      ctx := context.WithValue(request.Context(), Retry, retries+1)
      proxy.ServeHTTP(writer, request.WithContext(ctx))
    }
    return
  }

  // After three retries, mark this backend as down
  serverPool.MarkBackendStatus(serverURL, false)

  // The same request is now tried on a different backend, so increment the attempt count
  attempts := GetAttemptsFromContext(request)
  log.Printf("%s(%s) Attempting retry %d\n", request.RemoteAddr, request.URL.Path, attempts)
  ctx := context.WithValue(request.Context(), Attempts, attempts+1)
  lb(writer, request.WithContext(ctx))
}

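We implement the error handler as a closure, which lets it capture outer variables such as the server URL. It checks the retry count and, if it is below 3, sends the same request to the same backend again. We retry because the server may be hitting a temporary error (for example, it may have run out of sockets to accept requests) and be able to handle requests again after a short delay. We use a timer to set the retry interval to about 10 milliseconds.

Once the retries are exhausted, we mark the backend as down.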

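Next, we need to find a new available backend. We use the context to keep track of the attempt count. After incrementing it, we pass the request back to lb, which picks a new backend to handle it.

We cannot do this without limit, though, so before processing the request any further we check whether the maximum number of attempts has been reached.

We read the attempt count from the request and, if it has reached the maximum, terminate the request.

// lb load balances incoming requests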
func lb(w http.ResponseWriter, r *http.Request) {
  attempts := GetAttemptsFromContext(r)
  if attempts > 3 {
    log.Printf("%s(%s) Max attempts reached, terminating\n", r.RemoteAddr, r.URL.Path)
    http.Error(w, "Service not available", http.StatusServiceUnavailable)
    return
  }

  peer := serverPool.GetNextPeer()
  if peer != nil {
    peer.ReverseProxy.ServeHTTP(w, r)
    return
  }
  http.Error(w, "Service not available", http.StatusServiceUnavailable)
}

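Tenth, using context

We can use the context to store useful data in an HTTP request, and we use it here to track the retry and attempt counts.

First we need to define keys for the context. Non-colliding integer values are a better choice for keys than strings. Go provides the iota keyword for declaring incrementing constants, each with a unique value, which makes it a convenient way to define integer keys. Note that the context package documentation goes one step further and recommends defining a dedicated key type instead of using a built-in type such as int, to avoid collisions between packages.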

const (
  Attempts int = iota
  Retry
)

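Then we can retrieve the value much like reading from a HashMap. The default return value depends on the use case.

// GetRetryFromContext returns the retry count stored in the request context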
func GetRetryFromContext(r *http.Request) int {
  if retry, ok := r.Context().Value(Retry).(int); ok {
    return retry
  }
  return 0
}

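Eleventh, passive mode

Passive mode means pinging the backends at regular intervals to check their status.

We ping a backend by establishing a TCP connection to it. If the backend responds in time, we consider it alive. If you prefer, you can call a specific endpoint such as /status instead. Remember to close the connection once the check is done so that it does not put extra load on the server; otherwise the server keeps the connection open and eventually runs out of resources.

// isBackendAlive checks whether a backend is alive by establishing a TCP connection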
func isBackendAlive(u *url.URL) bool {
  timeout := 2 * time.Second
  conn, err := net.DialTimeout("tcp", u.Host, timeout)
  if err != nil {
    log.Println("Site unreachable, error: ", err)
    return false
  }
  _ = conn.Close() // we do not need to keep the connection, so close it
  return true
}

Now we can iterate over the servers and mark their status.

// HealthCheck pings the backends and updates their status
func (s *ServerPool) HealthCheck() {
  for _, b := range s.backends {
    status := "up"
    alive := isBackendAlive(b.URL)
    b.SetAlive(alive)
    if !alive {
      status = "down"
    }
    log.Printf("%s [%s]\n", b.URL, status)
  }
}

We can start a timer to run the pings periodically.

// healthCheck runs a loop that checks the status of the backends every 20 seconds
func healthCheck() {
  t := time.NewTicker(time.Second * 20)
  for {
    select {
    case <-t.C:
      log.Println("Starting health check...")
      serverPool.HealthCheck()
      log.Println("Health check completed")
    }
  }
}

In the example above, <-t.C delivers a value every 20 seconds, and select picks up that event. Without a default case, select blocks until one of its cases can proceed.

Finally, run it in a separate goroutine.

go healthCheck()

Twelfth, testing

Load balancer code

package main

import (
    "context"
    "flag"
    "fmt"
    "log"
    "net"
    "net/http"
    "net/http/httputil"
    "net/url"
    "strings"
    "sync"
    "sync/atomic"
    "time"
)

// Context keys for the attempt and retry counters
const (
    Attempts int = iota
    Retry
)

// Backend holds the data about one backend server
type Backend struct {
    URL          *url.URL
    Alive        bool
    mux          sync.RWMutex
    ReverseProxy *httputil.ReverseProxy
}

// ServerPool keeps track of all backends and holds the round-robin counter
type ServerPool struct {
    backends []*Backend
    current  uint64
}

// SetAlive sets the alive status of this backend
func (b *Backend) SetAlive(alive bool) {
    b.mux.Lock()
    b.Alive = alive
    b.mux.Unlock()
}

// IsAlive returns true if the backend is alive
func (b *Backend) IsAlive() (alive bool) {
    b.mux.RLock()
    alive = b.Alive
    b.mux.RUnlock()
    return
}

// lb load balances incoming requests
func lb(w http.ResponseWriter, r *http.Request) {
    // If the maximum number of attempts has been reached, terminate the request
    attempts := GetAttemptsFromContext(r)
    if attempts > 3 {
        log.Printf("%s(%s) Max attempts reached, terminating\n", r.RemoteAddr, r.URL.Path)
        http.Error(w, "Service not available", http.StatusServiceUnavailable)
        return
    }

    peer := serverPool.GetNextPeer()
    if peer != nil {
        peer.ReverseProxy.ServeHTTP(w, r)
        return
    }
    http.Error(w, "Service not available", http.StatusServiceUnavailable)
}

// NextIndex atomically increments the counter and returns an index.
// Taking the counter modulo the slice length keeps the result within
// the bounds of the slice: we want an index, not the total count.
func (s *ServerPool) NextIndex() int {
    return int(atomic.AddUint64(&s.current, uint64(1)) % uint64(len(s.backends)))
}

// GetNextPeer returns the next alive backend to take a connection.
// The backend that is found is stored as the current one.
func (s *ServerPool) GetNextPeer() *Backend {
    // Loop over the backends to find one that is alive
    next := s.NextIndex()
    l := len(s.backends) + next // start from next and go one full cycle
    for i := next; i < l; i++ {
        idx := i % len(s.backends) // take the index modulo the slice length
        // If this backend is alive, use it; if it is not the original candidate, store its index
        if s.backends[idx].IsAlive() {
            if i != next {
                atomic.StoreUint64(&s.current, uint64(idx))
            }
            return s.backends[idx]
        }
    }
    return nil
}

// GetRetryFromContext returns the retry count stored in the request context
func GetRetryFromContext(r *http.Request) int {
    if retry, ok := r.Context().Value(Retry).(int); ok {
        return retry
    }
    return 0
}

// healthCheck runs a loop that checks the status of the backends every 20 seconds
func healthCheck() {
    t := time.NewTicker(time.Second * 20)
    for {
        select {
        case <-t.C:
            log.Println("Starting health check...")
            serverPool.HealthCheck()
            log.Println("Health check completed")
        }
    }
}

// HealthCheck pings the backends and updates their status
func (s *ServerPool) HealthCheck() {
    for _, b := range s.backends {
        status := "up"
        alive := isBackendAlive(b.URL)
        b.SetAlive(alive)
        if !alive {
            status = "down"
        }
        log.Printf("%s [%s]\n", b.URL, status)
    }
}

// isBackendAlive checks whether a backend is alive by establishing a TCP connection
func isBackendAlive(u *url.URL) bool {
    timeout := 2 * time.Second
    conn, err := net.DialTimeout("tcp", u.Host, timeout)
    if err != nil {
        log.Println("Site unreachable, error: ", err)
        return false
    }
    _ = conn.Close() // we do not need to keep the connection, so close it
    return true
}

// GetAttemptsFromContext returns the attempt count stored in the request context
func GetAttemptsFromContext(r *http.Request) int {
    if attempts, ok := r.Context().Value(Attempts).(int); ok {
        return attempts
    }
    return 1
}

// AddBackend adds a backend to the server pool
func (s *ServerPool) AddBackend(backend *Backend) {
    s.backends = append(s.backends, backend)
}

// MarkBackendStatus changes the status of the backend with the given URL
func (s *ServerPool) MarkBackendStatus(backendURL *url.URL, alive bool) {
    for _, b := range s.backends {
        if b.URL.String() == backendURL.String() {
            b.SetAlive(alive)
            break
        }
    }
}

var serverPool ServerPool

func main() {
    var serverList string
    var port int
    flag.StringVar(&serverList, "backends", "http://localhost:3302,http://localhost:3303,http://localhost:3304", "Load balanced backends, use commas to separate")
    flag.IntVar(&port, "port", 3031, "Port to serve")
    flag.Parse()

    if len(serverList) == 0 {
        log.Fatal("Please provide one or more backends to load balance")
    }

    // Parse the list of servers
    tokens := strings.Split(serverList, ",")
    for _, tok := range tokens {
        serverURL, err := url.Parse(tok)
        if err != nil {
            log.Fatal(err)
        }

        // Initialize a reverse proxy for this backend
        proxy := httputil.NewSingleHostReverseProxy(serverURL)

        // ReverseProxy calls ErrorHandler when an error occurs; we use it to detect failures
        proxy.ErrorHandler = func(writer http.ResponseWriter, request *http.Request, e error) {
            log.Printf("[%s] %s\n", serverURL.Host, e.Error())
            retries := GetRetryFromContext(request)
            if retries < 3 {
                // Wait briefly, then retry the same backend with an incremented retry count
                select {
                case <-time.After(10 * time.Millisecond):
                    ctx := context.WithValue(request.Context(), Retry, retries+1)
                    proxy.ServeHTTP(writer, request.WithContext(ctx))
                }
                return
            }

            // After three retries, mark this backend as down
            serverPool.MarkBackendStatus(serverURL, false)

            // The same request is now tried on a different backend, so increment the attempt count
            attempts := GetAttemptsFromContext(request)
            log.Printf("%s(%s) Attempting retry %d\n", request.RemoteAddr, request.URL.Path, attempts)
            ctx := context.WithValue(request.Context(), Attempts, attempts+1)
            lb(writer, request.WithContext(ctx))
        }

        serverPool.AddBackend(&Backend{
            URL:          serverURL,
            Alive:        true,
            ReverseProxy: proxy,
        })
        log.Printf("Configured server: %s\n", serverURL)
    }

    // Create the HTTP server and register the handler
    server := http.Server{
        Addr:    fmt.Sprintf(":%d", port),
        Handler: http.HandlerFunc(lb),
    }

    // Start health checking
    go healthCheck()

    log.Printf("Load Balancer started at :%d\n", port)
    if err := server.ListenAndServe(); err != nil {
        log.Fatal(err)
    }
}


Just run it directly.

Web server code

package main

import (
    "flag"
    "fmt"
    "log"
    "net/http"
    "strconv"
    "sync/atomic"
)

func sayhelloName(w http.ResponseWriter, r *http.Request) {
    r.ParseForm()                               // parse the request parameters; this is not done by default
    fmt.Fprintln(w, "Hello moon!")              // whatever is written to w is sent to the client
    fmt.Fprintln(w, "port:"+strconv.Itoa(port)) // shows which backend answered the request
    n := atomic.AddInt64(&count, 1)             // handlers run concurrently, so increment atomically
    fmt.Fprintln(w, "count:"+strconv.FormatInt(n, 10))
}

var port int
var count int64 // number of requests served by this backend

func main() {
    flag.IntVar(&port, "port", 3302, "port number, default 3302")

    // Required: parse the registered flags from the command-line arguments
    flag.Parse()
    fmt.Printf("port=%v \n", port)
    http.HandleFunc("/", sayhelloName)                      // register the route
    err := http.ListenAndServe(":"+strconv.Itoa(port), nil) // set the port to listen on
    if err != nil {
        log.Fatal("ListenAndServe: ", err)
    }
}


Usage

go run web.go -port=3302
go run web.go -port=3303
go run web.go -port=3304

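Here web.go is the name of the source file.

Test

Open http://localhost:3031/ and refresh the page. Successive refreshes should be answered by different backends.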

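Thirteenth, conclusion

This article covered quite a few things:

round robin;
ReverseProxy from the Go standard library;
mutexes;
atomic operations;
closures;
callbacks;
select.

There is still plenty of room to improve this simple load balancer:

use a heap to track the state of the backends and reduce the search cost;
collect statistics;
implement weighted round robin or least connections;
support configuration files.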

Source code:

https://github.com/kasvith/simplelb/

Original article:

https://kasvith.github.io/posts/lets-create-a-simple-lb-go/
