As with reloading configuration, we notify the server to restart by sending it a signal. The key point, however, is that the restart must be graceful: if a plain restart were enough, we could simply kill the process and start it again. A graceful restart means the server can be upgraded without interrupting service.
Let's first check whether GitHub already has libraries that solve this problem. A search turns up the following three:
- facebookgo/grace - Graceful restart & zero downtime deploy for Go servers.
- fvbock/endless - Zero downtime restarts for go servers (Drop in replacement for http.ListenAndServe)
- jpillora/overseer - Monitorable, gracefully restarting, self-upgrading binaries in Go (golang)
We will look at each of them in turn. The rest of this article covers restarting an HTTP server.
Usage
Let's implement a graceful restart (GR) with each of the three libraries and then compare their advantages and disadvantages.
All three libraries provide official examples, listed below. The official examples are not consistent with one another, however, so we will unify them:
- grace example https://github.com/facebookgo/grace/blob/master/gracedemo/demo.go
- endless examples https://github.com/fvbock/endless/tree/master/examples
- overseer example https://github.com/jpillora/overseer/tree/master/example
Based on the official examples, we wrote the following equivalent examples for comparison:
grace
package main

import (
	"log"
	"net/http"
	"time"

	"github.com/facebookgo/grace/gracehttp"
)

func main() {
	// Serve on both ports; Serve returns once the servers have stopped.
	err := gracehttp.Serve(
		&http.Server{Addr: ":5001", Handler: newGraceHandler()},
		&http.Server{Addr: ":5002", Handler: newGraceHandler()},
	)
	if err != nil {
		log.Fatal(err)
	}
}

// newGraceHandler returns a mux whose /sleep endpoint waits for the
// requested duration before replying.
func newGraceHandler() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/sleep", func(w http.ResponseWriter, r *http.Request) {
		duration, err := time.ParseDuration(r.FormValue("duration"))
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		time.Sleep(duration)
		w.Write([]byte("Hello World"))
	})
	return mux
}
endless
package main

import (
	"log"
	"net/http"
	"os"
	"sync"
	"time"

	"github.com/fvbock/endless"
	"github.com/gorilla/mux"
)

// handler waits for the requested duration before replying.
func handler(w http.ResponseWriter, r *http.Request) {
	duration, err := time.ParseDuration(r.FormValue("duration"))
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	time.Sleep(duration)
	w.Write([]byte("Hello World"))
}

func main() {
	mux1 := mux.NewRouter()
	mux1.HandleFunc("/sleep", handler)

	// Run both servers concurrently and wait until both have stopped.
	w := sync.WaitGroup{}
	w.Add(2)
	go func() {
		err := endless.ListenAndServe(":5003", mux1)
		if err != nil {
			log.Println(err)
		}
		log.Println("Server on 5003 stopped")
		w.Done()
	}()
	go func() {
		err := endless.ListenAndServe(":5004", mux1)
		if err != nil {
			log.Println(err)
		}
		log.Println("Server on 5004 stopped")
		w.Done()
	}()
	w.Wait()
	log.Println("All servers stopped. Exiting.")
	os.Exit(0)
}
overseer
package main

import (
	"fmt"
	"net/http"
	"time"

	"github.com/jpillora/overseer"
)

// see example.sh for the use-case

// BuildID is a compile-time variable
var BuildID = "0"

// convert your 'main()' into a 'prog(state)'
// 'prog()' is run in a child process
func prog(state overseer.State) {
	fmt.Printf("app#%s (%s) listening...\n", BuildID, state.ID)
	http.Handle("/", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		duration, err := time.ParseDuration(r.FormValue("duration"))
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		time.Sleep(duration)
		w.Write([]byte("Hello World"))
		fmt.Fprintf(w, "app#%s (%s) says hello\n", BuildID, state.ID)
	}))
	http.Serve(state.Listener, nil)
	fmt.Printf("app#%s (%s) exiting...\n", BuildID, state.ID)
}

// then create another 'main' which runs the upgrades
// 'main()' is run in the initial process
func main() {
	overseer.Run(overseer.Config{
		Program:   prog,
		Addresses: []string{":5005", ":5006"},
		//Fetcher: &fetcher.File{Path: "my_app_next"},
		Debug: false, // display log of overseer actions
	})
}
Comparison procedure
- Build each of the examples above, run it, and record its pid
- Call the API and, before it returns, modify the content (Hello World -> Hello Harry), rebuild, and restart. Check whether the in-flight call still returns the old content
- Call the API again and check whether it returns the new content
- Check whether the pid of the running process is the same as before
The commands for these steps are as follows:
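# grace
# Build the project for the first time
go build grace.go
# Run the project; from this point on you can modify the content
./grace &
# Send a request; it returns after 60s
curl "http://127.0.0.1:5001/sleep?duration=60s" &
# Build the project again, this time with the new content
go build grace.go
# Restart; 2096 is the pid
kill -USR2 2096
# New API request
curl "http://127.0.0.1:5001/sleep?duration=1s"

# endless
# Build the project for the first time
go build endless.go
# Run the project; from this point on you can modify the content
./endless &
# Send a request; it returns after 60s
curl "http://127.0.0.1:5003/sleep?duration=60s" &
# Build the project again, this time with the new content
go build endless.go
# Restart (signal 1 is SIGHUP); 22072 is the pid
kill -1 22072
# New API request
curl "http://127.0.0.1:5003/sleep?duration=1s"

# overseer
# Build the project for the first time
go build -ldflags '-X main.BuildID=1' overseer.go
# Run the project; from this point on you can modify the content
./overseer &
# Send a request; it returns after 60s
curl "http://127.0.0.1:5005/sleep?duration=60s" &
# Build the project again with the new content; note the different build ID
go build -ldflags '-X main.BuildID=2' overseer.go
# Restart; 28300 is the pid of the main process
kill -USR2 28300
# New API request
curl "http://127.0.0.1:5005/sleep?duration=1s"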
Comparison results
Example | In-flight (old) API call returns | New API call returns | Old pid | New pid | Conclusion |
---|---|---|---|---|---|
grace | Hello World | Hello Harry | 2096 | 3100 | The in-flight call is not interrupted and runs the original logic; the pid changes |
endless | Hello World | Hello Harry | 22072 | 22365 | The in-flight call is not interrupted and runs the original logic; the pid changes |
overseer | Hello World | Hello Harry | 28300 | 28300 | The in-flight call is not interrupted and runs the original logic; the main process pid does not change |
Principle Analysis
As the results show, grace and endless behave much alike.
The principle behind a hot restart is simple, but it involves some system calls and the passing of file handles between parent and child processes, among other details.
The process consists of the following steps:
- Listen for the signal (USR2)
- On receiving the signal, fork a child process (using the same start command) and pass the file descriptor of the listening socket to the child
- The child listens on the parent's socket; at this point both parent and child can accept requests
- Once the child has started successfully, the parent stops accepting new connections and waits for existing connections to finish (or time out)
- The parent exits and the upgrade is complete
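The second and third steps rely on the fact that a listening socket can be shared through a duplicated file descriptor. The sketch below illustrates this inside a single process on a Unix-like system rather than across a real fork. `shareListener` and `demo` are hypothetical names introduced here, and closing the first listener merely stands in for the parent stopping.

```go
package main

import (
	"fmt"
	"net"
)

// shareListener duplicates the socket behind l and wraps the copy in a
// second listener, much as a child would after inheriting the descriptor.
func shareListener(l *net.TCPListener) (net.Listener, error) {
	f, err := l.File() // a duplicate of the underlying socket descriptor
	if err != nil {
		return nil, err
	}
	defer f.Close() // FileListener keeps its own copy, so f can be closed
	return net.FileListener(f)
}

// demo closes the first ("parent") listener and reports whether the second
// ("child") listener still accepts a connection on the same address.
func demo() (bool, error) {
	parent, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return false, err
	}
	addr := parent.Addr().String()

	child, err := shareListener(parent.(*net.TCPListener))
	if err != nil {
		parent.Close()
		return false, err
	}
	defer child.Close()
	parent.Close() // the "parent" stops listening; the socket stays open

	conn, err := net.Dial("tcp", addr)
	if err != nil {
		return false, err
	}
	defer conn.Close()

	accepted, err := child.Accept()
	if err != nil {
		return false, err
	}
	accepted.Close()
	return child.Addr().String() == addr, nil
}

func main() {
	ok, err := demo()
	fmt.Println(ok, err)
}
```

In a real restart the duplicated descriptor is handed to a separate child process instead of being reused in place, but the socket-sharing behavior is the same.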
overseer differs from grace and endless in two main ways:
- overseer adds a Fetcher. When the Fetcher returns a valid binary stream (io.Reader), the main process saves it to a temporary location, verifies it, replaces the current binary with it, and starts it. The Fetcher runs in a goroutine and checks at a preconfigured interval. Fetchers are available for File, GitHub, HTTP, and S3; see the fetcher package for details.
- overseer adds a master process that manages the graceful restart. Child processes handle the connections, so the pid of the main process stays unchanged.
The following figure illustrates this vividly.
Details
- The parent passes the socket file descriptor to the child via the command line or environment variables
- The child is started with the same command line as the parent; for Go, this means the updated executable has overwritten the old program
- server.Shutdown() performs a graceful shutdown and is a new feature in Go 1.8
- server.Serve(l) returns immediately once Shutdown is called, while Shutdown itself blocks until in-flight requests finish or the context expires, so Shutdown must be called from the main goroutine
Code
package main

import (
	"context"
	"errors"
	"flag"
	"log"
	"net"
	"net/http"
	"os"
	"os/exec"
	"os/signal"
	"syscall"
	"time"
)

var (
	server   *http.Server
	listener net.Listener
	graceful = flag.Bool("graceful", false, "listen on fd open 3 (internal use only)")
)

func handler(w http.ResponseWriter, r *http.Request) {
	time.Sleep(20 * time.Second)
	w.Write([]byte("hello world233333!!!!"))
}

func main() {
	flag.Parse()

	http.HandleFunc("/hello", handler)
	server = &http.Server{Addr: ":9999"}

	var err error
	if *graceful {
		log.Print("main: Listening to existing file descriptor 3.")
		// cmd.ExtraFiles: If non-nil, entry i becomes file descriptor 3+i.
		// when we put socket FD at the first entry, it will always be 3(0+3)
		f := os.NewFile(3, "")
		listener, err = net.FileListener(f)
	} else {
		log.Print("main: Listening on a new file descriptor.")
		listener, err = net.Listen("tcp", server.Addr)
	}
	if err != nil {
		log.Fatalf("listener error: %v", err)
	}

	go func() {
		// server.Shutdown() makes Serve() return immediately, so Serve() must not run in the main goroutine
		err := server.Serve(listener)
		log.Printf("server.Serve err: %v\n", err)
	}()
	signalHandler()
	log.Printf("signal end")
}

// reload starts a new copy of this program and hands it the listening socket.
func reload() error {
	tl, ok := listener.(*net.TCPListener)
	if !ok {
		return errors.New("listener is not tcp listener")
	}

	f, err := tl.File()
	if err != nil {
		return err
	}
	// the child gets its own copy of the descriptor, so release ours afterwards
	defer f.Close()

	args := []string{"-graceful"}
	cmd := exec.Command(os.Args[0], args...)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	// put socket FD at the first entry
	cmd.ExtraFiles = []*os.File{f}
	return cmd.Start()
}

// shutdown stops the server, waiting up to 20s for in-flight requests.
func shutdown() {
	ctx, cancel := context.WithTimeout(context.Background(), 20*time.Second)
	defer cancel()
	if err := server.Shutdown(ctx); err != nil {
		log.Printf("server.Shutdown err: %v", err)
	}
}

func signalHandler() {
	ch := make(chan os.Signal, 1)
	signal.Notify(ch, syscall.SIGINT, syscall.SIGTERM, syscall.SIGUSR2)
	for {
		sig := <-ch
		log.Printf("signal: %v", sig)

		switch sig {
		case syscall.SIGINT, syscall.SIGTERM:
			// stop
			log.Printf("stop")
			signal.Stop(ch)
			shutdown()
			log.Printf("graceful shutdown")
			return
		case syscall.SIGUSR2:
			// reload
			log.Printf("reload")
			err := reload()
			if err != nil {
				log.Fatalf("graceful restart error: %v", err)
			}
			shutdown()
			log.Printf("graceful reload")
			return
		}
	}
}
Reference code: https://github.com/CraryPrimitiveMan/go-in-action/tree/master/ch4
systemd & supervisor
After the parent exits, the child is re-parented to process 1. Process managers such as systemd and supervisord will then show the service in a failed state. There are two ways to solve this problem:
- Use a pidfile: update the pidfile on every restart so that the process manager learns about the changed main pid through this file.
- Use a master process to manage the service process: on every hot restart, the master starts a new process and kills the old one. The master pid never changes, so the process manager sees the service in a normal state. A simple implementation