Graceful Restart of Golang Services

As with reloading configuration, we tell the server to restart by sending it a signal. The key point here is that the restart is graceful: if a plain restart were enough, we could simply kill the process and start it again. A graceful restart means the server can be upgraded without interrupting service.

Let's first check GitHub for libraries that already solve this problem. The following three turn up:

  • facebookgo/grace - Graceful restart & zero downtime deploy for Go servers.
  • fvbock/endless - Zero downtime restarts for go servers (Drop in replacement for http.ListenAndServe)
  • jpillora/overseer - Monitorable, gracefully restarting, self-upgrading binaries in Go (golang)

We look at each of them in turn. The discussion below covers restarting an HTTP server.

Usage

Let's use each of these three libraries to do a graceful restart (GR), then compare their advantages and disadvantages.
All three libraries provide official examples. However, the official examples are not consistent with one another, so we unify them here. Based on the official examples, we wrote the following programs for comparison:

grace

package main
 
import (
    "time"
    "net/http"
    "github.com/facebookgo/grace/gracehttp"
)
 
func main() {
    gracehttp.Serve(
        &http.Server{Addr: ":5001", Handler: newGraceHandler()},
        &http.Server{Addr: ":5002", Handler: newGraceHandler()},
    )
}
 
func newGraceHandler() http.Handler {
    mux := http.NewServeMux()
    mux.HandleFunc("/sleep", func(w http.ResponseWriter, r *http.Request) {
        duration, err := time.ParseDuration(r.FormValue("duration"))
        if err != nil {
            http.Error(w, err.Error(), 400)
            return
        }
        time.Sleep(duration)
        w.Write([]byte("Hello World"))
    })
    return mux
}

endless

package main
 
import (
    "log"
    "net/http"
    "os"
    "sync"
    "time"
 
    "github.com/fvbock/endless"
    "github.com/gorilla/mux"
)
 
func handler(w http.ResponseWriter, r *http.Request) {
    duration, err := time.ParseDuration(r.FormValue("duration"))
    if err != nil {
        http.Error(w, err.Error(), 400)
        return
    }
    time.Sleep(duration)
    w.Write([]byte("Hello World"))
}
 
func main() {
    mux1 := mux.NewRouter()
    mux1.HandleFunc("/sleep", handler)
 
    w := sync.WaitGroup{}
    w.Add(2)
    go func() {
        err := endless.ListenAndServe(":5003", mux1)
        if err != nil {
            log.Println(err)
        }
        log.Println("Server on 5003 stopped")
        w.Done()
    }()
    go func() {
        err := endless.ListenAndServe(":5004", mux1)
        if err != nil {
            log.Println(err)
        }
        log.Println("Server on 5004 stopped")
        w.Done()
    }()
    w.Wait()
    log.Println("All servers stopped. Exiting.")
 
    os.Exit(0)
}

overseer

package main
 
import (
    "fmt"
    "net/http"
    "time"
 
    "github.com/jpillora/overseer"
)
 
//see example.sh for the use-case
 
// BuildID is compile-time variable
var BuildID = "0"
 
//convert your 'main()' into a 'prog(state)'
//'prog()' is run in a child process
func prog(state overseer.State) {
    fmt.Printf("app#%s (%s) listening...\n", BuildID, state.ID)
    http.Handle("/", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        duration, err := time.ParseDuration(r.FormValue("duration"))
        if err != nil {
            http.Error(w, err.Error(), 400)
            return
        }
        time.Sleep(duration)
        w.Write([]byte("Hello World"))
        fmt.Fprintf(w, "app#%s (%s) says hello\n", BuildID, state.ID)
    }))
    http.Serve(state.Listener, nil)
    fmt.Printf("app#%s (%s) exiting...\n", BuildID, state.ID)
}
 
//then create another 'main' which runs the upgrades
//'main()' is run in the initial process
func main() {
    overseer.Run(overseer.Config{
        Program: prog,
        Addresses: []string{":5005", ":5006"},
        //Fetcher: &fetcher.File{Path: "my_app_next"},
        Debug:   false, //display log of overseer actions
    })
}

Comparison procedure

  • Build each of the examples above, run it, and record its pid.
  • Call the API; before it returns, modify the content (Hello World -> Hello Harry), rebuild, and restart. Check whether the in-flight call still returns the old content.
  • Call the API again and check whether it returns the new content.
  • Check whether the pid of the running process is the same as before.

The commands used are as follows:

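# Build the project for the first time
go build grace.go
# Run the project; the content can now be modified
./grace &
# Send a request; it returns after 60s
curl "http://127.0.0.1:5001/sleep?duration=60s" &
# Build the project again, this time with the new content
go build grace.go
# Restart; 2096 is the pid
kill -USR2 2096
# Request against the new API
curl "http://127.0.0.1:5001/sleep?duration=1s"


# Build the project for the first time
go build endless.go
# Run the project; the content can now be modified
./endless &
# Send a request; it returns after 60s
curl "http://127.0.0.1:5003/sleep?duration=60s" &
# Build the project again, this time with the new content
go build endless.go
# Restart; 22072 is the pid
kill -1 22072
# Request against the new API
curl "http://127.0.0.1:5003/sleep?duration=1s"


# Build the project for the first time
go build -ldflags '-X main.BuildID=1' overseer.go
# Run the project; the content can now be modified
./overseer &
# Send a request; it returns after 60s
curl "http://127.0.0.1:5005/sleep?duration=60s" &
# Build the project again with the new content; note that the version number is different
go build -ldflags '-X main.BuildID=2' overseer.go
# Restart; 28300 is the pid of the main process
kill -USR2 28300
# Request against the new API
curl "http://127.0.0.1:5005/sleep?duration=1s"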

Comparison results

Example   Old API returns  New API returns  Old pid  New pid  Conclusion
grace     Hello World      Hello Harry      2096     3100     The old request is not interrupted and runs the original logic; the pid changes
endless   Hello World      Hello Harry      22072    22365    The old request is not interrupted and runs the original logic; the pid changes
overseer  Hello World      Hello Harry      28300    28300    The old request is not interrupted and runs the original logic; the main process pid does not change

Principle Analysis

As the results show, grace and endless behave much alike.
The principle of a hot restart is simple, but it involves details such as system calls and the passing of file handles between parent and child processes.
The process is divided into the following steps:

  1. Listen for the restart signal (USR2; the endless example above is restarted with HUP, i.e. kill -1).
  2. On receiving the signal, fork a child process (using the same start command) and pass the file descriptor of the listening socket to it.
  3. The child process listens on the parent's socket; at this point both the parent and the child can accept requests.
  4. Once the child has started successfully, the parent stops accepting new connections and waits for the existing connections to finish (or time out).
  5. The parent process exits and the upgrade is complete.
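
Steps 2 and 3 can be sketched from the child's point of view as follows. This is a minimal sketch and not the code of grace or endless. It assumes the parent reports how many listeners it handed over through a hypothetical environment variable, GRACEFUL_FD_COUNT, and that the descriptors start at 3, which is where exec.Cmd.ExtraFiles places them. The helper names inheritedFDs and listeners are made up for the illustration.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"strconv"
)

// envFDCount is a hypothetical variable name chosen for this sketch.
const envFDCount = "GRACEFUL_FD_COUNT"

// inheritedFDs turns the value of envFDCount into descriptor numbers.
// Entry i of exec.Cmd.ExtraFiles becomes descriptor 3+i in the child.
func inheritedFDs(count string) ([]uintptr, error) {
	if count == "" {
		return nil, nil // normal start, nothing inherited
	}
	n, err := strconv.Atoi(count)
	if err != nil || n < 0 {
		return nil, fmt.Errorf("invalid %s value %q", envFDCount, count)
	}
	fds := make([]uintptr, n)
	for i := range fds {
		fds[i] = uintptr(3 + i)
	}
	return fds, nil
}

// listeners returns the inherited listeners, or opens a new one on addr
// when the process was started normally.
func listeners(addr string) ([]net.Listener, error) {
	fds, err := inheritedFDs(os.Getenv(envFDCount))
	if err != nil {
		return nil, err
	}
	if len(fds) == 0 {
		l, err := net.Listen("tcp", addr)
		if err != nil {
			return nil, err
		}
		return []net.Listener{l}, nil
	}
	var ls []net.Listener
	for _, fd := range fds {
		f := os.NewFile(fd, "inherited-listener")
		l, err := net.FileListener(f)
		f.Close() // FileListener works on its own copy of the descriptor
		if err != nil {
			return nil, err
		}
		ls = append(ls, l)
	}
	return ls, nil
}

func main() {
	ls, err := listeners("127.0.0.1:0")
	if err != nil {
		fmt.Println("listen error:", err)
		return
	}
	for _, l := range ls {
		fmt.Println("listening on", l.Addr())
		l.Close()
	}
}
```

Started normally, the program opens a fresh listener. A parent that sets the variable and fills ExtraFiles would make the same binary pick up the existing sockets instead.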

overseer differs somewhat from grace and endless, mainly in two ways:

  1. overseer adds a Fetcher. When the Fetcher returns a valid binary stream (io.Reader), the main process saves it to a temporary location, verifies it, replaces the current binary with it, and starts it.
    The Fetcher runs in a goroutine and checks at an interval configured in advance. Fetchers are available for File, GitHub, HTTP, and S3 sources. See the fetcher package for details.
  2. overseer adds a master process that manages the graceful restart. Child processes handle the connections, so the pid of the main process stays unchanged.
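
The fetch, stage, verify, and replace sequence in point 1 can be sketched as below. This is an illustration under stated assumptions, not overseer's implementation: the fetcher interface, staticFetcher, differs, and upgrade are all hypothetical names, and the "verification" here is only a check that the fetched binary is non-empty and differs from the current one.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// fetcher is a stand-in for this sketch; a nil reader means "no update".
type fetcher interface {
	Fetch() (io.Reader, error)
}

// staticFetcher is a hypothetical in-memory fetcher used for the demo.
type staticFetcher []byte

func (s staticFetcher) Fetch() (io.Reader, error) {
	if len(s) == 0 {
		return nil, nil
	}
	return bytes.NewReader(s), nil
}

// differs reports whether next is non-empty and not identical to current.
func differs(current, next []byte) bool {
	return len(next) > 0 && sha256.Sum256(current) != sha256.Sum256(next)
}

// upgrade fetches a candidate binary, stages it in a temporary file next to
// target, and renames it over target only if it passed the check.
// It reports whether target was replaced.
func upgrade(f fetcher, target string) (bool, error) {
	r, err := f.Fetch()
	if err != nil || r == nil {
		return false, err
	}
	next, err := io.ReadAll(r)
	if err != nil {
		return false, err
	}
	current, err := os.ReadFile(target)
	if err != nil && !os.IsNotExist(err) {
		return false, err
	}
	if !differs(current, next) {
		return false, nil
	}
	tmp, err := os.CreateTemp(filepath.Dir(target), ".upgrade-*")
	if err != nil {
		return false, err
	}
	defer os.Remove(tmp.Name()) // fails harmlessly after a successful rename
	if _, err := tmp.Write(next); err != nil {
		tmp.Close()
		return false, err
	}
	if err := tmp.Close(); err != nil {
		return false, err
	}
	if err := os.Chmod(tmp.Name(), 0o755); err != nil {
		return false, err
	}
	if err := os.Rename(tmp.Name(), target); err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	dir, err := os.MkdirTemp("", "upgrade-sketch")
	if err != nil {
		fmt.Println(err)
		return
	}
	defer os.RemoveAll(dir)
	target := filepath.Join(dir, "app")
	for _, payload := range []string{"build 1", "build 1", "build 2"} {
		replaced, err := upgrade(staticFetcher(payload), target)
		fmt.Printf("payload %q: replaced=%v err=%v\n", payload, replaced, err)
	}
}
```

Staging in the same directory as the target keeps the final rename on one filesystem, so the old binary is swapped in a single step. A real master process would then start a new child from the replaced binary.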

The original post illustrates this with a figure (image not included here).

Details

  • The parent process passes the socket file descriptor to the child process via the command line or environment variables.
  • The child process is started with the same command line as the parent. For Go, this means the updated executable replaces the old program.
  • The server.Shutdown() method, which performs a graceful shutdown, is a new feature in Go 1.8.
  • When Shutdown is called, server.Serve(l) returns immediately, while Shutdown itself blocks until the in-flight requests finish or its context is done. Shutdown should therefore be called from the main goroutine.
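
The behaviour described in the last two points can be observed with a small experiment. The sketch below uses only the standard library; shutdownDemo and result are names invented for the illustration. It serves one slow request on a loopback port, calls Shutdown while the request is in flight, and records whether Serve returned http.ErrServerClosed before the handler finished and whether the client still got its response.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"time"
)

// result records what the experiment observed.
type result struct {
	body               string // response received by the in-flight client
	serveErrClosed     bool   // Serve returned http.ErrServerClosed
	serveReturnedEarly bool   // Serve returned while the handler was still running
}

// shutdownDemo serves one slow request and calls Shutdown while it is in flight.
func shutdownDemo(work time.Duration) (result, error) {
	var res result
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return res, err
	}
	started, finished := make(chan struct{}), make(chan struct{})
	mux := http.NewServeMux()
	mux.HandleFunc("/hello", func(w http.ResponseWriter, r *http.Request) {
		close(started)
		time.Sleep(work)
		fmt.Fprint(w, "hello")
		close(finished)
	})
	srv := &http.Server{Handler: mux}

	serveDone := make(chan struct{})
	go func() {
		defer close(serveDone)
		err := srv.Serve(l) // returns as soon as Shutdown is called
		res.serveErrClosed = err == http.ErrServerClosed
		select {
		case <-finished:
		default:
			res.serveReturnedEarly = true
		}
	}()

	bodyCh := make(chan string, 1)
	go func() {
		defer close(bodyCh)
		resp, err := http.Get("http://" + l.Addr().String() + "/hello")
		if err != nil {
			return
		}
		defer resp.Body.Close()
		b, _ := io.ReadAll(resp.Body)
		bodyCh <- string(b)
	}()

	select {
	case <-started: // the request is now in flight
	case <-bodyCh: // the request ended before reaching the handler
		srv.Close()
		<-serveDone
		return res, fmt.Errorf("request did not reach the handler")
	}

	ctx, cancel := context.WithTimeout(context.Background(), work+5*time.Second)
	defer cancel()
	shutdownErr := srv.Shutdown(ctx) // blocks until the in-flight request is done
	<-serveDone
	res.body = <-bodyCh
	return res, shutdownErr
}

func main() {
	res, err := shutdownDemo(300 * time.Millisecond)
	fmt.Printf("body=%q serveErrClosed=%v serveReturnedEarly=%v err=%v\n",
		res.body, res.serveErrClosed, res.serveReturnedEarly, err)
}
```

If Serve ran in the main goroutine and main returned right after it, the process would exit while the slow handler was still working, which is why the waiting is left to Shutdown.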

Code

package main

import (
    "context"
    "errors"
    "flag"
    "log"
    "net"
    "net/http"
    "os"
    "os/exec"
    "os/signal"
    "syscall"
    "time"
)

var (
    server   *http.Server
    listener net.Listener
    graceful = flag.Bool("graceful", false, "listen on fd open 3 (internal use only)")
)

func handler(w http.ResponseWriter, r *http.Request) {
    time.Sleep(20 * time.Second)
    w.Write([]byte("hello world233333!!!!"))
}

func main() {
    flag.Parse()

    http.HandleFunc("/hello", handler)
    server = &http.Server{Addr: ":9999"}

    var err error
    if *graceful {
        log.Print("main: Listening to existing file descriptor 3.")
        // cmd.ExtraFiles: If non-nil, entry i becomes file descriptor 3+i.
        // when we put socket FD at the first entry, it will always be 3(0+3)
        f := os.NewFile(3, "")
        listener, err = net.FileListener(f)
    } else {
        log.Print("main: Listening on a new file descriptor.")
        listener, err = net.Listen("tcp", server.Addr)
    }

    if err != nil {
        log.Fatalf("listener error: %v", err)
    }

    go func() {
        // server.Shutdown() makes Serve() return immediately, so Serve() must not run in the main goroutine
        // use a local err so this goroutine does not write to the variable declared in main
        err := server.Serve(listener)
        log.Printf("server.Serve err: %v\n", err)
    }()
    signalHandler()
    log.Printf("signal end")
}

func reload() error {
    tl, ok := listener.(*net.TCPListener)
    if !ok {
        return errors.New("listener is not tcp listener")
    }

    f, err := tl.File()
    if err != nil {
        return err
    }

    args := []string{"-graceful"}
    cmd := exec.Command(os.Args[0], args...)
    cmd.Stdout = os.Stdout
    cmd.Stderr = os.Stderr
    // put socket FD at the first entry
    cmd.ExtraFiles = []*os.File{f}
    return cmd.Start()
}

func signalHandler() {
    ch := make(chan os.Signal, 1)
    signal.Notify(ch, syscall.SIGINT, syscall.SIGTERM, syscall.SIGUSR2)
    for {
        sig := <-ch
        log.Printf("signal: %v", sig)

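        // timeout context for shutdown; every case below returns,
        // so the deferred cancel runs exactly once and releases the timer
        ctx, cancel := context.WithTimeout(context.Background(), 20*time.Second)
        defer cancel()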
        switch sig {
        case syscall.SIGINT, syscall.SIGTERM:
            // stop
            log.Printf("stop")
            signal.Stop(ch)
            server.Shutdown(ctx)
            log.Printf("graceful shutdown")
            return
        case syscall.SIGUSR2:
            // reload
            log.Printf("reload")
            err := reload()
            if err != nil {
                log.Fatalf("graceful restart error: %v", err)
            }
            server.Shutdown(ctx)
            log.Printf("graceful reload")
            return
        }
    }
}

Reference code: https://github.com/CraryPrimitiveMan/go-in-action/tree/master/ch4

systemd & supervisor

After the parent exits, the child process is re-parented to process 1. Process managers such as systemd and supervisord then show the service in a failed state. There are two ways to solve this problem:

  • Use a pidfile. Each time the process restarts, it updates the pidfile, so that the process manager learns about the change of the main pid through this file.
  • Use a master process to manage the service process. On every hot restart the master starts a new process and kills the old one. The master's pid does not change, so the process manager sees the service in a normal state.
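
The pidfile approach can be sketched as follows. This is a minimal sketch with invented helper names (formatPID, parsePID, writePIDFile) and a hypothetical file location. How the process manager is configured to read the file is outside its scope. The assumption is that the new process calls writePIDFile with its own pid once it has taken over the listener.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"path/filepath"
	"strconv"
)

// formatPID renders a pid as decimal digits followed by a newline.
func formatPID(pid int) []byte {
	return []byte(strconv.Itoa(pid) + "\n")
}

// parsePID reads back what formatPID wrote, tolerating surrounding whitespace.
func parsePID(data []byte) (int, error) {
	pid, err := strconv.Atoi(string(bytes.TrimSpace(data)))
	if err != nil || pid <= 0 {
		return 0, fmt.Errorf("invalid pidfile content %q", data)
	}
	return pid, nil
}

// writePIDFile replaces path atomically: it writes a temporary file in the
// same directory and renames it, so a reader never sees a half-written pid.
func writePIDFile(path string, pid int) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), ".pid-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // fails harmlessly after a successful rename
	if _, err := tmp.Write(formatPID(pid)); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	if err := os.Chmod(tmp.Name(), 0o644); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

func main() {
	dir, err := os.MkdirTemp("", "pidfile-sketch")
	if err != nil {
		fmt.Println(err)
		return
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "app.pid") // hypothetical location
	if err := writePIDFile(path, os.Getpid()); err != nil {
		fmt.Println("write pidfile:", err)
		return
	}
	data, err := os.ReadFile(path)
	if err != nil {
		fmt.Println("read pidfile:", err)
		return
	}
	pid, err := parsePID(data)
	fmt.Println("pidfile holds", pid, "err:", err)
}
```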

Origin blog.csdn.net/u013474436/article/details/104761835