How to implement graceful program shutdown in Go: a go-kratos source code analysis

I have been using the kratos framework for nearly a year, and recently I looked into how it implements graceful shutdown.

This logic lives in the app.go file. Starting from main, jump to the definition of the app.Run method.

It contains the following parts:

  1. App structure: Contains configuration options and runtime state of the application.

  2. New function: Create an App instance.

  3. Run method: Start the application. The main steps include:

    • Build a ServiceInstance registration instance
    • Start each Server
    • Register the instance with service discovery
    • Listen for the stop signal

  4. Stop method: Gracefully stops the application. The main steps include:

    • Deregister the instance from service discovery
    • Cancel the application context
    • Stop each Server (done by the goroutines in Run once the context is cancelled)

  5. buildInstance method: Build an instance for service discovery registration.

  6. NewContext and FromContext functions: Add AppInfo to Context, so that it can be obtained from Context later.
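
The NewContext/FromContext pair in item 6 follows the usual Go pattern of storing a value in a context under an unexported key type. The sketch below shows that pattern using only the standard library; `appInfo`, `newContext`, `fromContext` and `lookupName` are illustrative names made up for this example, not the kratos definitions.

```go
package main

import (
	"context"
	"fmt"
)

// appInfo is a hypothetical, reduced stand-in for the information an app exposes.
type appInfo struct {
	Name    string
	Version string
}

// appKey is an unexported key type, so no other package can collide with it.
type appKey struct{}

// newContext returns a child context that carries info.
func newContext(ctx context.Context, info appInfo) context.Context {
	return context.WithValue(ctx, appKey{}, info)
}

// fromContext reads the appInfo back; ok is false if none was stored.
func fromContext(ctx context.Context) (appInfo, bool) {
	info, ok := ctx.Value(appKey{}).(appInfo)
	return info, ok
}

// lookupName optionally stores an appInfo and then tries to read its name back.
func lookupName(store bool) (string, bool) {
	ctx := context.Background()
	if store {
		ctx = newContext(ctx, appInfo{Name: "demo", Version: "v1.0.0"})
	}
	info, ok := fromContext(ctx)
	return info.Name, ok
}

func main() {
	name, ok := lookupName(true)
	fmt.Println(name, ok) // demo true
	name, ok = lookupName(false)
	fmt.Printf("%q %v\n", name, ok) // "" false
}
```

Because the key type is unexported, only code in the same package can write or read the value, which is why accessor functions like these are exposed instead of the key itself.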

The core logic flow is:

  1. Create an App instance
  2. App.Run() starts the Servers, registers the instance, and listens for the stop signal
  3. When the stop signal arrives, App.Stop() is called to stop the application

Let's first look at the source code of the Run method:

// Run executes all OnStart hooks registered with the application's Lifecycle.
func (a *App) Run() error {
  // Build the instance to register with service discovery.
  instance, err := a.buildInstance()
  if err != nil {
    return err
  }

  // Save the instance.
  a.mu.Lock()
  a.instance = instance
  a.mu.Unlock()

  // Create the error group. ctx is cancelled when a.ctx is cancelled
  // or when any goroutine in the group returns a non-nil error.
  eg, ctx := errgroup.WithContext(NewContext(a.ctx, a))

  // WaitGroup used to wait until every start goroutine is running.
  wg := sync.WaitGroup{}

  // Start each Server.
  for _, srv := range a.opts.servers {
    srv := srv
    eg.Go(func() error {
      // Wait for the stop signal.
      <-ctx.Done()
      // Stop the Server, bounded by stopTimeout.
      stopCtx, cancel := context.WithTimeout(a.opts.ctx, a.opts.stopTimeout)
      defer cancel()
      return srv.Stop(stopCtx)
    })

    wg.Add(1)
    eg.Go(func() error {
      // Signal that this start goroutine is running.
      wg.Done()
      // Start the Server.
      return srv.Start(NewContext(a.opts.ctx, a))
    })
  }

  // Wait until all start goroutines are running.
  wg.Wait()

  // Register the service instance.
  if a.opts.registrar != nil {
    rctx, rcancel := context.WithTimeout(ctx, a.opts.registrarTimeout)
    defer rcancel()
    if err := a.opts.registrar.Register(rctx, instance); err != nil {
      return err
    }
  }

  // Listen for the stop signal.
  c := make(chan os.Signal, 1)
  signal.Notify(c, a.opts.sigs...)
  eg.Go(func() error {
    select {
    case <-ctx.Done():
      return nil
    case <-c:
      // Stop signal received, stop the application ------------- ⬅️ note this point
      return a.Stop()
    }
  })

  // Wait for the error group to finish.
  if err := eg.Wait(); err != nil && !errors.Is(err, context.Canceled) {
    return err
  }

  return nil
}
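
The Run method above pairs two goroutines per Server: one that blocks on the context and then calls Stop, and one that calls Start. The following is a minimal sketch of that pairing using only the standard library. It is not the kratos code: `fakeServer`, `runAll` and `runDemo` are hypothetical names, and errgroup is replaced by plain `sync.WaitGroup`s, so error propagation is left out.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// fakeServer is a hypothetical stand-in for a transport server:
// Start blocks until Stop is called.
type fakeServer struct {
	name string
	done chan struct{}
}

func newFakeServer(name string) *fakeServer {
	return &fakeServer{name: name, done: make(chan struct{})}
}

func (s *fakeServer) Start(ctx context.Context) error {
	<-s.done // block the way a real listener would
	return nil
}

func (s *fakeServer) Stop(ctx context.Context) error {
	close(s.done) // unblocks Start
	return nil
}

// runAll starts every server and stops them all once ctx is cancelled.
// It returns the names of the servers whose Start call has returned.
func runAll(ctx context.Context, servers []*fakeServer) []string {
	var (
		mu      sync.Mutex
		stopped []string
		all     sync.WaitGroup // waits for every goroutine to finish
		started sync.WaitGroup // waits until every start goroutine is running
	)
	for _, srv := range servers {
		srv := srv
		all.Add(2)
		started.Add(1)
		go func() { // stop goroutine: waits for cancellation, then stops the server
			defer all.Done()
			<-ctx.Done()
			stopCtx, cancel := context.WithTimeout(context.Background(), time.Second)
			defer cancel()
			_ = srv.Stop(stopCtx)
		}()
		go func() { // start goroutine: blocks in Start until the server is stopped
			defer all.Done()
			started.Done()
			_ = srv.Start(context.Background())
			mu.Lock()
			stopped = append(stopped, srv.name)
			mu.Unlock()
		}()
	}
	started.Wait()
	all.Wait()
	return stopped
}

// runDemo cancels the context after 50ms and reports how many servers stopped.
func runDemo() int {
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	servers := []*fakeServer{newFakeServer("http"), newFakeServer("grpc")}
	return len(runAll(ctx, servers))
}

func main() {
	fmt.Println("servers stopped:", runDemo())
}
```

The point of the pairing is that nothing has to interrupt the blocking Start call from outside: cancelling one context is enough to make every stop goroutine run.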

The core logic is the part below ⬇️: signal.Notify is used to listen for the stop signals sent by the operating system.

  // Listen for the stop signal.
  c := make(chan os.Signal, 1)
  signal.Notify(c, a.opts.sigs...)
  eg.Go(func() error {
    select {
    case <-ctx.Done():
      return nil
    case <-c:
      // Stop signal received, stop the application.
      return a.Stop()
    }
  })
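
The same select can be tried outside of kratos. Below is a small standard-library sketch: `awaitStop` and `simulate` are hypothetical helpers written for this example, and the signal list (SIGINT, SIGTERM) and the 2-second timeout are assumptions chosen so the program always exits.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// awaitStop blocks until a signal arrives on c or ctx is cancelled,
// and reports which of the two happened.
func awaitStop(ctx context.Context, c <-chan os.Signal) string {
	select {
	case <-ctx.Done():
		return "context"
	case <-c:
		return "signal"
	}
}

// simulate exercises awaitStop without involving the operating system.
func simulate(sendSignal bool) string {
	c := make(chan os.Signal, 1)
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	if sendSignal {
		c <- syscall.SIGTERM // pretend the OS delivered SIGTERM
	} else {
		cancel()
	}
	return awaitStop(ctx, c)
}

func main() {
	// The channel is buffered so a signal sent before we start receiving is not lost.
	c := make(chan os.Signal, 1)
	signal.Notify(c, syscall.SIGINT, syscall.SIGTERM)
	defer signal.Stop(c)

	// Wait for Ctrl+C, but give up after 2 seconds so the demo terminates.
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	fmt.Println("stopped by:", awaitStop(ctx, c))
}
```

Listening on the context as well as the signal channel matters: if the application is stopped some other way, the goroutine still returns instead of waiting for a signal that never comes.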

When a signal arrives, the Stop method is called. Let's look at its source code:

// Stop gracefully stops the application.
func (a *App) Stop() error {
  // Fetch the saved service instance.
  a.mu.Lock()
  instance := a.instance
  a.mu.Unlock()

  // Deregister the instance from service discovery.
  if a.opts.registrar != nil && instance != nil {
    ctx, cancel := context.WithTimeout(NewContext(a.ctx, a), a.opts.registrarTimeout)
    defer cancel()
    if err := a.opts.registrar.Deregister(ctx, instance); err != nil {
      return err
    }
  }

  // Cancel the application context.
  if a.cancel != nil {
    a.cancel()
  }

  return nil
}

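The main steps are:

1. Fetch the saved service instance
2. If service discovery is configured, deregister the instance from it
3. Cancel the application context to notify the application to stop

The cancellable Context is created with context.WithCancel when the App is constructed (in New), and Run derives its errgroup context from it. Stop calls the stored cancel function to cancel that context, which notifies the application to stop.

Cancelling the context wakes the goroutines in Run that are waiting on ctx.Done(). Each of them stops its Server, the matching Start call returns, and eg.Wait() finishes, so the application stops gracefully.

So the Stop method itself is quite simple; the key is using Context to control the application lifecycle.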
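
The ordering inside Stop can be reproduced in a few lines. This is a rough sketch under stated assumptions: `miniApp`, `newMiniApp` and `stopWith` are hypothetical names, a plain function stands in for the registrar, and the 1-second timeout is arbitrary. It also shows a detail of the excerpt above: when deregistration fails, Stop returns before the context is cancelled.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// miniApp is a hypothetical, stripped-down stand-in for the App type.
type miniApp struct {
	ctx        context.Context
	cancel     context.CancelFunc
	deregister func(ctx context.Context) error // stands in for registrar.Deregister
}

// newMiniApp creates the cancellable context up front, at construction time.
func newMiniApp(deregister func(context.Context) error) *miniApp {
	ctx, cancel := context.WithCancel(context.Background())
	return &miniApp{ctx: ctx, cancel: cancel, deregister: deregister}
}

// Stop mirrors the order in the excerpt: deregister first, then cancel.
func (a *miniApp) Stop() error {
	if a.deregister != nil {
		ctx, cancel := context.WithTimeout(a.ctx, time.Second)
		defer cancel()
		if err := a.deregister(ctx); err != nil {
			return err // the app context is NOT cancelled on this path
		}
	}
	a.cancel()
	return nil
}

// stopped reports whether the app context has been cancelled.
func (a *miniApp) stopped() bool { return a.ctx.Err() != nil }

// stopWith runs Stop with a deregister step that fails when fail is true.
func stopWith(fail bool) (bool, error) {
	app := newMiniApp(func(ctx context.Context) error {
		if fail {
			return errors.New("registry unreachable")
		}
		return nil
	})
	err := app.Stop()
	return app.stopped(), err
}

func main() {
	stopped, err := stopWith(false)
	fmt.Println(stopped, err) // true <nil>
	stopped, err = stopWith(true)
	fmt.Println(stopped, err) // false registry unreachable
}
```

Deregistering before cancelling means the instance is removed from discovery first, so new traffic can be routed elsewhere before the servers start shutting down.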

Note that the Run method uses the Notify function from the signal package to listen for shutdown signals from the operating system. This is the core of the whole mechanism. I have covered this part separately in another article.

By listening for operating system signals, we can let tasks that must finish do so before the process exits. For such tasks, declare wg := sync.WaitGroup{}, call wg.Add(1) when each task begins, and call wg.Done() when it ends. Once the shutdown signal has been received, call wg.Wait() to wait for the outstanding tasks before exiting. That gives us a graceful start and stop.
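
A minimal sketch of this idea, assuming the stop notification arrives as a closed channel rather than a real OS signal: `shutdownAfter` is a hypothetical helper, and the sleep stands in for work that must not be cut short.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// shutdownAfter starts n tasks, waits for the stop notification, and then
// waits for every task to finish. It returns how many tasks completed.
func shutdownAfter(n int, stop <-chan struct{}) int {
	var wg sync.WaitGroup
	var completed int32
	for i := 0; i < n; i++ {
		wg.Add(1) // register the task before it starts
		go func() {
			defer wg.Done()
			time.Sleep(20 * time.Millisecond) // work that must run to completion
			atomic.AddInt32(&completed, 1)
		}()
	}
	<-stop    // the stop notification arrives while tasks may still be running
	wg.Wait() // do not return until all in-flight tasks are done
	return int(atomic.LoadInt32(&completed))
}

func main() {
	stop := make(chan struct{})
	go func() {
		time.Sleep(5 * time.Millisecond) // stop arrives before the tasks finish
		close(stop)
	}()
	fmt.Println("tasks completed before exit:", shutdownAfter(3, stop)) // 3
}
```

In a real service the wait is usually bounded by a timeout as well, in the spirit of stopTimeout in the Run method, so that a stuck task cannot block the exit forever.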

Origin blog.csdn.net/w_monster/article/details/131994339