Selected White Papers from Dragon Lizard: SysAK, a system operation and maintenance tool for large-scale and complex scenarios

By the System Operation and Maintenance SIG

01 Overview

SysAK (System Analyse Kit) is a comprehensive system operation and maintenance tool set provided by the System Operation and Maintenance SIG of the Dragon Lizard Community. It distills the experience gained from operating and maintaining millions of servers, and covers common scenarios such as daily system monitoring, online problem diagnosis, and system fault repair. The tool set is designed to keep operation and maintenance work simple, so that operation and maintenance personnel can locate problems without an in-depth understanding of the kernel.

02 Technical solution

SysAK aims for comprehensive feature coverage and spans the entire application life cycle. The tool currently supports two modes: monitoring and diagnostic. In monitoring mode, SysAK stays resident in the background and provides operation and maintenance personnel with a range of system indicators. Diagnostic mode can be enabled at any time and is mainly used to diagnose observed system symptoms and to carry out control actions in different operation and maintenance scenarios. Its overall function is shown in the figure below:

[Figure: overall functions of SysAK]
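The split between the two modes can be illustrated with a minimal sketch. This is not SysAK code: the names `Monitor`, `parse_loadavg`, and `diagnose_load`, the choice of load average as the indicator, and the threshold rule are all assumptions made for illustration. The monitor stays resident in a background thread and collects samples, while the diagnosis runs only when someone asks for it.

```python
import threading
from typing import Callable, Dict, List


def parse_loadavg(text: str) -> Dict[str, float]:
    """Parse the first three fields of a /proc/loadavg-style line."""
    one, five, fifteen = text.split()[:3]
    return {"load1": float(one), "load5": float(five), "load15": float(fifteen)}


class Monitor:
    """Monitoring mode (hypothetical): a resident background sampler."""

    def __init__(self, read_source: Callable[[], str], interval: float = 1.0):
        self._read_source = read_source      # returns one raw indicator line
        self._interval = interval            # seconds between samples
        self._stop = threading.Event()
        self._lock = threading.Lock()
        self._samples: List[Dict[str, float]] = []
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self) -> None:
        while not self._stop.is_set():
            sample = parse_loadavg(self._read_source())
            with self._lock:
                self._samples.append(sample)
            # Sleep, but wake up immediately if stop() is called.
            self._stop.wait(self._interval)

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        self._thread.join()

    def snapshot(self) -> List[Dict[str, float]]:
        """Return a copy of the samples collected so far."""
        with self._lock:
            return list(self._samples)


def diagnose_load(samples: List[Dict[str, float]], cpus: int,
                  factor: float = 1.0) -> List[Dict[str, float]]:
    """Diagnostic mode (hypothetical): run on demand, flag overloaded samples.

    The rule "1-minute load above cpus * factor" is an illustrative assumption.
    """
    return [s for s in samples if s["load1"] > cpus * factor]
```

On a Linux host, `read_source` could be something like `lambda: open("/proc/loadavg").read()`; passing the reader in as a parameter keeps the sampler independent of where the indicator comes from.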

SysAK is more than a tool set. Besides the system operation and maintenance tools themselves, it also designs and implements a tool development framework. Through loose coupling, dependency management, and multi-architecture and multi-version build support, the framework lets tool developers develop a tool once and integrate it on mainstream architectures and operating system versions without additional work. Its overall structure is shown in the figure below:

[Figure: overall structure of SysAK]
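To make "loose coupling" and "dependency management" a little more concrete, here is a small sketch of one way a tool framework could handle them. It does not describe SysAK's actual framework: the registry, the `register` decorator, and the tool names (`common_lib`, `loadtask`, `iolatency`) are hypothetical. Each tool registers itself under a name and declares what it depends on, and the framework derives an order in which dependencies always come before the tools that need them.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Callable, Dict, Iterable, List, Set

# Hypothetical registry: tools only know the framework, not each other.
_TOOLS: Dict[str, Callable[[], str]] = {}
_DEPS: Dict[str, Set[str]] = {}


def register(name: str, deps: Iterable[str] = ()):
    """Decorator that registers a tool together with its declared dependencies."""
    def wrap(fn: Callable[[], str]) -> Callable[[], str]:
        _TOOLS[name] = fn
        _DEPS[name] = set(deps)
        return fn
    return wrap


@register("common_lib")
def common_lib() -> str:
    return "common_lib ready"


@register("loadtask", deps=["common_lib"])
def loadtask() -> str:
    return "loadtask ready"


@register("iolatency", deps=["common_lib"])
def iolatency() -> str:
    return "iolatency ready"


def build_order() -> List[str]:
    """Return tool names so that every dependency precedes its dependents."""
    missing = {d for deps in _DEPS.values() for d in deps} - _TOOLS.keys()
    if missing:
        raise ValueError(f"unregistered dependencies: {sorted(missing)}")
    # TopologicalSorter takes {node: predecessors}; it raises CycleError on cycles.
    return list(TopologicalSorter(_DEPS).static_order())


def build_all() -> List[str]:
    """Run every registered tool's build step in dependency order."""
    return [_TOOLS[name]() for name in build_order()]
```

Adding a tool in this sketch means adding one decorated function; nothing else in the registry has to change, which is the property the loose-coupling claim is about.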

03 Application scenarios

The diagnostic tools provided by SysAK can meet the operation and maintenance needs of different application scenarios:

  • Daily monitoring: Finer-grained monitoring of system resources helps business operation and maintenance teams schedule and control resources precisely. In addition, a number of enhanced system indicators monitor system interference and jitter in real time.

  • Problem diagnosis: Online diagnosis is provided for abnormal load, network jitter, memory leaks, IO congestion, performance anomalies, and similar problems. The tools are also designed to demand little specialist expertise and to be easy to operate.

  • Fault repair: For problems that are not machine faults (such as deadlocks and crashes), the tool provides intervention capabilities to restore the system or isolate the fault.


Related Links:

System Operation and Maintenance SIG Home Page:

https://openanolis.cn/sig/sysom

For more analysis of Dragon Lizard technical features, see the "Dragon Lizard Characteristics Encyclopedia":

https://anolis.gitee.io/anolis_features/

2022 Panoramic White Paper of the Dragon Lizard Community (or obtain it by replying to the keyword "White Paper" on the public account [OpenAnolis Dragon Lizard])

https://openanolis.cn/openanoliswhitepaper


Origin blog.csdn.net/weixin_60347558/article/details/132567100