NetApp FAS routine inspection: current commands and what to check

Foreword

I have recently been reorganizing my inspection checklists for NetApp FAS (Lenovo DM) storage. Most existing checklists target 7-Mode, and few cover the newer ONTAP versions and their usage scenarios. In practice, most inspection items can be completed from the cluster shell. Based on my own understanding, this post sorts out the points that deserve attention when inspecting the FAS series, along with a few other suggestions.

Inspection Content Suggestions

Before the inspection, gather some basic information about the NetApp system to be inspected. If the user has not documented this before, it is recommended to compile an initial inventory before the first inspection, mainly including:

  • Device model and serial number
  • Admin and audit-admin account information
  • Management address information, including cluster and node management addresses, SP addresses
  • General information about the running business and SVM

Once you have the basic information, you can sort out the inspection content according to the actual situation.

Physical environment inspection

The physical environment inspection covers the machine room that houses the storage controllers and disk shelves: room temperature and humidity, the physical appearance of the equipment, alarm indicators, and so on.

Basic hardware information/basic configuration inspection

This covers basic device information such as cluster and network status. The first inspection in particular is the key to understanding the current state of the storage system. Pay special attention to key network details, including the link speed of the physical ports and whether each LIF is on its home node.

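# Check the ONTAP version
::>version
# View cluster status
::>cluster show
::>cluster ha show
# View basic node information, including uptime
::>system node show
# Check the status of core hardware and the system
::>system health status show
::>system health subsystem show
# Confirm SP configuration and running state
::>system service-processor show
# Confirm basic network status, including physical ports and LIFs
::>network interface show
::>network port show
::>network port ifgrp show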
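
As a minimal sketch of the "is the LIF on its home node" check, the Python below scans captured CLI text for LIFs that are away from home. It assumes the output was saved from a command such as `network interface show -fields home-node,curr-node,is-home` and that it arrives as a five-column, whitespace-separated table. The sample text, SVM, LIF, and node names are invented for illustration, and the real layout may differ between ONTAP versions.

```python
def lifs_not_home(table_text):
    """Return (vserver, lif, home_node, curr_node) for each LIF whose is-home value is false."""
    rows = []
    for line in table_text.splitlines():
        parts = line.split()
        # Skip blank lines, the header row, and the dashed separator row.
        if len(parts) != 5 or parts[0] == "vserver" or set(parts[0]) == {"-"}:
            continue
        vserver, lif, home_node, curr_node, is_home = parts
        if is_home.lower() == "false":
            rows.append((vserver, lif, home_node, curr_node))
    return rows


# Invented sample in the assumed five-column layout.
SAMPLE = """\
vserver lif        home-node curr-node is-home
------- ---------- --------- --------- -------
svm1    nfs_lif1   node-01   node-01   true
svm1    nfs_lif2   node-02   node-01   false
"""

print(lifs_not_home(SAMPLE))
```

Any LIF this returns is worth a closer look: it may have failed over after a port or node problem and never been reverted.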

Next, confirm the basic storage capacity information. The capacity display in the new UI introduced after ONTAP 9.5 is very unfriendly, so confirming aggregate and volume space in the CLI is important. If the space usage looks abnormal, dig into the details, including the provisioning method and the temporary comparison files generated during data deduplication.

# Current aggregate status
::>aggr show
# Current volume status
::>vol show
# Disk shelf information
::>storage shelf show
# Disk information
::>storage disk show
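
To make "space is abnormal" concrete, here is a small Python sketch that flags aggregates or volumes whose usage crosses a threshold. The input shape (a name mapped to used and total sizes), the 80% limit, and the sample names and numbers are all assumptions for illustration. They are not values taken from ONTAP, and the figures would need to be transcribed from the `aggr show` and `vol show` output.

```python
def over_threshold(usage, limit_percent=80):
    """Return (name, used_percent) pairs at or above the limit, fullest first.

    usage maps a name to a (used, total) pair in any consistent unit.
    """
    flagged = []
    for name, (used, total) in usage.items():
        if total <= 0:
            continue  # Skip entries with no usable size information.
        percent = 100.0 * used / total
        if percent >= limit_percent:
            flagged.append((name, round(percent, 1)))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical figures, in GB.
SAMPLE_USAGE = {
    "aggr1": (850, 1000),
    "aggr2": (400, 1000),
    "vol_data": (95, 100),
}

print(over_threshold(SAMPLE_USAGE))
```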

Filter the event log for the EMERGENCY and ALERT severities. Normally both should be empty; any entries that do appear need to be investigated and dealt with.

# Collect system logs
::>event log show -severity EMERGENCY
::>event log show -severity ALERT
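
The cluster-shell checks up to this point could be collected in one pass over SSH. The Python sketch below only builds the `ssh` argument lists and prints them; it does not connect to anything. The host name and user are hypothetical, and it assumes the cluster management LIF accepts SSH logins that run a single command.

```python
import shlex

# Cluster-shell commands taken from the lists above.
CLUSTER_CHECKS = [
    "version",
    "cluster show",
    "cluster ha show",
    "system node show",
    "system health status show",
    "system health subsystem show",
    "system service-processor show",
    "network interface show",
    "network port show",
    "network port ifgrp show",
    "aggr show",
    "vol show",
    "storage shelf show",
    "storage disk show",
    "event log show -severity EMERGENCY",
    "event log show -severity ALERT",
]


def build_ssh_argv(host, user, command):
    """Build the argument list for running one CLI command over SSH."""
    return ["ssh", f"{user}@{host}", command]


def plan(host, user, checks=CLUSTER_CHECKS):
    """Return one SSH argument list per inspection command."""
    return [build_ssh_argv(host, user, command) for command in checks]


# Hypothetical management address and account; print the plan for review.
for argv in plan("cluster-mgmt.example.com", "admin"):
    print(shlex.join(argv))
```

Each list can then be passed to `subprocess.run(argv, capture_output=True, text=True)` and the captured text saved alongside the inspection form.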

Although most items can be inspected from the cluster shell, a few are better viewed in more detail from the node shell (entered with system node run -node <node-name>). When no dedicated collection tool is available, sysstat gives a clear picture of performance over a period of time.

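# Collect detailed node status
(node)>sysconfig -a
(node)>sysconfig -r
# Confirm node environment parameters
(node)>environment status
# Measure current node performance, including bandwidth and I/O (recommended during peak business hours)
(node)>sysstat -su 1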

Business-level inspection

Beyond the checks common to all FAS systems, the remaining checks depend on how the system is actually used: run the commands that match the scenarios and features in use.

NAS scenario

Mainly check the current mount and share information for abnormalities, and identify the clients with the highest session counts.

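# Confirm NFS-related information
::>nfs server show
::>export-policy show
::>export-policy rule show
# Confirm CIFS-related service information
::>cifs server show
::>cifs server share show
::>cifs server connection show
# View connection information for the top ten clients by current session count
::>statistics top client show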

SAN scenario

Covers scenarios where LUNs are mapped over iSCSI or FC.

# View current LUN status
::>lun show
::>lun mapping show
# View FC adapter information and running status
::>fcp adapter show
::>fcp initiator show

SnapMirror scenario

# View the running state and health of cluster peers
::>cluster peer show
::>cluster peer health show
# View the running state of SnapMirror
::>snapmirror show
::>snapmirror show-history
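
One way to judge SnapMirror health from these listings is replication lag. The Python sketch below flags relationships whose lag exceeds a limit. It assumes the lag values were copied out as `H:MM:SS` strings, for example from `snapmirror show -fields lag-time`, with `-` meaning no value. That string format, the 24-hour limit, and the destination paths are assumptions for illustration.

```python
def lag_to_seconds(text):
    """Convert an 'H:MM:SS' lag string to seconds, or None if it cannot be parsed."""
    text = text.strip()
    if text in ("", "-"):
        return None
    parts = text.split(":")
    if len(parts) != 3:
        return None
    try:
        hours, minutes, seconds = (int(part) for part in parts)
    except ValueError:
        return None
    return hours * 3600 + minutes * 60 + seconds


def stale_relationships(lags, max_lag_seconds=24 * 3600):
    """Return the destinations whose lag exceeds the limit, sorted by name."""
    stale = []
    for destination, text in lags.items():
        seconds = lag_to_seconds(text)
        if seconds is not None and seconds > max_lag_seconds:
            stale.append(destination)
    return sorted(stale)


# Hypothetical destination paths and lag values.
SAMPLE_LAGS = {
    "svm_dr:vol1_dst": "0:15:42",
    "svm_dr:vol2_dst": "26:03:10",
    "svm_dr:vol3_dst": "-",
}

print(stale_relationships(SAMPLE_LAGS))
```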

Other Inspection Suggestions

If the user's environment permits, especially when there are multiple NetApp systems, it is recommended to deploy NetApp's own status and performance monitoring and management tool, NetApp Unified Manager. The tool is free (distributed as an OVA) and powerful. It collects historical performance data from the NetApp systems, including bandwidth and I/O, which helps judge current usage and provides a reliable basis for future expansion.

Inspection form

The following is an inspection form I put together myself. It is for reference only, and additions are welcome.
[Image: inspection form]

Origin blog.csdn.net/sjj222sjj/article/details/128865975