https://my.oschina.net/adailinux/blog/2231519
An earlier article described the procedure for replacing disks on a production server; in that case all of the machine's disks were replaced. Recently some of the disks in another machine failed. Its RAID type is RAID 10, and testing showed that only the failed disks need to be replaced. This document supplements the earlier one.
Installing MegaCLI
Download the installation package.
Installation process
# First, download the installation package
# Extract it
$ tar -zxf MegaCli8.07.10.tar.gz
$ cd MegaCli8.07.10/Linux/
$ rpm -ivh Lib_Utils-1.00-09.noarch.rpm MegaCli-8.02.21-1.noarch.rpm
# Make the command available on the system PATH
$ ln -s /opt/MegaRAID/MegaCli/MegaCli64 /usr/local/bin/MegaCli
$ MegaCli -v
MegaCLI SAS RAID Management Tool Ver 8.02.21 Oct 21, 2011
(c)Copyright 2011, LSI Corporation, All Rights Reserved.
Exit Code: 0x00
# Installation complete!
Resolving a package conflict:
$ rpm -ivh Lib_Utils-1.00-09.noarch.rpm MegaCli-8.02.21-1.noarch.rpm
Preparing... ################################# [100%]
file /opt/lsi/3rdpartylibs/x86_64/libsysfs.so.2.0.2 from install of Lib_Utils-1.00-09.noarch conflicts with file from package srvadmin-storelib-sysfs-9.1.0-2757.12163.el7.x86_64
The cause: Lib_Utils conflicts with the srvadmin package that ships with Dell servers. Uninstall the conflicting package, then run the installation again.
$ rpm -e srvadmin-storelib-sysfs-9.1.0-2757.12163.el7.x86_64 --nodeps
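When the conflict involves a different package or version, the name to pass to `rpm -e` can be read out of the error text. The following is a minimal sketch, not part of the original procedure: the function name `conflicting_packages` is made up for illustration, and it assumes rpm prints the English message wording shown above.

```shell
#!/bin/sh
# Hypothetical helper: read rpm's error output on stdin and print each
# package named after "conflicts with file from package", one per line.
# Assumes the English message format shown in the conflict output above.
conflicting_packages() {
    sed -n 's/.*conflicts with file from package \([^ ]*\).*/\1/p' | sort -u
}
```

One possible use is `rpm -ivh Lib_Utils-1.00-09.noarch.rpm MegaCli-8.02.21-1.noarch.rpm 2>&1 | conflicting_packages`. Review the printed names before removing anything.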
Usage guide
Basic Usage
# Check the RAID level
$ megacli -LDInfo -Lall -aALL
# Show RAID controller information
$ megacli -AdpAllInfo -aALL
# Show hard disk information
$ megacli -PDList -aALL
# Show battery information
$ megacli -AdpBbuCmd -aAll
# View the RAID controller log
$ megacli -FwTermLog -Dsply -aALL
# Show the number of adapters
$ megacli -adpCount
# Show the adapter time
$ megacli -AdpGetTime -aALL
# Show all adapter information
$ megacli -AdpAllInfo -aAll
# Show information for all logical drive groups
$ megacli -LDInfo -LALL -aAll
# Show information for all physical disks
$ megacli -PDList -aAll
# Check the charging status
$ megacli -AdpBbuCmd -GetBbuStatus -aALL |grep 'Charger Status'
# Show BBU status information
$ megacli -AdpBbuCmd -GetBbuStatus -aALL
# Show BBU capacity information
$ megacli -AdpBbuCmd -GetBbuCapacityInfo -aALL
# Show BBU design parameters
$ megacli -AdpBbuCmd -GetBbuDesignInfo -aALL
# Show the current BBU properties
$ megacli -AdpBbuCmd -GetBbuProperties -aALL
# Show the RAID controller model, RAID settings, and disk information
$ megacli -cfgdsply -aALL
## How the disk state changes from pulling the disk to inserting the replacement
Device         |Normal |Damage             |Rebuild |Normal
Virtual Drive  |Optimal|Degraded           |Degraded|Optimal
Physical Drive |Online |Failed Unconfigured|Rebuild |Online
# View the rebuild state of a physical disk:
$ megacli -PDRbld -ShowProg -PhysDrv [Enclosure Device ID:Slot Number] -a0
## A physical disk that is rebuilding shows: "Firmware state: Rebuild"
# Query the rebuild progress:
$ megacli -pdrbld -showprog -physdrv[E:S] -aALL
## The output looks like this:
Rebuild Progress on Device at Enclosure 32, Slot 5 Completed 77% in 101 Minutes.
# Show the rebuild progress as a text progress bar:
$ megacli -pdrbld -progdsply -physdrv[E:S] -aALL
## The screen shows something like this:
Rebuild progress of physical drives...
Enclosure:Slot Percent Complete Time Elps
032 :05 #######################87 %################******* 01:59:07
Press key to quit...
# View the RAID controller's rebuild parameters:
$ megacli -AdpAllinfo -aALL | grep -i rebuild
## The output looks like this:
Rebuild Rate : 30%
Auto Rebuild : Enabled
Rebuild Rate : Yes
Force Rebuild : Yes
# Set the RAID controller's rebuild rate to 60%:
$ megacli -AdpSetProp { RebuildRate -60} -aALL
## On success it returns:
Adapter 0: Set rebuild rate to 60% success.
MegaCLI Usage: http://blog.51cto.com/daixuan/1863567
Important parameters
Parameter | Meaning |
---|---|
Firmware state | Disk state |
Firmware state: Online, Spun Up | Disk is healthy |
Firmware state: Unconfigured(good), Spun Up | Disk is installed but not yet in use |
Firmware state: Unconfigured(bad) | Fault; corresponds to Non-Critical in hwcheck |
Firmware state: Failed | Fault; corresponds to Critical in hwcheck |
Firmware state: Rebuild | Rebuilding; usually shown while a disk is being replaced |
Enclosure Device ID: 32 | Enclosure device ID |
Slot Number: 1 | Disk slot number on the server |
Adapter #0 | Adapter number; corresponds to the -a parameter |
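The state values in the table can be turned into a small lookup for use in monitoring scripts. This is only a sketch: `describe_firmware_state` is an illustrative name, the short descriptions are paraphrases of the table, and any state the table does not list is reported as unknown.

```shell
#!/bin/sh
# Hypothetical helper: map the value of a "Firmware state" field to the
# meaning given in the table above. States not in the table print "unknown".
describe_firmware_state() {
    case "$1" in
        "Online, Spun Up")             echo "healthy" ;;
        "Unconfigured(good), Spun Up") echo "installed, not in use" ;;
        "Unconfigured(bad)")           echo "fault (non-critical)" ;;
        "Failed")                      echo "fault (critical)" ;;
        "Rebuild")                     echo "rebuilding" ;;
        *)                             echo "unknown" ;;
    esac
}
```

For example, `describe_firmware_state "Unconfigured(bad)"` prints `fault (non-critical)`.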
In practice: replacing a hard disk under RAID 10
Replacing a hard disk in a RAID 10 environment is simple. The disks are hot-swappable, so you only need to pull the failed disk and insert a replacement. The steps follow.
Environment
Server: R720
System: CentOS7
raid type: raid10
View hard drive information
To show the operation as clearly as possible, the output below is left unabridged.
$ MegaCli -PDList -aAll -NoLog
Adapter #0
Enclosure Device ID: 32
Slot Number: 0
Drive's postion: DiskGroup: 0, Span: 0, Arm: 0
Enclosure position: 0
Device Id: 0
WWN: 5000C50076CD09B4
Sequence Number: 1
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 28
Last Predictive Failure Event Seq Number: 4378
PD Type: SAS
Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
Coerced Size: 558.375 GB [0x45cc0000 Sectors]
Firmware state: Unconfigured(good), Spun Up
Device Firmware Level: ES66
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x5000c50076cd09b5
SAS Address(1): 0x0
Connected Port Number: 5(path0)
Inquiry Data: SEAGATE ST3600057SS ES666SL8SASQ
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: Foreign
Foreign Secure: Drive is not secured by a foreign lock key
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive Temperature :40C (104.00 F)
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's write cache : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Port-1 :
Port status: Active
Port's Linkspeed: Unknown
Drive has flagged a S.M.A.R.T alert : Yes
Enclosure Device ID: 32
Slot Number: 2
Enclosure position: 0
Device Id: 2
WWN: 5000C50076CD05BC
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS
Raw Size: 0 KB [0x0 Sectors]
Non Coerced Size: 0 KB [0x0 Sectors]
Coerced Size: 0 KB [0x0 Sectors]
Firmware state: Unconfigured(bad)
Device Firmware Level: ES66
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x5000c50076cd05bd
SAS Address(1): 0x0
Connected Port Number: 1(path0)
Inquiry Data: SEAGATE ST3600057SS ES666SL8SAVC
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: Unknown
Link Speed: Unknown
Media Type: Hard Disk Device
Drive: Not Supported
Drive Temperature :0C (32.00 F)
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's write cache : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: Unknown
Port-1 :
Port status: Active
Port's Linkspeed: Unknown
Drive has flagged a S.M.A.R.T alert : No
Enclosure Device ID: 32
Slot Number: 1
Drive's postion: DiskGroup: 0, Span: 0, Arm: 1
Enclosure position: 0
Device Id: 1
WWN: 5000C500983873BC
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS
Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
Coerced Size: 558.375 GB [0x45cc0000 Sectors]
Firmware state: Online, Spun Up
Device Firmware Level: VT31
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x5000c500983873bd
SAS Address(1): 0x0
Connected Port Number: 3(path0)
Inquiry Data: SEAGATE ST600MP0005 VT31S7M1CSLT
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: Unknown
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive Temperature :41C (105.80 F)
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's write cache : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Port-1 :
Port status: Active
Port's Linkspeed: Unknown
Drive has flagged a S.M.A.R.T alert : No
Enclosure Device ID: 32
Slot Number: 3
Drive's postion: DiskGroup: 0, Span: 1, Arm: 1
Enclosure position: 0
Device Id: 3
WWN: 5000C50076CE2F30
Sequence Number: 2
Media Error Count: 5
Other Error Count: 71
Predictive Failure Count: 15
Last Predictive Failure Event Seq Number: 4379
PD Type: SAS
Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
Coerced Size: 558.375 GB [0x45cc0000 Sectors]
Firmware state: Online, Spun Up
Device Firmware Level: ES66
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x5000c50076ce2f31
SAS Address(1): 0x0
Connected Port Number: 2(path0)
Inquiry Data: SEAGATE ST3600057SS ES666SL8SAKA
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive Temperature :48C (118.40 F)
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's write cache : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Port-1 :
Port status: Active
Port's Linkspeed: Unknown
Drive has flagged a S.M.A.R.T alert : Yes
Enclosure Device ID: 32
Slot Number: 4
Drive's postion: DiskGroup: 1, Span: 0, Arm: 0
Enclosure position: 0
Device Id: 4
WWN: 5000C5007E70F0F8
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS
Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
Coerced Size: 558.375 GB [0x45cc0000 Sectors]
Firmware state: Online, Spun Up
Device Firmware Level: ES66
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x5000c5007e70f0f9
SAS Address(1): 0x0
Connected Port Number: 0(path0)
Inquiry Data: SEAGATE ST3600057SS ES666SL9F1JB
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive Temperature :46C (114.80 F)
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's write cache : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Port-1 :
Port status: Active
Port's Linkspeed: Unknown
Drive has flagged a S.M.A.R.T alert : No
Enclosure Device ID: 32
Slot Number: 5
Drive's postion: DiskGroup: 1, Span: 0, Arm: 1
Enclosure position: 0
Device Id: 5
WWN: 5000C5007E708E3C
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS
Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
Coerced Size: 558.375 GB [0x45cc0000 Sectors]
Firmware state: Online, Spun Up
Device Firmware Level: ES66
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x5000c5007e708e3d
SAS Address(1): 0x0
Connected Port Number: 4(path0)
Inquiry Data: SEAGATE ST3600057SS ES666SL9F2RB
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive Temperature :45C (113.00 F)
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's write cache : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Port-1 :
Port status: Active
Port's Linkspeed: Unknown
Drive has flagged a S.M.A.R.T alert : No
Exit Code: 0x00
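The output above shows that the server has six disks (Device Id 0 through 5). The disks in slots 0 and 2 are not Online: slot 0 reports Unconfigured(good) with a S.M.A.R.T alert, and slot 2 reports Unconfigured(bad).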
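The full listing is long, so a condensed view helps when looking for disks that need attention. The sketch below is one way to do that with awk; `list_disk_states` is an illustrative name, and it assumes the field labels appear exactly as in the output above, with each "Firmware state" line following its own "Enclosure Device ID" and "Slot Number" lines.

```shell
#!/bin/sh
# Hypothetical helper: read `MegaCli -PDList` output on stdin and print one
# line per disk in the form "enclosure:slot<TAB>firmware state".
list_disk_states() {
    awk -F': *' '
        /^Enclosure Device ID:/ { enc = $2 }    # remember the current enclosure
        /^Slot Number:/         { slot = $2 }   # remember the current slot
        /^Firmware state:/      { printf "%s:%s\t%s\n", enc, slot, $2 }
    '
}
```

A possible invocation is `MegaCli -PDList -aAll -NoLog | list_disk_states`.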
Take the failed disks offline
$ MegaCli -PDOffline -PhysDrv[32:2] -a0
$ MegaCli -PDOffline -PhysDrv[32:0] -a0
The 32, the 2, and the -a0 in the first command above correspond to these fields:
Adapter #0
Enclosure Device ID: 32
Slot Number: 2
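Because the argument is assembled from two separate fields, a typo can easily point the command at the wrong disk. As a sketch, a small helper can build the `-PhysDrv[E:S]` string and reject anything that is not a number; `physdrv_arg` is an illustrative name, not part of MegaCLI.

```shell
#!/bin/sh
# Hypothetical helper: build the -PhysDrv[E:S] argument from an
# "Enclosure Device ID" and a "Slot Number". Both must be non-empty
# and consist of digits only; otherwise it prints usage and fails.
physdrv_arg() {
    for n in "$1" "$2"; do
        case "$n" in
            ''|*[!0-9]*)
                echo "usage: physdrv_arg ENCLOSURE SLOT" >&2
                return 1
                ;;
        esac
    done
    printf '%s[%s:%s]\n' '-PhysDrv' "$1" "$2"
}
```

For example, `physdrv_arg 32 2` prints `-PhysDrv[32:2]`, which matches the argument used in the first command above.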
Replace a failed hard drive
At this point the failed disk is OFFLINE. Looking at the server on site, the failed disk's LED flashes amber, while healthy disks show green. Pull the failed disk and insert a good one. The new disk's LED flashes green and the disk spins rapidly, which means it is in the rebuild state. Check the state as follows:
$ MegaCli -PDList -aAll -NoLog
...
Enclosure Device ID: 32
Slot Number: 3
...
Firmware state: Rebuild
...
Check rebuild progress
$ MegaCli -PDRbld -ShowProg -PhysDrv[32:3] -aAll
Rebuild Progress on Device at Enclosure 32, Slot 3 Completed 16% in 94 Minutes.
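For scripted monitoring, the percentage can be pulled out of that progress line. A minimal sketch, assuming the line keeps the "Completed N% in M Minutes." wording shown above; `rebuild_percent` is an illustrative name.

```shell
#!/bin/sh
# Hypothetical helper: read `MegaCli -PDRbld -ShowProg` output on stdin and
# print the completed percentage as a bare number. Prints nothing when no
# "Completed N% in ..." line is present.
rebuild_percent() {
    sed -n 's/.*Completed \([0-9][0-9]*\)% in .*/\1/p'
}
```

Piping the progress command into `rebuild_percent` yields a number that a script can compare or log.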
Disk replacement completed
$ MegaCli -PDList -aAll -NoLog | grep 'Firmware state'
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
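The same check can be reduced to a single number for use in a health script. This is a sketch only; `count_not_online` is an illustrative name, and it treats any state other than "Online, Spun Up" as needing attention.

```shell
#!/bin/sh
# Hypothetical helper: read `MegaCli -PDList` output on stdin and print how
# many disks report a firmware state other than "Online, Spun Up".
count_not_online() {
    awk '/^Firmware state:/ && !/Online, Spun Up/ { n++ } END { print n + 0 }'
}
```

With the output above, `MegaCli -PDList -aAll -NoLog | count_not_online` would be expected to print `0`, since all six disks are Online.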