ESXi Made Simple: Talking About ESXi Multipathing

https://mp.weixin.qq.com/s/7GgHPbX0dkMGww3TkClYvw

Today we are not talking about vSAN; instead, let's look at multipathing in the traditional ESXi storage architecture. I have been asked similar questions several times recently, so I felt it was worth writing this down.

    There is a lot to cover here and my understanding is incomplete in places, so if you have questions, let's work through them together.

 

This article is divided into several parts; pick the ones you need:

 

  • What is PSA

  • What is NMP

  • What is SATP

  • What is PSP

  • NMP I/O workflow

  • Multipathing policies supported by ESXi

  • What are A/A, A/P, and ALUA

  • Using RR together with ALUA

  • When path failover occurs

 

What is PSA

=====================

   Figure: ESXi PSA architecture

    

    ESXi uses the PSA (Pluggable Storage Architecture) to manage the different multipathing plug-ins (MPPs). The PSA is an open framework: third-party vendors can develop their own multipathing software (such as PowerPath) and plug it into this framework. An ESXi user can therefore use either third-party multipathing software or the multipathing software provided by VMware.

 

What is NMP

 

=====================

    The MPP that ships with ESXi is called the NMP (Native Multipathing Plug-In). The NMP is itself an extensible module that manages sub-plug-ins and coordinates how the host uses the different physical paths to a storage device.

 

The NMP manages two types of sub-plug-ins:

  • SATP: Storage Array Type Plug-ins

  • PSP: Path Selection Plug-ins

Note: Both SATPs and PSPs can be provided either by VMware or by third parties.

 

What is SATP

=====================

SATP is used to manage the paths for a specific array type   

  • The NMP determines which SATP to use for a given type of storage array, and that SATP manages the physical paths to the storage device.

  • The NMP ensures that each storage device is assigned either a specified SATP or a default one.

  • Each SATP corresponds to the characteristics of a particular class of storage arrays, so an SATP can perform array-specific operations directly.

  • The NMP can manage and use multiple SATPs at the same time.

 

An SATP's work includes:

 

  • Monitoring the state of each physical path

  • Reporting changes in physical path state

  • Performing the array-specific actions needed for failover; for example, on an A/P (active/passive) array, activating the standby path when a failure occurs.

 

The default SATPs that ship with VMware include:

  • VMW_SATP_LOCAL

  • VMW_SATP_DEFAULT_AA

  • VMW_SATP_DEFAULT_AP

  • VMW_SATP_ALUA

 

What is PSP

=====================

PSP is used to choose which path should be used for each storage device.

 

PSP work includes:

  • The PSP is responsible for selecting the physical path used for data transfer.

  • The NMP assigns a PSP to each storage device based on the device's SATP.

     

Run the following command to list the SATPs available on the host and the default PSP for each:

esxcli storage nmp satp list

 

The PSPs that ship with ESXi include:

  • VMW_PSP_MRU

  • VMW_PSP_FIXED

  • VMW_PSP_RR

 

The following is the NMP command output for a single LUN:

localcli_storage-nmp-device-list command output

 

NMP I/O workflow

=====================

  1. The NMP calls the PSP assigned to the storage device.

  2. The PSP selects an appropriate path according to the configured multipathing policy.

  3. The NMP issues the I/O request on the path selected by the PSP.

  4. If the I/O completes successfully, the NMP reports it as complete.

  5. If the I/O does not complete successfully, the NMP calls the SATP assigned to the storage device.

  6. The SATP interprets the returned I/O error and, if necessary, activates other paths.

  7. The PSP is called again to select a suitable path from the active paths and the I/O is reissued.

 

In short:

  • The SATP is responsible for determining which paths are available for I/O.

  • The SATP monitors path state changes and handles failover.

  • The PSP chooses which of the available paths to use.
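
The seven-step workflow above can be sketched as a small Python model. This is only an illustration under stated assumptions, not VMware's implementation: the `Path`, `Psp`, and `Satp` classes, the `nmp_issue_io` function, and the path names are all hypothetical, and the "PSP" here simply picks the first active path.

```python
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    state: str = "active"  # "active", "standby", or "dead"


class Psp:
    """Hypothetical PSP: returns the first path that is currently active."""

    def select_path(self, paths):
        for p in paths:
            if p.state == "active":
                return p
        return None


class Satp:
    """Hypothetical SATP: marks the failed path dead and, if no active
    path remains, activates one standby path (A/P-style failover)."""

    def handle_error(self, paths, failed):
        failed.state = "dead"
        if not any(p.state == "active" for p in paths):
            for p in paths:
                if p.state == "standby":
                    p.state = "active"
                    break


def nmp_issue_io(paths, psp, satp, send):
    """Model of steps 1-7: select, issue, and on error consult the SATP."""
    while True:
        path = psp.select_path(paths)      # steps 1-2 (step 7 on a retry)
        if path is None:
            raise RuntimeError("no usable path left")
        if send(path):                     # step 3
            return path.name               # step 4
        satp.handle_error(paths, path)     # steps 5-6


# Simulated transport (an assumption): only paths through vmhba2 succeed.
paths = [Path("vmhba1:C0:T0:L0"), Path("vmhba2:C0:T0:L0", state="standby")]
used = nmp_issue_io(paths, Psp(), Satp(),
                    send=lambda p: p.name.startswith("vmhba2"))
print(used)
```

In this run the first path fails, the SATP activates the standby path, and the PSP then selects it for the retry.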

 

Multipathing policies supported by ESXi

=========================

  1. Fixed Path (FIXED)

  2. Most Recently Used (MRU)

  3. Round Robin (RR)

 

Fixed Path:

=========================

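    The host sends I/O over the designated preferred path or, if none is designated, over the first working path it finds at boot time. When that path fails, the ESXi host switches to another available path. If the original path later recovers, ESXi switches back to it.

Note: If the host uses a default preferred path and the path's status turns to Dead, a new path is selected as preferred. However, if you explicitly designate the preferred path, it will remain preferred even when it becomes inaccessible.

Scenario using the Fixed policy:

    By default, the host goes through the right-hand switch to the array's right-hand controller (left figure). When that controller has a problem, the host switches to another path, through the other controller (middle figure). Once the controller is back to normal, the host switches back to the original path (right figure).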

 

Most Recently Used

=========================

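    MRU has no preferred path setting. The difference between MRU and Fixed is that after the failure is resolved, ESXi does not switch back to the original path but keeps using the current one.

Scenario using the MRU policy:

    By default, the host goes through the right-hand switch to the array's right-hand controller (left figure). When that controller has a problem, the host switches to another path, through the other controller (middle figure). Once the controller is back to normal, the host does not switch back to the original path and keeps using the path currently in use (right figure).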
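
The difference in failback behavior between Fixed and MRU can be sketched in a few lines of Python. This is a simplified model, not ESXi code: `choose_path`, `replay`, and the controller names "A" and "B" are hypothetical, and each event is simply the list of paths that are working at that moment.

```python
def choose_path(policy, paths_up, current, preferred=None):
    """Return the path to use for the next I/O.

    policy    -- "FIXED" or "MRU"
    paths_up  -- ordered list of paths that are currently working
    current   -- path used for the previous I/O (None at boot)
    preferred -- explicitly designated preferred path (FIXED only)
    """
    if not paths_up:
        raise RuntimeError("all paths are down")
    if policy == "FIXED" and preferred in paths_up:
        return preferred        # Fixed fails back as soon as preferred works
    if current in paths_up:
        return current          # otherwise keep the path already in use
    return paths_up[0]          # fail over to the first working path


def replay(policy, events, preferred=None):
    """Feed a sequence of path-availability snapshots to the policy."""
    current, history = None, []
    for paths_up in events:
        current = choose_path(policy, paths_up, current, preferred)
        history.append(current)
    return history


# Controller B works, then fails, then recovers.
events = [["B", "A"], ["A"], ["B", "A"]]
fixed_history = replay("FIXED", events, preferred="B")
mru_history = replay("MRU", events)
print(fixed_history)  # ['B', 'A', 'B']
print(mru_history)    # ['B', 'A', 'A']
```

Both policies fail over to A when B goes down; only Fixed returns to B after it recovers.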

 

Round Robin

=========================

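    RR is now the default multipathing policy for many storage arrays.

    RR uses an algorithm to distribute I/O automatically across all available paths (A/P arrays) or all paths (A/A arrays). Once a specified amount of traffic has been sent down one path, it switches to another path. There are two kinds of limit:

  • IOPS limit: the default. RR switches to another path after 1000 I/O operations have been issued on the current path.

  • Bytes limit: RR switches to another path after a specified number of bytes has been transferred.

Both limits can be changed manually; for details, see the KB: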

Adjusting Round Robin IOPS limit from default 1000 to 1 (2069356)

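Scenario using the RR policy:

    Same example as above: the host has multiple paths to the storage LUN, and RR rotates through the different paths to balance the load.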
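
The IOPS-limit rotation described above can be modeled roughly as follows. This is a minimal sketch assuming a simple per-path I/O counter; the `RoundRobin` class and the path names are hypothetical, and the limit is set to 2 instead of the default 1000 only to keep the output short.

```python
class RoundRobin:
    """Toy model of RR with an I/O-count limit per path."""

    def __init__(self, paths, iops_limit=1000):
        self.paths = list(paths)
        self.iops_limit = iops_limit
        self.index = 0    # index of the path currently in use
        self.issued = 0   # I/Os issued on the current path so far

    def next_path(self):
        # Move to the next path once the current one has reached its limit.
        if self.issued >= self.iops_limit:
            self.index = (self.index + 1) % len(self.paths)
            self.issued = 0
        self.issued += 1
        return self.paths[self.index]


rr = RoundRobin(["path1", "path2"], iops_limit=2)
sequence = [rr.next_path() for _ in range(6)]
print(sequence)  # ['path1', 'path1', 'path2', 'path2', 'path1', 'path1']
```

With a limit of 1 the same model alternates paths on every I/O, which is the behavior the KB title refers to.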

 

What are A/A, A/P, and ALUA

=========================

When discussing SATPs above, we said that the SATPs shipped by VMware include the following four:

  • VMW_SATP_LOCAL

  • VMW_SATP_DEFAULT_AA

  • VMW_SATP_DEFAULT_AP

  • VMW_SATP_ALUA

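Here:

AA stands for Active-Active storage

AP stands for Active-Passive storage

Those two are easy to understand, but what does ALUA (Asymmetric Logical Unit Access) mean?

    With ALUA, a host can access the same LUN through multiple controllers at the same time. For example, suppose an array has two controllers (A and B) and both present themselves as Active, but each LUN is owned by only one controller (say, controller A). I/O sent through controller A reaches the LUN directly; I/O sent through controller B is forwarded inside the array and ultimately still reaches the LUN through controller A. A path that goes directly through controller A is called an "optimized path", and a path that goes through controller B is called a "non-optimized path". This mode is called Asymmetric Logical Unit Access (ALUA).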

 

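In the example above, the owner of the LUN is the left-hand controller.

Let's walk through the diagram. In the left figure, the storage path is not optimized: we have found a perfectly acceptable path to the LUN, but on arrival we discover that the left-hand storage controller currently owns this LUN. The right-hand storage controller can still reach the LUN, but only by going through the left-hand controller, so this path is not optimized. ALUA ensures that the optimized path is used whenever possible.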

 

RR+ALUA

=========================

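    The figure below shows RR and ALUA used together:

    The left-hand storage controller is the owner of the LUN, and we have two optimized paths that reach this controller, so RR rotates between these two paths to access the LUN.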
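
A rough Python sketch of this combination: filter the paths down to those that reach the owning controller, then rotate over them. The path names, the `optimized_paths` helper, and the fallback to non-optimized paths when no optimized path exists are assumptions made for illustration; real path states are reported by the array through ALUA.

```python
from itertools import cycle

# Each entry is (path name, controller the path lands on); names are made up.
paths = [
    ("vmhba1->A", "A"),
    ("vmhba2->A", "A"),
    ("vmhba1->B", "B"),
    ("vmhba2->B", "B"),
]
lun_owner = "A"  # the left-hand controller owns the LUN in the example


def optimized_paths(paths, owner):
    """Return paths that reach the owning controller directly.

    Assumption: if no optimized path exists, fall back to every path
    so that I/O can still flow over non-optimized paths.
    """
    optimized = [name for name, controller in paths if controller == owner]
    return optimized or [name for name, _ in paths]


rotation = cycle(optimized_paths(paths, lun_owner))
used = [next(rotation) for _ in range(4)]
print(used)  # ['vmhba1->A', 'vmhba2->A', 'vmhba1->A', 'vmhba2->A']
```

Only the two paths to controller A appear in the rotation; the paths through controller B stay idle while an optimized path is available.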

 

When does path failover occur?

=========================

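    People often ask me why the host did not switch paths when a problem occurred, so let me take this opportunity to explain. Put simply, an ESXi host performs a path failover only when it receives specific SCSI codes.

    As we know, after the host sends an I/O command to the storage, if something goes wrong, the HBA, the array, or another device returns a specific SCSI code. For example: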

2017-07-20T22:28:58.155Z cpu1:33548)ScsiDeviceIO: 2652: Cmd(0x43b72ec112c0) 0x28, CmdSN 0x388202 from world 0 to dev "naa.5000039698194c41" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x3 0x11 0x1.

    From the SCSI code we can look up the cause of the failure. Someone has built a tool that decodes it quickly and conveniently:

http://www.virten.net/vmware/esxi-scsi-sense-code-decoder/
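
Before decoding, the status fields have to be pulled out of the log line. The following Python sketch does that for the sample line above with a regular expression. The `PATTERN` and `parse_scsi_status` names are hypothetical, the pattern is written against this one sample format only, and the field labels assume the usual reading of H, D, and P as host, device, and plug-in status followed by sense key, ASC, and ASCQ.

```python
import re

LOG = ('2017-07-20T22:28:58.155Z cpu1:33548)ScsiDeviceIO: 2652: '
       'Cmd(0x43b72ec112c0) 0x28, CmdSN 0x388202 from world 0 to dev '
       '"naa.5000039698194c41" failed H:0x0 D:0x2 P:0x0 '
       'Valid sense data: 0x3 0x11 0x1.')

HEX = r'0x[0-9a-fA-F]+'
PATTERN = re.compile(
    r'dev "(?P<dev>[^"]+)" failed '
    rf'H:(?P<host>{HEX}) D:(?P<device>{HEX}) P:(?P<plugin>{HEX})'
    rf'(?: Valid sense data: (?P<key>{HEX}) (?P<asc>{HEX}) (?P<ascq>{HEX}))?'
)


def parse_scsi_status(line):
    """Extract device name and status codes; returns None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert every hex field to an int; keep the device name as text.
    return {
        name: (value if name == "dev" or value is None else int(value, 16))
        for name, value in fields.items()
    }


status = parse_scsi_status(LOG)
print(status)
```

The resulting numbers can then be looked up in a decoder such as the one linked above.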

 

    Note that SCSI codes are an international industry standard, not something specific to VMware. For details, see the following site:

http://www.t10.org/lists/1spc-lst.htm

 

    Back to path failover: ESXi switches paths only when it receives specific SCSI codes. For example:

  • H:0x1 D:0x0 P:0x0 Valid sense data: 0x0 0x0 0x0  DID_NO_CONNECT

  • H:0x0 D:0x2 P:0x0 Valid sense data: 0x3 0x4 0x3 - MEDIUM ERROR - LOGICAL UNIT NOT READY

    So if no path failover happened when a problem occurred, check which SCSI code was returned at that time.

 

For details, see these two KBs:

SCSI events that can trigger ESX server to fail a LUN over to another path (1003433)

Understanding SCSI host-side NMP errors/conditions in ESX/ESXi 4.x, ESXi 5.x, and 6.x (1029039)

 

That is all for this update; everyone is welcome to join the discussion.

 

References:

https://docs.vmware.com/en/VMware-vSphere/6.5/com.vmware.vsphere.storage.doc/GUID-9DED1F73-7375-4957-BF69-41B56C3E5224.html

https://deinoscloud.wordpress.com/2011/02/28/vmware-psa-mpp-nmp-psp-mru-and-tutti-quanti/

https://www.vmadmin.co.uk/resources/35-esxserver/364-managing-nmp-changing-the-default-psp-for-a-satp-change-from-fixed-to-round-robin

https://theithollow.com/2012/03/08/path-selection-policy-with-alua/

https://blog.fosketts.net/2011/06/06/vmware-esx-vsphere-satp-psp-support-matrix/

https://kb.vmware.com/s/article/2069356

https://www.virten.net/vmware/vmware-esxi-scsi-sense-code-decoder-v2/

https://blogs.vmware.com/vsphere/2012/02/configuration-settings-for-alua-devices.html

http://www.yellow-bricks.com/2009/09/29/whats-that-alua-exactly/

https://virtualgeek.typepad.com/virtual_geek/2009/09/a-couple-important-alua-and-srm-notes.html


Origin blog.csdn.net/z136370204/article/details/104064428