[verbs] ibv_get_async_event()

Original: ibv_get_async_event() - RDMAmojo

Description


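ibv_get_async_event() reads the next asynchronous event of an RDMA device context.
Once ibv_open_device() has been called, every asynchronous event is queued on that context, and each call to ibv_get_async_event() returns one event, in the order in which the events were generated. Even if ibv_get_async_event() is called long after an event was generated, the oldest pending event is still returned first. Unfortunately, events carry no notion of time, so the user cannot know when an event occurred.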
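Because an event carries no timestamp, the closest an application can get is to record the time at which it read the event. The following is a minimal sketch of that idea, not something libibverbs provides: `struct stamped_event`, `monotonic_ns()` and `stamp_event()` are hypothetical names, and the recorded time is only an upper bound on when the event really happened.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical wrapper: pairs an event pointer with the time it was read. */
struct stamped_event {
	const void *event;   /* would point at a struct ibv_async_event */
	uint64_t    read_ns; /* CLOCK_MONOTONIC time at which it was read */
};

/* Return the current CLOCK_MONOTONIC time in nanoseconds, or 0 on failure. */
static uint64_t monotonic_ns(void)
{
	struct timespec ts;

	if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
		return 0;
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Stamp an event; meant to be called right after the event was read. */
static struct stamped_event stamp_event(const void *event)
{
	struct stamped_event s = { event, monotonic_ns() };

	return s;
}
```

In a read loop, stamp_event(&event) would be called immediately after ibv_get_async_event() returns 0, before the event is handled and acknowledged.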


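By default, ibv_get_async_event() is a blocking function: if there is no asynchronous event to read, it waits until the next one is generated. Having a dedicated thread that waits for the next event is the better design. However, events can also be read in a non-blocking way. Use fcntl() to configure the file descriptor of the event file in the device context (context->async_fd) as non-blocking, then use read()/poll()/epoll()/select() on this file descriptor to determine whether an event is waiting to be read. An example of how to do this appears later in this post.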
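The non-blocking setup described above can be factored into two small helpers. This is a sketch using only POSIX calls; `set_nonblocking()` and `wait_readable()` are hypothetical helper names, and the descriptor passed to them would be the async_fd member of the device context.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>

/* Add O_NONBLOCK to the flags of fd; returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/*
 * Wait up to timeout_ms milliseconds for fd to become readable.
 * Returns 1 if it is readable, 0 on timeout, -1 on error.
 */
static int wait_readable(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
	int ret;

	do {
		ret = poll(&pfd, 1, timeout_ms);
	} while (ret < 0 && errno == EINTR); /* retry if a signal interrupted poll() */

	if (ret < 0)
		return -1;
	if (ret == 0)
		return 0;
	return (pfd.revents & POLLIN) ? 1 : -1;
}
```

With these helpers, a caller would run set_nonblocking(ctx->async_fd) once and then call ibv_get_async_event() only when wait_readable(ctx->async_fd, timeout_ms) returns 1.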


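Calls to ibv_get_async_event() are atomic: even when it is called from multiple threads, the same event is guaranteed not to be read by more than one thread.
Every event received with ibv_get_async_event() must be acknowledged with ibv_ack_async_event().
Here is the full description of struct ibv_async_event: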

Name         Description
element      A union of several fields, only one of which is valid, depending on the event type:
               • CQ events: element.cq is valid
               • QP events: element.qp is valid
               • SRQ events: element.srq is valid
               • Port events: element.port_num is valid
               • RDMA device events: no field is valid
event_type   Enumerated value which describes the type of the event

Here is a full description of the possible events:

QP events

Here is the description of the affiliated events that may occur for QPs. For these events, the field event->element.qp contains the handle of the QP that got the asynchronous event. These events are generated only for the context that the QP belongs to.

IBV_EVENT_COMM_EST

A QP whose state is IBV_QPS_RTR received the first packet in its Receive Queue, and the packet was processed without any error.

This event is mainly relevant to connection-oriented QPs, i.e. RC and UC QPs. It may occur for UD QPs as well; this is driver-implementation specific.

IBV_EVENT_SQ_DRAINED

A QP whose state was changed from IBV_QPS_RTS to IBV_QPS_SQD has completed sending all of the outstanding messages that were in progress in its Send Queue when the state change was requested. For an RC QP, this means that all of those messages received acknowledgments, if applicable.

Most of the time, this event is generated when the (internal) QP state changes from SQD.draining to SQD.drained. However, it may also be generated if the transition to the IBV_QPS_SQD state was aborted because of a transition (either by the RDMA device or by the user) into the IBV_QPS_SQE, IBV_QPS_ERR or IBV_QPS_RESET QP states.

After this event, and while the QP is in the IBV_QPS_SQD state, it is safe for the user to start modifying the Send Queue attributes, since no messages are in progress.

IBV_EVENT_PATH_MIG

Indicates that the connection has migrated to the alternate path. This event is relevant only to connection-oriented QPs, i.e. RC and UC QPs.

This means that the alternate path attributes are now being used as the primary path attributes. If another alternate path needs to be loaded, the user can now set its attributes.

IBV_EVENT_QP_LAST_WQE_REACHED

A QP, which is associated with an SRQ, was transitioned to the IBV_QPS_ERR state, either automatically by the RDMA device or explicitly by the user, and one of the following occurred:

  • A completion with error was generated for the last WQE
  • The QP transitioned to the IBV_QPS_ERR state and there are no more WQEs on the Receive Queue of that QP

This event means that this QP will not consume any more WQEs from the SRQ.

If a QP experienced an error and this event was not generated, the user must destroy all of the QPs that are associated with this SRQ, and the SRQ itself, in order to reclaim all of the WQEs associated with that QP.

IBV_EVENT_QP_FATAL

A QP experienced an error that prevents the generation of completions while accessing or processing the Work Queue, either Send or Receive Queue.

If the problem that caused this event is in the CQ of that Work Queue, the appropriate CQ will get the IBV_EVENT_CQ_ERR event too.

IBV_EVENT_QP_REQ_ERR

The transport layer of the RDMA device detected a transport error violation in the responder side. This error may be one of the following:

  • Unsupported or reserved opcode
  • Out of sequence opcode

These errors are rare and may happen when there are problems in the subnet or when an RDMA device sends illegal packets.

When this happens, the RDMA device automatically transitions the QP to the IBV_QPS_ERR state.

This event is relevant only to RC QPs.

IBV_EVENT_QP_ACCESS_ERR

The transport layer of the RDMA device detected a request error violation in the responder side. This error may be one of the following:

  • Misaligned atomic request
  • Too many RDMA Read or Atomic requests
  • R_Key violation
  • Length errors without immediate data

These errors usually result from bugs in the user code.

When this happens, the RDMA device automatically transitions the QP to the IBV_QPS_ERR state.

This event is relevant only to RC QPs.

IBV_EVENT_PATH_MIG_ERR

A QP that has alternate path attributes loaded tried to perform a path migration, triggered either by the RDMA device or explicitly by the user, and an error prevented it from moving to the alternate path.

This error usually happens when the alternate path attributes on the two sides are inconsistent.

CQ events

Here is the description of the affiliated events that may occur for CQs. For these events, the field event->element.cq contains the handle of the CQ that got the asynchronous event. These events are generated only for the context that the CQ belongs to.

IBV_EVENT_CQ_ERR

An error occurred when writing a completion to the CQ. This event may occur when there is a protection error (a rare condition) or when there is a CQ overrun (the most likely cause).

When the CQ has an error, it isn't guaranteed that completions can still be polled from that CQ. All of the QPs that are associated with this CQ, through either their RQ or their SQ, will get the IBV_EVENT_QP_FATAL event too.

SRQ events

Here is the description of the affiliated events that may occur for SRQs. For these events, the field event->element.srq contains the handle of the SRQ that got the asynchronous event. These events are generated only for the context that the SRQ belongs to.

IBV_EVENT_SRQ_LIMIT_REACHED

An SRQ was armed, and the number of RRs in that SRQ dropped below its limit value. When this event is generated, the limit value of the SRQ is set to zero.

Most likely, when this event occurs, the user will post more RRs to that SRQ and rearm it.

IBV_EVENT_SRQ_ERR

An error occurred that prevents the RDMA device from dequeuing RRs from that SRQ and from reporting receive completions.

If an SRQ experiences an error, all of the QPs that are associated with it are transitioned to the IBV_QPS_ERR state, and the IBV_EVENT_QP_FATAL asynchronous event is generated for them.

Port events

Here is the description of the unaffiliated events that may occur for RDMA device ports. For these events, the field event->element.port_num contains the number of the port that got the asynchronous event. These events are generated for every context that uses the RDMA device whose port got the event.

IBV_EVENT_PORT_ACTIVE

The link became active and is now available to send/receive packets.

The port_attr.state was in one of the following states: IBV_PORT_DOWN, IBV_PORT_INIT or IBV_PORT_ARMED, and it moved to either IBV_PORT_ACTIVE or IBV_PORT_ACTIVE_DEFER. This can happen when the SM configures the port.

This event will be generated by the device only if IBV_DEVICE_PORT_ACTIVE_EVENT is set in dev_cap.device_cap_flags.

IBV_EVENT_LID_CHANGE

LID was changed on a port by the SM. If this is not the first time that the SM configures the port LID, this may indicate that there is a new SM in the subnet, or the SM reconfigures the subnet. QPs which send/receive data may experience connection failures (if the LIDs in the subnet were changed).

IBV_EVENT_PKEY_CHANGE

The P_Key table was changed on a port by the SM. Since QPs use P_Key table indexes rather than absolute values, it is suggested that the client check that the P_Key indexes used by its QPs weren't changed.

IBV_EVENT_GID_CHANGE

The GID table was changed on a port by the SM. Since QPs use GID table indexes rather than absolute values (as the source GID), it is suggested that the client check that the GID indexes used by its QPs weren't changed.

IBV_EVENT_SM_CHANGE

There is a new SM in the subnet to which the port belongs, and the client should reregister to all subscriptions previously requested from this port, for example (but not limited to) joining a multicast group.

IBV_EVENT_CLIENT_REREGISTER

The SM requests that the client reregister to all subscriptions previously requested from this port, for example (but not limited to) joining a multicast group. This event may be generated when the SM suffered a failure that caused it to lose its records, or when there is a new SM in the subnet.

This event will be generated by the device only if the bit that indicates that client reregister is supported is set in port_attr.port_cap_flags.

IBV_EVENT_PORT_ERR

The link became inactive and is now unavailable to send/receive packets.

The port_attr.state was in either the IBV_PORT_ACTIVE or the IBV_PORT_ACTIVE_DEFER state, and it moved to one of the following states: IBV_PORT_DOWN, IBV_PORT_INIT or IBV_PORT_ARMED. This can happen when there are problems with the link (for example, the cable was removed).

This does not change the state of the QPs that are associated with this port. However, if they are reliable and try to send data, they may experience a retry-exceeded error.

Device events

Here are the unaffiliated events that may occur in RDMA devices. These events are generated for every context that uses the RDMA device that got the event.

IBV_EVENT_DEVICE_FATAL

The RDMA device suffered an error that isn't related to any of the above asynchronous events. When this event occurs, the behavior of the RDMA device is undefined, and it is highly recommended to terminate the process immediately, since an attempt to destroy the RDMA resources may fail.

Summary

The following table summarizes the behavior of the asynchronous events:

Event name                      Element type   Event type   Protocol
IBV_EVENT_COMM_EST              QP             Info         IB, RoCE
IBV_EVENT_SQ_DRAINED            QP             Info         IB, RoCE
IBV_EVENT_PATH_MIG              QP             Info         IB, RoCE
IBV_EVENT_QP_LAST_WQE_REACHED   QP             Info         IB, RoCE
IBV_EVENT_QP_FATAL              QP             Error        IB, RoCE, iWARP
IBV_EVENT_QP_REQ_ERR            QP             Error        IB, RoCE, iWARP
IBV_EVENT_QP_ACCESS_ERR         QP             Error        IB, RoCE, iWARP
IBV_EVENT_PATH_MIG_ERR          QP             Error        IB, RoCE
IBV_EVENT_CQ_ERR                CQ             Error        IB, RoCE, iWARP
IBV_EVENT_SRQ_LIMIT_REACHED     SRQ            Info         IB, RoCE, iWARP
IBV_EVENT_SRQ_ERR               SRQ            Error        IB, RoCE, iWARP
IBV_EVENT_PORT_ACTIVE           Port           Info         IB, RoCE, iWARP
IBV_EVENT_LID_CHANGE            Port           Info         IB
IBV_EVENT_PKEY_CHANGE           Port           Info         IB
IBV_EVENT_GID_CHANGE            Port           Info         IB, RoCE
IBV_EVENT_SM_CHANGE             Port           Info         IB
IBV_EVENT_CLIENT_REREGISTER     Port           Info         IB
IBV_EVENT_PORT_ERR              Port           Error        IB, RoCE, iWARP
IBV_EVENT_DEVICE_FATAL          Device         Error        IB, RoCE, iWARP

Parameters

Name      Direction   Description
context   in          RDMA device context that was returned from ibv_open_device()
event     out         The asynchronous event that occurred

Return value

Value   Description
0       On success
-1      If blocking mode: there is an error
        If non-blocking mode: there isn't any async event to read
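Since -1 covers both cases, code running in non-blocking mode needs another way to tell "no event yet" from a real failure. The sketch below assumes that errno still holds the EAGAIN/EWOULDBLOCK value set by the underlying read on the non-blocking descriptor when ibv_get_async_event() returns; that is an assumption about the library's behavior and should be verified against the libibverbs version in use. `classify_async_read()` and its enum are hypothetical names.

```c
#include <errno.h>

enum async_read_status {
	ASYNC_EVENT_READ,  /* an event was returned and must be acknowledged */
	ASYNC_NO_EVENT,    /* non-blocking mode: nothing to read right now */
	ASYNC_READ_FAILED  /* a real error occurred */
};

/*
 * Interpret the return value of ibv_get_async_event() together with the
 * errno value captured immediately after the call.
 */
static enum async_read_status classify_async_read(int ret, int saved_errno)
{
	if (ret == 0)
		return ASYNC_EVENT_READ;
	if (saved_errno == EAGAIN || saved_errno == EWOULDBLOCK)
		return ASYNC_NO_EVENT;
	return ASYNC_READ_FAILED;
}
```

A caller would write ret = ibv_get_async_event(ctx, &event); followed directly by classify_async_read(ret, errno), so that no other call can overwrite errno in between.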

Examples

1) Read asynchronous events (in a blocking way) and print their content:

#include <stdio.h>
#include <infiniband/verbs.h>

/* helper function to print the content of the async event */
static void print_async_event(struct ibv_context *ctx,
			      struct ibv_async_event *event)
{
	switch (event->event_type) {
	/* QP events */
	case IBV_EVENT_QP_FATAL:
		printf("QP fatal event for QP with handle %p\n", event->element.qp);
		break;
	case IBV_EVENT_QP_REQ_ERR:
		printf("QP invalid request error for QP with handle %p\n", event->element.qp);
		break;
	case IBV_EVENT_QP_ACCESS_ERR:
		printf("QP access error event for QP with handle %p\n", event->element.qp);
		break;
	case IBV_EVENT_COMM_EST:
		printf("QP communication established event for QP with handle %p\n", event->element.qp);
		break;
	case IBV_EVENT_SQ_DRAINED:
		printf("QP Send Queue drained event for QP with handle %p\n", event->element.qp);
		break;
	case IBV_EVENT_PATH_MIG:
		printf("QP Path migration loaded event for QP with handle %p\n", event->element.qp);
		break;
	case IBV_EVENT_PATH_MIG_ERR:
		printf("QP Path migration error event for QP with handle %p\n", event->element.qp);
		break;
	case IBV_EVENT_QP_LAST_WQE_REACHED:
		printf("QP last WQE reached event for QP with handle %p\n", event->element.qp);
		break;
 
	/* CQ events */
	case IBV_EVENT_CQ_ERR:
		printf("CQ error for CQ with handle %p\n", event->element.cq);
		break;
 
	/* SRQ events */
	case IBV_EVENT_SRQ_ERR:
		printf("SRQ error for SRQ with handle %p\n", event->element.srq);
		break;
	case IBV_EVENT_SRQ_LIMIT_REACHED:
		printf("SRQ limit reached event for SRQ with handle %p\n", event->element.srq);
		break;
 
	/* Port events */
	case IBV_EVENT_PORT_ACTIVE:
		printf("Port active event for port number %d\n", event->element.port_num);
		break;
	case IBV_EVENT_PORT_ERR:
		printf("Port error event for port number %d\n", event->element.port_num);
		break;
	case IBV_EVENT_LID_CHANGE:
		printf("LID change event for port number %d\n", event->element.port_num);
		break;
	case IBV_EVENT_PKEY_CHANGE:
		printf("P_Key table change event for port number %d\n", event->element.port_num);
		break;
	case IBV_EVENT_GID_CHANGE:
		printf("GID table change event for port number %d\n", event->element.port_num);
		break;
	case IBV_EVENT_SM_CHANGE:
		printf("SM change event for port number %d\n", event->element.port_num);
		break;
	case IBV_EVENT_CLIENT_REREGISTER:
		printf("Client reregister event for port number %d\n", event->element.port_num);
		break;
 
	/* RDMA device events */
	case IBV_EVENT_DEVICE_FATAL:
		printf("Fatal error event for device %s\n", ibv_get_device_name(ctx->device));
		break;
 
	default:
		printf("Unknown event (%d)\n", event->event_type);
	}
}
 
 
 
/* the actual code that reads the events in a loop and prints them */
struct ibv_async_event event;
int ret;
 
while (1) {
	/* wait for the next async event */
	ret = ibv_get_async_event(ctx, &event);
	if (ret) {
		fprintf(stderr, "Error, ibv_get_async_event() failed\n");
		return -1;
	}
 
	/* print the event */
	print_async_event(ctx, &event);
 
	/* ack the event */
	ibv_ack_async_event(&event);
}

2) Read asynchronous events (in a non-blocking way) and print their content:

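/* needs <fcntl.h> for fcntl() and <poll.h> for poll() */
struct ibv_async_event event;
int flags;
int ret;
 
printf("Changing the mode of events read to be non-blocking\n");
 
/* change the blocking mode of the async event queue */
flags = fcntl(ctx->async_fd, F_GETFL);
if (flags < 0) {
	fprintf(stderr, "Error, failed to read the flags of the async event queue file descriptor\n");
	return -1;
}
ret = fcntl(ctx->async_fd, F_SETFL, flags | O_NONBLOCK);
if (ret < 0) {
	fprintf(stderr, "Error, failed to change file descriptor of async event queue\n");
	return -1;
}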
 
while (1) {
	struct pollfd my_pollfd;
	int ms_timeout = 100;
 
	/*
	 * poll the queue until it has an event, waiting up to ms_timeout
	 * milliseconds in each iteration
	 */
	my_pollfd.fd      = ctx->async_fd;
	my_pollfd.events  = POLLIN;
	my_pollfd.revents = 0;
	do {
		ret = poll(&my_pollfd, 1, ms_timeout);
	} while (ret == 0);
	if (ret < 0) {
		fprintf(stderr, "poll failed\n");
		return -1;
	}
 
	/* we know that there is an event, so we just need to read it */
	ret = ibv_get_async_event(ctx, &event);
	if (ret) {
		fprintf(stderr, "Error, ibv_get_async_event() failed\n");
		return -1;
	}
 
	/* print the event */
	print_async_event(ctx, &event);
 
	/* ack the event */
	ibv_ack_async_event(&event);
}


FAQs

Do I have to read the asynchronous events?

No. The asynchronous events mechanism is a way to provide extra information about things that happen in the CQs, QPs, SRQs, ports and devices. The user doesn't have to use it, but doing so is highly recommended.

Can I read the events only once in a while (for example, every few minutes)?

Yes, you can. The downside is that you won't know when an event happened, and by the time you read it the information may no longer be relevant.

Is this verb thread-safe?

Yes, this verb is thread-safe (just like the rest of the verbs).

I got a QP/CQ/SRQ event. Will other processes get this event too?

No. Affiliated events are generated only for the context that the resource belongs to. Other contexts won't even know that the event occurred.

 

Origin blog.csdn.net/bandaoyu/article/details/120641400