ACK Processing in Linux Kernel Protocol Stack TCP Data Reception

Table of Contents

1 Overview of ACK response

2 ACK processing tcp_ack()

2.1 Parameter flag

2.2 tcp_ack() processing flow

2.2.1 Update sending window

2.2.2 Delete the confirmed data segment + RTT sampling

2.2.3 Congestion Control


1 Overview of ACK response

While processing an incoming TCP segment, if the segment carries ACK information, tcp_ack() is called to handle it. In practice almost every segment carries an ACK, because carrying one adds no overhead, so this processing runs for every input segment except special ones such as RST (which tears the connection down directly). This note looks at how TCP processes incoming ACKs.

2 ACK processing tcp_ack()

2.1 Parameter flag

tcp_ack() takes an important parameter, flag, which is used throughout ACK processing. It records the information gathered from the input segment (whether it carries data, whether it is a duplicate ACK, whether it carries SACK information, and so on). Later steps such as congestion control and RTT sampling consult it. The flag may be a combination of the following values.

Flag                        Value   Description
FLAG_DATA                   0x01    The ACK segment carries data
FLAG_WIN_UPDATE             0x02    The ACK segment updated the sending window: the left boundary, the right boundary (the advertised window grew), or both
FLAG_DATA_ACKED             0x04    The ACK segment confirms new data
FLAG_RETRANS_DATA_ACKED     0x08    The ACK segment confirms new data, some of which had been retransmitted
FLAG_SYN_ACKED              0x10    The ACK segment confirms the SYN segment
FLAG_DATA_SACKED            0x20    The ACK segment carries new SACK information
FLAG_ECE                    0x40    The ACK segment carries the ECE flag
FLAG_DATA_LOST              0x80    SACK detected data loss
FLAG_SLOWPATH               0x100   The ACK segment is processed by the slow path
FLAG_ONLY_ORIG_SACKED       0x200   The SACK covers only non-retransmitted data sent before the RTO
FLAG_SND_UNA_ADVANCED       0x400   The ACK segment advanced snd_una, that is, the left boundary of the sending window can move to the right
FLAG_DSACKING_ACK           0x800   The ACK segment contains DSACK information
FLAG_NONHEAD_RETRANS_ACKED  0x1000  Retransmitted data other than the head of the queue was acknowledged
FLAG_SACK_RENEGING          0x2000  A segment previously confirmed by SACK was discarded by the peer (the protocol allows this)

In addition, some basic flag combinations are defined:

#define FLAG_ACKED		(FLAG_DATA_ACKED|FLAG_SYN_ACKED)
// Used to determine whether the input segment is a duplicate ACK
#define FLAG_NOT_DUP		(FLAG_DATA|FLAG_WIN_UPDATE|FLAG_ACKED)
#define FLAG_CA_ALERT		(FLAG_DATA_SACKED|FLAG_ECE)
#define FLAG_FORWARD_PROGRESS	(FLAG_ACKED|FLAG_DATA_SACKED)
#define FLAG_ANY_PROGRESS	(FLAG_FORWARD_PROGRESS|FLAG_SND_UNA_ADVANCED)
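
To make the combinations more concrete, here is a minimal sketch of how such masks are typically tested. The flag values are copied from the definitions above; the two helper functions are hypothetical names introduced only for illustration and do not exist in the kernel.

```c
#include <stdbool.h>

/* Flag values and combinations copied from the definitions above. */
#define FLAG_DATA             0x01
#define FLAG_WIN_UPDATE       0x02
#define FLAG_DATA_ACKED       0x04
#define FLAG_SYN_ACKED        0x10
#define FLAG_DATA_SACKED      0x20
#define FLAG_SND_UNA_ADVANCED 0x400

#define FLAG_ACKED            (FLAG_DATA_ACKED | FLAG_SYN_ACKED)
#define FLAG_NOT_DUP          (FLAG_DATA | FLAG_WIN_UPDATE | FLAG_ACKED)
#define FLAG_FORWARD_PROGRESS (FLAG_ACKED | FLAG_DATA_SACKED)

/* Hypothetical helper: true when none of the "not a duplicate" bits is set,
 * i.e. no data, no window update, and nothing newly acknowledged. */
static bool ack_looks_duplicate(int flag)
{
	return !(flag & FLAG_NOT_DUP);
}

/* Hypothetical helper: true when the ACK acknowledged or SACKed something. */
static bool ack_made_forward_progress(int flag)
{
	return (flag & FLAG_FORWARD_PROGRESS) != 0;
}
```

Note that an ACK that only adds new SACK blocks sets FLAG_DATA_SACKED, which is not part of FLAG_NOT_DUP, so under this sketch it still counts as a duplicate ACK while also counting as forward progress.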

2.2 tcp_ack() processing flow

At its core, tcp_ack() does three things:

  1. Update the sending window;
  2. Remove the acknowledged data (including retransmitted data) from the send queue and perform RTT sampling;
  3. Perform congestion control.

/* This routine deals with incoming acks, but not outgoing ones. */
static int tcp_ack(struct sock *sk, struct sk_buff *skb, int flag)
{
	struct inet_connection_sock *icsk = inet_csk(sk);
	struct tcp_sock *tp = tcp_sk(sk);
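	// Smallest sequence number in the TCB that has not yet been acknowledged
	u32 prior_snd_una = tp->snd_una;
	// Sequence number of the ACK segment
	u32 ack_seq = TCP_SKB_CB(skb)->seq;
	// Acknowledgment number of the ACK segment
	u32 ack = TCP_SKB_CB(skb)->ack_seq;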
	u32 prior_in_flight;
	u32 prior_fackets;
	int prior_packets;
	int frto_cwnd = 0;

	/* If the ack is newer than sent or older than previous acks
	 * then we can probably ignore it.
	 */
	// The ACK covers data that has not been sent yet; it is meaningless, so return directly
	if (after(ack, tp->snd_nxt))
		goto uninteresting_ack;
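	// This acknowledgment number has already been received. It may be a duplicate ACK,
	// or a normal one, for example an ACK segment that was delayed.
	// Such an ACK may still carry valid SACK information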
	if (before(ack, prior_snd_una))
		goto old_ack;

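	// At this point the acknowledgment number is within the expected range [snd_una, snd_nxt]

	// The acknowledgment number confirms new data; set FLAG_SND_UNA_ADVANCED.
	// The if check excludes the case ack == prior_snd_una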
	if (after(ack, prior_snd_una))
		flag |= FLAG_SND_UNA_ADVANCED;

	// Related to the tcp_abc (Appropriate Byte Counting) feature
	if (sysctl_tcp_abc) {
		if (icsk->icsk_ca_state < TCP_CA_CWR)
			tp->bytes_acked += ack - prior_snd_una;
		else if (icsk->icsk_ca_state == TCP_CA_Loss)
			/* we assume just one segment left network */
			tp->bytes_acked += min(ack - prior_snd_una, tp->mss_cache);
	}

	prior_fackets = tp->fackets_out;
	prior_in_flight = tcp_packets_in_flight(tp);

	// Update the sending window below; the fast path and the slow path are handled separately
	if (!(flag & FLAG_SLOWPATH) && after(ack, prior_snd_una)) {
		/* Window is constant, pure forward advance.
		 * No more checks are required.
		 * Note, we use the fact that SND.UNA>=SND.WL2.
		 */
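		// Record the sequence number of the most recent ACK segment that updated
		// the sending window, i.e. tp->snd_wl1 = ack_seq
		tcp_update_wl(tp, ack, ack_seq);
		// Update the left boundary of the sending window
		tp->snd_una = ack;
		// Set the window update flag
		flag |= FLAG_WIN_UPDATE;
		// Notify the congestion control algorithm of a CA_EVENT_FAST_ACK event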
		tcp_ca_event(sk, CA_EVENT_FAST_ACK);

		NET_INC_STATS_BH(LINUX_MIB_TCPHPACKS);
	} else {
		// Slow path processing

		// The ACK segment also carries data; set FLAG_DATA
		if (ack_seq != TCP_SKB_CB(skb)->end_seq)
			flag |= FLAG_DATA;
		else
			NET_INC_STATS_BH(LINUX_MIB_TCPPUREACKS);
		// Update the sending window
		flag |= tcp_ack_update_window(sk, skb, ack, ack_seq);

		// SACK processing
		if (TCP_SKB_CB(skb)->sacked)
			flag |= tcp_sacktag_write_queue(sk, skb, prior_snd_una);
		// ECN processing
		if (TCP_ECN_rcv_ecn_echo(tp, tcp_hdr(skb)))
			flag |= FLAG_ECE;
		// Notify the congestion control algorithm of a CA_EVENT_SLOW_ACK event
		tcp_ca_event(sk, CA_EVENT_SLOW_ACK);
	}

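	// Clear the soft error
	sk->sk_err_soft = 0;
	// Update the timestamp of the most recently received ACK segment
	tp->rcv_tstamp = tcp_time_stamp;
	// If there were no segments awaiting acknowledgment, the retransmission queue and
	// congestion control processing below is not needed. In that case only the persist
	// timer work is required, because probe segments may have been sent earlier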
	prior_packets = tp->packets_out;
	if (!prior_packets)
		goto no_queue;

	// Remove acknowledged segments from the retransmission queue and take RTT samples
	flag |= tcp_clean_rtx_queue(sk, prior_fackets);

	// F-RTO algorithm handling
	if (tp->frto_counter)
		frto_cwnd = tcp_process_frto(sk, flag);
	/* Guarantee sacktag reordering detection against wrap-arounds */
	if (before(tp->frto_highmark, tp->snd_una))
		tp->frto_highmark = 0;

	// Congestion control handling
	if (tcp_ack_is_dubious(sk, flag)) {
		/* Advance CWND, if state allows this. */
		if ((flag & FLAG_DATA_ACKED) && !frto_cwnd &&
		    tcp_may_raise_cwnd(sk, flag))
			tcp_cong_avoid(sk, ack, prior_in_flight);
		tcp_fastretrans_alert(sk, prior_packets - tp->packets_out,
				      flag);
	} else {
		if ((flag & FLAG_DATA_ACKED) && !frto_cwnd)
			tcp_cong_avoid(sk, ack, prior_in_flight);
	}

	if ((flag & FLAG_FORWARD_PROGRESS) || !(flag & FLAG_NOT_DUP))
		dst_confirm(sk->sk_dst_cache);

	return 1;

no_queue:
	// There were no unacknowledged segments and an ACK arrived; do the persist timer processing
	icsk->icsk_probes_out = 0;

	/* If this ack opens up a zero window, clear backoff.  It was
	 * being used to time the probes, and is probably far higher than
	 * it needs to be for normal retransmission.
	 */
	if (tcp_send_head(sk))
		tcp_ack_probe(sk);
	return 1;

old_ack:
	// Although this ACK has been seen before, if it carries SACK information the SACK state must still be updated
	if (TCP_SKB_CB(skb)->sacked)
		tcp_sacktag_write_queue(sk, skb, prior_snd_una);

uninteresting_ack:
	SOCK_DEBUG(sk, "Ack %u out of %u:%u\n", ack, tp->snd_una, tp->snd_nxt);
	return 0;
}
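
The range check at the top of tcp_ack() relies on wraparound-safe sequence comparison. The sketch below models that check outside the kernel. The names seq_before(), seq_after(), enum ack_class and classify_ack() are hypothetical and chosen for this illustration; the comparison uses the common signed-difference idiom, which assumes two's complement conversion.

```c
#include <stdint.h>

/* True when seq1 precedes seq2 in 32-bit sequence space (wraparound-safe). */
static int seq_before(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) < 0;
}

/* True when seq1 follows seq2 in 32-bit sequence space. */
static int seq_after(uint32_t seq1, uint32_t seq2)
{
	return seq_before(seq2, seq1);
}

/* Hypothetical labels for the four outcomes of the range check. */
enum ack_class {
	ACK_UNINTERESTING, /* acknowledges data that was never sent */
	ACK_OLD,           /* older than snd_una; may still carry SACK */
	ACK_SAME_UNA,      /* ack == snd_una, nothing new acknowledged */
	ACK_ADVANCES_UNA   /* acknowledges new data */
};

static enum ack_class classify_ack(uint32_t ack, uint32_t snd_una,
				   uint32_t snd_nxt)
{
	if (seq_after(ack, snd_nxt))
		return ACK_UNINTERESTING;
	if (seq_before(ack, snd_una))
		return ACK_OLD;
	if (seq_after(ack, snd_una))
		return ACK_ADVANCES_UNA; /* where FLAG_SND_UNA_ADVANCED is set */
	return ACK_SAME_UNA;
}
```

For example, with snd_una = 0xFFFFFF00 and snd_nxt = 0x100, an acknowledgment number of 0x10 is classified as advancing snd_una even though it is numerically smaller, which a plain `<` comparison would get wrong.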

2.2.1 Update sending window

See the separate note "Linux Kernel Protocol Stack TCP Data Transmission Window".
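
As a rough illustration of the slow-path window update, the following sketch follows the classic rule of accepting a window advertisement only from a segment that is not older than the one that last updated the window. The body of tcp_ack_update_window() is not shown in this note, so the exact conditions here are an assumption; struct snd_window, seq_gt() and maybe_update_window() are hypothetical names. The caller is assumed to have already verified that ack is not before snd_una.

```c
#include <stdint.h>

/* Minimal model of the sender-side window state (hypothetical struct). */
struct snd_window {
	uint32_t snd_una; /* oldest unacknowledged sequence number */
	uint32_t snd_wl1; /* sequence number of the segment that last updated the window */
	uint32_t snd_wnd; /* window advertised by the peer */
};

/* True when a follows b in 32-bit sequence space (wraparound-safe). */
static int seq_gt(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/* Returns 1 if the advertised window nwin was accepted, 0 otherwise.
 * snd_una is moved to ack in both cases. */
static int maybe_update_window(struct snd_window *w, uint32_t ack,
			       uint32_t ack_seq, uint32_t nwin)
{
	int accept = seq_gt(ack, w->snd_una) ||
		     seq_gt(ack_seq, w->snd_wl1) ||
		     (ack_seq == w->snd_wl1 && nwin > w->snd_wnd);

	if (accept) {
		w->snd_wl1 = ack_seq; /* remember which segment updated the window */
		w->snd_wnd = nwin;
	}
	w->snd_una = ack;
	return accept;
}
```

A return value of 1 corresponds to the situation in which the real code would set FLAG_WIN_UPDATE.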

2.2.2 Delete the confirmed data segment + RTT sampling

See the separate note "Linux kernel protocol stack TCP data receiving clear send queue + RTT sampling".
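
The idea behind this step can be sketched with a small model: drop every segment fully covered by snd_una from the head of the queue, and take an RTT sample only when none of the dropped segments was retransmitted (Karn's rule). This is a simplified illustration, not the logic of tcp_clean_rtx_queue(); struct seg, struct rtx_queue, RTXQ_CAP and clean_rtx_queue() are hypothetical names, and time is measured in arbitrary ticks.

```c
#include <stddef.h>
#include <stdint.h>

#define RTXQ_CAP 16 /* capacity of the toy ring buffer */

struct seg {
	uint32_t end_seq;  /* sequence number just past the last byte */
	uint32_t sent_at;  /* send time in arbitrary ticks */
	int retransmitted; /* nonzero if the segment was ever resent */
};

struct rtx_queue {
	struct seg segs[RTXQ_CAP];
	size_t head;  /* index of the oldest unacknowledged segment */
	size_t count; /* number of queued segments */
};

/* Removes fully acknowledged segments and returns how many were removed.
 * *rtt receives the sample in ticks, or -1 if no valid sample exists. */
static size_t clean_rtx_queue(struct rtx_queue *q, uint32_t snd_una,
			      uint32_t now, long *rtt)
{
	size_t removed = 0;
	int saw_retrans = 0;
	long sample = -1;

	while (q->count > 0) {
		struct seg *s = &q->segs[q->head];

		/* Stop at the first segment not fully covered by snd_una. */
		if ((int32_t)(snd_una - s->end_seq) < 0)
			break;
		if (s->retransmitted)
			saw_retrans = 1; /* ambiguous timing, no sample */
		else if (sample < 0)
			sample = (long)(uint32_t)(now - s->sent_at);
		q->head = (q->head + 1) % RTXQ_CAP;
		q->count--;
		removed++;
	}
	*rtt = saw_retrans ? -1 : sample;
	return removed;
}
```

The sample is taken from the oldest acknowledged segment, so it reflects the longest time any of the newly acknowledged data spent in flight.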

2.2.3 Congestion Control
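
The congestion control branch at the end of tcp_ack() can be summarized as a small decision function. The sketch below mirrors only the branch structure visible in the listing above; the results of tcp_ack_is_dubious() and tcp_may_raise_cwnd() are passed in as plain booleans because their bodies are not shown in this note. enum cc_action and cc_actions() are hypothetical names.

```c
#include <stdbool.h>

#define FLAG_DATA_ACKED 0x04 /* value from the flag table above */

/* Hypothetical bitmask describing which kernel functions would be called. */
enum cc_action {
	CC_NONE       = 0,
	CC_CONG_AVOID = 1 << 0, /* tcp_cong_avoid() would run */
	CC_FAST_ALERT = 1 << 1  /* tcp_fastretrans_alert() would run */
};

static int cc_actions(int flag, bool dubious, bool frto_cwnd,
		      bool may_raise_cwnd)
{
	int actions = CC_NONE;
	bool acked_new_data = (flag & FLAG_DATA_ACKED) != 0;

	if (dubious) {
		/* Suspicious ACK: grow cwnd only if the state allows it,
		 * then always run the fast retransmit state machine. */
		if (acked_new_data && !frto_cwnd && may_raise_cwnd)
			actions |= CC_CONG_AVOID;
		actions |= CC_FAST_ALERT;
	} else if (acked_new_data && !frto_cwnd) {
		/* Normal ACK of new data: let the algorithm grow cwnd. */
		actions |= CC_CONG_AVOID;
	}
	return actions;
}
```

In this model a normal ACK that confirms new data only triggers the congestion avoidance hook, while a dubious ACK always reaches the fast retransmit handling, with or without a window increase.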

Origin blog.csdn.net/wangquan1992/article/details/109033972