[Repost] A very interesting question about TCP

Reposted from: https://segmentfault.com/a/1190000019989683

-------------------------------------------------------------------------

Assume the following scenario:

After a TCP connection is established, the server actively closes its socket, and then the client writes to its own socket. The natural assumption is that this write must return an error, right?

Not necessarily.

I ran into this problem while writing code today and puzzled over it for quite a while. In the end I went through the Linux kernel source code to settle the answer.

First, the following program simulates the scenario:

#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <signal.h>
#include <stdio.h>
#include <strings.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

int tcp_connect()
{
        int sockfd, err;
        struct sockaddr_in addr;

        sockfd = socket(AF_INET, SOCK_STREAM, 0);
        assert(sockfd != -1);

        bzero(&addr, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = inet_addr("127.0.0.1");
        addr.sin_port = htons(9999);

        err = connect(sockfd, (struct sockaddr *)&addr, sizeof(addr));
        assert(err == 0);

        return sockfd;
}

int main(int argc, char **argv)
{
        int n;
        int sockfd = tcp_connect();

        signal(SIGPIPE, SIG_IGN); // keep write from raising SIGPIPE, which makes testing easier

        printf("Please close the server within 5 seconds...\n");
        sleep(5);

        // write 1
        n = write(sockfd, "hello\n", 6);
        if (n == -1) {
                perror("first write failed");
                return -1;
        }
        assert(n == 6);
        printf("first write succeeded!\n");

        sleep(1); // make sure the client has received the TCP reset

        // write 2
        n = write(sockfd, "world\n", 6);
        if (n == -1) {
                perror("second write failed");
                return -1;
        }
        assert(n == 6);
        printf("second write succeeded!\n");

        return 0;
}

This program acts as the client; ncat is used to simulate the server.

Here is the flow of execution:

First open a terminal and start a server with ncat:

$ ncat -l 9999

Then open another terminal, compile the program above, and run it:

$ gcc main.c
$ ./a.out
Please close the server within 5 seconds...
first write succeeded!
second write failed: Broken pipe

When the client prompts you to close the server, switch to the server's terminal and shut the server down.

As the output above shows, of the two writes, the first succeeded and only the second failed.

Strange, isn't it?

Let's capture the packets with tcpdump to check whether the first write really succeeded:

$ sudo tcpdump -i any -n# port 9999
    1  17:59:07.812599 IP 127.0.0.1.51614 > 127.0.0.1.9999: Flags [S], seq 1076934668, win 65495, options [mss 65495,sackOK,TS val 134308422 ecr 0,nop,wscale 7], length 0
    2  17:59:07.812648 IP 127.0.0.1.9999 > 127.0.0.1.51614: Flags [S.], seq 3833531274, ack 1076934669, win 65483, options [mss 65495,sackOK,TS val 134308422 ecr 134308422,nop,wscale 7], length 0
    3  17:59:07.812691 IP 127.0.0.1.51614 > 127.0.0.1.9999: Flags [.], ack 1, win 512, options [nop,nop,TS val 134308422 ecr 134308422], length 0
    4  17:59:09.832579 IP 127.0.0.1.9999 > 127.0.0.1.51614: Flags [F.], seq 1, ack 1, win 512, options [nop,nop,TS val 134310442 ecr 134308422], length 0
    5  17:59:09.835181 IP 127.0.0.1.51614 > 127.0.0.1.9999: Flags [.], ack 2, win 512, options [nop,nop,TS val 134310445 ecr 134310442], length 0
    6  17:59:12.813697 IP 127.0.0.1.51614 > 127.0.0.1.9999: Flags [P.], seq 1:7, ack 2, win 512, options [nop,nop,TS val 134313423 ecr 134310442], length 6
    7  17:59:12.813735 IP 127.0.0.1.9999 > 127.0.0.1.51614: Flags [R], seq 3833531276, win 0, length 0

It really did succeed: the sixth packet above carries 6 bytes of data, which is the hello\n from our code.

A brief explanation of the tcpdump output:

The first three packets are the three-way handshake; once it completes, the TCP connection is established.

The fourth packet is the FIN that the server sends to the client when we shut the server down, asking to close the connection.

The fifth packet is the ACK that the client's TCP layer sends to the server, confirming that the FIN was received.

The sixth packet is the hello\n string that the client sends to the server.

The seventh packet is the reset (RST) that the server's TCP layer sends to the client, because the server-side socket has already been closed.

The tcpdump output confirms that the first write really did succeed, but why? The server's socket was clearly already closed, so why could data still be sent? And why did the first send work while the second failed?

Let's see what the kernel source code does:

// net/ipv4/tcp.c
int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
{
        ...
        err = -EPIPE;
        if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
                goto do_error;
        ...
        // the code that actually sends the TCP data is omitted here
        ...
        return copied + copied_syn;
        ...
do_error:
        ...
        return err;
}
EXPORT_SYMBOL_GPL(tcp_sendmsg_locked);

This is the function TCP uses to send a message.

As seen above, the write only returns an error when the socket has a pending error or when the socket's send side has been shut down. In all other cases the data is sent normally.

From what we know about TCP, when the server sends a FIN to the client, the client socket enters the CLOSE_WAIT state, that is, it waits for the client to close its own socket.

In other words, the FIN neither put the client socket into an error state nor shut down the client socket's send side (it only shut down its receive side), so the first write sent its data successfully.

Why did the second write fail, then?

As the tcpdump output above shows, after the first write the server's operating system received the data, found that the corresponding socket had already been closed, and sent a reset packet to the client.

When the client receives the reset packet, the following code runs:

// net/ipv4/tcp_input.c
void tcp_reset(struct sock *sk)
{
        ...
        switch (sk->sk_state) {
        ...
        case TCP_CLOSE_WAIT:
                sk->sk_err = EPIPE;
                break;
        ...
        }
        ...
        tcp_done(sk);
        ...
}

As seen above, sk->sk_err is set to EPIPE. In fact, the tcp_done call that follows also shuts down the socket's send side, but that makes little difference here.

So when we call write the second time, tcp_sendmsg_locked jumps straight to do_error and err is returned to the user.

This fully explains the strange behavior described above.

In fact, even without reading the code, thinking through the details of TCP is enough to understand why the operating system behaves this way.

Before the first write, our socket received the FIN and entered the CLOSE_WAIT state. That does not actually mean the server has completely closed the connection: it may have sent the FIN only to shut down its sending direction while still being able to read, so we should still be allowed to write.

I think this way of looking at it is a little easier to understand.

Still, approaching the problem from the source code gives a more solid answer.

If you are interested in the TCP source code, take a look at the series of TCP source code analysis articles I wrote earlier:

Index of articles on the TCP/IP state transition diagram and source code analysis

The end.

Origin www.cnblogs.com/oxspirt/p/12152888.html