What is the sticky packet phenomenon

Simple remote command execution program development (30 minutes)

With the basics of sockets covered, let's get down to business. We will write a remote command execution program: a socket client on Windows sends a command, and a socket server on Linux executes the command and returns the result to the client.

To execute commands we naturally use our old friend, the subprocess module, but note the following:

res = subprocess.Popen(cmd.decode('utf-8'),shell=True,stderr=subprocess.PIPE,stdout=subprocess.PIPE)

The encoding of the command output depends on the system the command runs on. On Windows, res.stdout.read() returns GBK-encoded bytes, so the receiving end must decode them with GBK. Also, the result can only be read from the pipe once.
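
Both caveats can be checked with a small sketch. It assumes that locale.getpreferredencoding(False) is a reasonable guess for the encoding of the command output (typically GBK/cp936 on a Chinese-language Windows, UTF-8 on most Linux systems); the helper name run_command is made up for illustration and is not part of the programs in this post.

```python
import locale
import subprocess


def run_command(cmd):
    """Run cmd in a shell and return (stdout_text, stderr_text)."""
    res = subprocess.Popen(cmd, shell=True,
                           stdout=subprocess.PIPE,
                           stderr=subprocess.PIPE)
    # communicate() drains both pipes and waits for the process to exit
    out, err = res.communicate()
    # Assumption: the command writes in the system's preferred encoding
    encoding = locale.getpreferredencoding(False)
    return (out.decode(encoding, errors='replace'),
            err.decode(encoding, errors='replace'))


# A pipe can only be read once: the second read() finds nothing left
res = subprocess.Popen('echo hello', shell=True, stdout=subprocess.PIPE)
first = res.stdout.read()
second = res.stdout.read()
res.stdout.close()
res.wait()
print(repr(first), repr(second))
```

communicate() is used in the helper because it reads stdout and stderr together; reading one pipe to the end while the child is still blocked writing to the other can hang on large outputs.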

ssh server

import socket
import subprocess

ip_port = ('127.0.0.1', 8080)


tcp_socket_server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp_socket_server.bind(ip_port)
tcp_socket_server.listen(5)

while True:
    conn, addr = tcp_socket_server.accept()
    print('client', addr)

    while True:
        cmd = conn.recv(1024)
        if len(cmd) == 0: break
        print("recv cmd",cmd)
        res = subprocess.Popen(cmd.decode('utf-8'), shell=True,
                               stdout=subprocess.PIPE,
                               stdin=subprocess.PIPE,
                               stderr=subprocess.PIPE)

        stderr = res.stderr.read()
        stdout = res.stdout.read()
        print("res length",len(stdout))
        conn.send(stderr)
        conn.send(stdout)

ssh client

import socket
ip_port = ('127.0.0.1', 8080)

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
res = s.connect_ex(ip_port)

while True:
    msg = input('>>: ').strip()
    if len(msg) == 0: continue
    if msg == 'quit': break

    s.send(msg.encode('utf-8'))
    act_res = s.recv(1024)

    print(act_res.decode('utf-8'), end='')

Try running ls or pwd and you will be pleased to find that you get the correct result!

But don't celebrate too early. Now run a command with a long result, such as top -bn 1. You still get a result, but if you then run df -h, what comes back is not the output of df but the leftover part of the previous top output. Why is that?

Because the result of the top command is long, but the client only calls recv(1024). When the result is longer than 1024 bytes, the part the client has not yet taken stays in the socket's IO buffer and waits for the client's next receive. So when the client calls recv(1024) a second time, it first receives the data left over from last time, and only after that the result of the df command.
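
The leftover effect can be reproduced without any real commands. This is only a sketch: socket.socketpair() stands in for the connected client and server sockets, and the 3000-byte payload is a made-up stand-in for the output of top.

```python
import socket

# socketpair() returns two already-connected stream sockets
server_side, client_side = socket.socketpair()

# The "server" sends a long first result (3000 bytes), as top would
server_side.sendall(b'T' * 3000)
first = client_side.recv(1024)       # the client takes only 1024 bytes

# The "server" now sends the short result of the next command
server_side.sendall(b'df-output')
second = client_side.recv(1024)      # still leftover bytes of the first result

print(len(first), second[:10])

server_side.close()
client_side.close()
```

The second recv returns more of the first result, and the df bytes are still waiting behind it in the buffer.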

So how do we solve it? Some students say: just make recv(1024) bigger, say 5000 or 10000. But that does not solve the real problem, because you cannot know in advance how large the peer's result will be. However large you make it, the peer's result may still be larger. Besides, the recv size cannot be raised arbitrarily: the Python documentation recommends a relatively small power of 2 (for example 4096), and much larger values can hurt the speed and stability of sending and receiving.

Comrades, this phenomenon is called a sticky packet, meaning that two results stick together. It is mainly caused by the socket buffer. Let's take a look.

Your program has no right to operate the network card directly. Access to the card is exposed to user programs through operating system interfaces. Every time your program sends data to the remote end, the data is first copied from user mode to kernel mode. This operation costs resources and time, and frequently exchanging data between kernel mode and user mode inevitably reduces transmission efficiency. So, to improve efficiency, the sender often collects enough data before sending it to the peer in one go. If several consecutive sends each carry very little data, TCP usually merges them into one TCP segment according to an optimization algorithm and sends that, so the data arrives at the receiver stuck together.
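
The merging is easy to see from the receiving side. The sketch below does not exercise the TCP optimization algorithm itself; it uses socket.socketpair() as a stand-in for a connected stream socket and only shows that a byte stream does not remember where each send ended. On a real TCP connection, whether two sends arrive together depends on timing.

```python
import socket

sender, receiver = socket.socketpair()

# Two separate send calls, each one a small "message"
sender.send(b'hello')
sender.send(b'world')

# A single recv sees one undivided run of bytes
data = receiver.recv(1024)
print(data)

sender.close()
receiver.close()
```

Setting socket.TCP_NODELAY with setsockopt turns off the merging of small sends on a TCP socket, but it does not give the stream message boundaries, so it is not a fix for sticky packets.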

Sticky packets are a problem only in TCP, not in UDP

The sender may send data 1 KB at a time, while the receiving application may take it out 2 KB at a time, or 3 KB or 6 KB at once, or only a few bytes at a time. In other words, the application sees the data as one whole, a stream, and how many bytes make up one message is invisible to the application. TCP is therefore a stream-oriented protocol, and this is why sticky packets arise. UDP is a message-oriented protocol: each UDP datagram is one message, and the application must take data out in units of whole messages; it cannot take out an arbitrary number of bytes. This is where it differs from TCP. How is a message defined? You can treat each single write/send by the peer as one message. What needs to be understood is that when the peer sends a message, no matter how the lower layers split it into segments, the TCP layer puts the segments that make up the message back in order before presenting them in the kernel buffer.

For example, when a TCP socket client uploads a file to the server, the file content is sent as a stream of bytes, piece by piece. The receiver, looking at the stream, has no idea where the file's bytes start or where they stop.

The so-called sticky packet problem arises mainly because the receiver does not know the boundaries between messages and does not know how many bytes to take out at a time.

To sum up

  1. TCP (Transmission Control Protocol) is connection-oriented and stream-oriented, and provides a highly reliable service. The sender and receiver (client and server) must have sockets paired one to one. To send multiple packets to the receiver more efficiently, the sender uses an optimization (the Nagle algorithm) that merges several small pieces of data sent at short intervals into one larger block and sends that as one packet. The receiver then cannot tell them apart and must provide a sound unpacking mechanism. That is, stream-oriented communication has no message boundary protection.
  2. UDP (User Datagram Protocol) is connectionless and message-oriented, and provides an efficient service. It does not use the block-merging optimization. Since UDP supports one-to-many mode, the skbuff (socket buffer) on the receiving end uses a chained structure to record each arriving UDP packet, and each UDP packet carries a message header (with the source address and port), so the receiver can easily tell the packets apart. That is, message-oriented communication has message boundary protection.
  3. TCP is based on a data stream, so messages sent and received cannot be empty. Both the client and the server therefore need to handle empty messages to keep the program from hanging. UDP is based on datagrams: even if you enter empty content (just press Enter), what is sent is not an empty message, because the UDP protocol wraps it with a message header. Try a small experiment to see.
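
The small experiment for UDP could look like the sketch below. It assumes both sockets live on the loopback interface of one machine, where datagrams are in practice delivered in order; the timeout is there only so the script does not hang if one is lost.

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(('127.0.0.1', 0))      # port 0: let the OS pick a free port
receiver.settimeout(2)               # avoid hanging if a datagram is lost
addr = receiver.getsockname()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
for payload in (b'hello', b'world', b''):
    sender.sendto(payload, addr)     # the empty payload is still a datagram

# Each recvfrom returns exactly one datagram, never two glued together
messages = [receiver.recvfrom(1024)[0] for _ in range(3)]
print(messages)

sender.close()
receiver.close()
```

With TCP, by contrast, recv returning b'' means the peer has closed the connection, which is why the TCP servers in this post break out of the loop when they receive nothing.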

UDP-based command execution program (10 minutes)

As said above, UDP does not have the sticky packet problem. Let's look at an example

udp server

import socket
import subprocess

ip_port = ('127.0.0.1', 9003)
bufsize = 1024

udp_server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_server.bind(ip_port)

while True:
    # receive message
    cmd, addr = udp_server.recvfrom(bufsize)
    print('user command ----->', cmd, addr)

    # process the command
    res = subprocess.Popen(cmd.decode('utf-8'), shell=True, stderr=subprocess.PIPE, stdin=subprocess.PIPE,
                           stdout=subprocess.PIPE)
    stderr = res.stderr.read()
    stdout = res.stdout.read()

    # send message
    udp_server.sendto(stdout + stderr, addr)

udp_server.close()

udp client

from socket import *

import time

ip_port = ('127.0.0.1', 9003)
bufsize = 1024

udp_client = socket(AF_INET, SOCK_DGRAM)

while True:
    msg = input('>>: ').strip()
    if len(msg) == 0:
        continue

    udp_client.sendto(msg.encode('utf-8'), ip_port)
    data, addr = udp_client.recvfrom(bufsize)
    print(data.decode('utf-8'), end='')
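
One thing to watch for when trying this example: UDP keeps message boundaries, but a reply larger than bufsize is not carried over to the next recvfrom. The sketch below uses made-up payloads on the loopback interface; the behaviour in the comments is what I would expect on Linux and Windows respectively, so treat it as an assumption to verify on your own system.

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(('127.0.0.1', 0))
receiver.settimeout(2)
addr = receiver.getsockname()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b'x' * 2000, addr)         # a reply larger than the receive size
sender.sendto(b'next', addr)

try:
    first, _ = receiver.recvfrom(1024)   # Linux: only the first 1024 bytes
except OSError:
    first = None                         # Windows: an error is raised instead
second, _ = receiver.recvfrom(1024)      # the next datagram, not the lost tail
print(None if first is None else len(first), second)

sender.close()
receiver.close()
```

So with the UDP version a long command output is cut off rather than stuck to the next result.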

 

Solving the sticky packet problem (35 minutes)

The root of the problem is that the receiver does not know the length of the byte stream the sender is about to transmit. So solving sticky packets revolves around one thing: before sending the data, the sender tells the receiver the total size of the byte stream it is about to send, and the receiver then receives in a loop until it has all of the data.
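
The receiving loop is the piece every solution shares, so here is a minimal sketch of it. The helper name recv_exact and the 5000-byte payload are made up for illustration, and socket.socketpair() stands in for the real client and server sockets.

```python
import socket


def recv_exact(sock, size):
    """Receive exactly size bytes from sock, looping over short reads."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = sock.recv(min(remaining, 1024))
        if not chunk:                 # b'' means the peer closed the connection
            raise ConnectionError('connection closed before all data arrived')
        chunks.append(chunk)
        remaining -= len(chunk)
    return b''.join(chunks)


# Demo on a connected pair of stream sockets
a, b = socket.socketpair()
a.sendall(b'r' * 5000)                # the announced total size is 5000 bytes
result = recv_exact(b, 5000)
print(len(result))
a.close()
b.close()
```

The loop counts the bytes actually returned by each recv, because recv may return fewer bytes than the size it was given.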

Ordinary youth version

Server

import socket,subprocess
ip_port=('127.0.0.1',8080)
s=socket.socket(socket.AF_INET,socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

s.bind(ip_port)
s.listen(5)

while True:
    conn,addr=s.accept()
    print('client',addr)
    while True:
        msg=conn.recv(1024)
        if not msg:break
        res=subprocess.Popen(msg.decode('utf-8'),shell=True,\
                            stdin=subprocess.PIPE,\
                         stderr=subprocess.PIPE,\
                         stdout=subprocess.PIPE)
        err=res.stderr.read()
        if err:
            ret=err
        else:
            ret=res.stdout.read()
        data_length=len(ret)
        conn.send(str(data_length).encode('utf-8'))
        data=conn.recv(1024).decode('utf-8')
        if data == 'recv_ready':
            conn.sendall(ret)
    conn.close()

Client

import socket,time
s=socket.socket(socket.AF_INET,socket.SOCK_STREAM)
res=s.connect_ex(('127.0.0.1',8080))

while True:
    msg=input('>>: ').strip()
    if len(msg) == 0:continue
    if msg == 'quit':break

    s.send(msg.encode('utf-8'))
    length=int(s.recv(1024).decode('utf-8'))
    s.send('recv_ready'.encode('utf-8'))
    send_size=0
    recv_size=0
    data=b''
    while recv_size < length:
        chunk=s.recv(1024)
        data+=chunk
        recv_size+=len(chunk) # why not simply add 1024? recv may return fewer bytes than requested


    print(data.decode('utf-8'))

Why is this approach poor?

A program runs far faster than the network transmits, so using a separate send to transmit the length of the byte stream before the bytes themselves amplifies the performance cost of network latency.

The approach above must send the message length to the peer before sending the message, and must also wait for the peer to return a ready confirmation. If it sent the message directly without waiting for the confirmation, sticky packets would occur (the message carrying the length and the message itself would stick together). Is there a better way?

Artistic youth version

Think about it: why not send the message length (call it the message header, the head) and then immediately send the message content (call it the body)? Because we are afraid the head and the body will stick together, so we wait for the peer to return a confirmation that separates the two.

Could we send head + body directly and still let the peer tell which part is the head and which is the body? Let me put my wits to work.

Here is an idea: fix the head at a set length. When the peer receives a message, it first receives that fixed number of bytes; the head says how much data belongs to this message, and the peer then simply loops until all of it has been received. Oh my, how witty.

But how do we make the header a fixed length? Suppose you have 2 messages to send: message 1 is 3000 bytes and message 2 is 200 bytes. If the header contains only the message length, then the headers of the two messages are

len(msg1) = 3000 = 4 bytes
len(msg2) = 200 = 3 bytes

How can the server receive the complete header? Should it call recv(3) or recv(4), and how would it know? Racking my brain, the only way I can think of is string padding: for example, fix the header at 100 bytes and pad it with filler characters when it is shorter.

server

import socket,json
import subprocess

ip_port = ('127.0.0.1', 8080)

tcp_socket_server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp_socket_server.setsockopt(socket.SOL_SOCKET,socket.SO_REUSEADDR,1) # one line does it; must come before bind
tcp_socket_server.bind(ip_port)
tcp_socket_server.listen(5)

def pack_msg_header(header,size):
    bytes_header = bytes(json.dumps(header),encoding="utf-8")
    fill_up_size = size -  len(bytes_header)
    print("need to fill up ",fill_up_size)

    header['fill'] = header['fill'].zfill(fill_up_size)
    print("new header",header)
    bytes_new_header = bytes(bytes(json.dumps(header),encoding="utf-8"))
    return bytes_new_header

while True:
    conn, addr = tcp_socket_server.accept()
    print('client', addr)

    while True:
        cmd = conn.recv(1024)
        if len(cmd) == 0: break
        print("recv cmd",cmd)
        res = subprocess.Popen(cmd.decode('utf-8'), shell=True,
                               stdout=subprocess.PIPE,
                               stdin=subprocess.PIPE,
                               stderr=subprocess.PIPE)

        stderr = res.stderr.read()
        stdout = res.stdout.read()
        print("res length",len(stdout))

        msg_header = {
            'length':len(stdout + stderr),
            'fill':''
        }
        packed_header = pack_msg_header(msg_header,100)
        print("packed header size",packed_header,len(packed_header))
        conn.send(packed_header)
        conn.send(stdout + stderr)

client

import socket
import json

ip_port = ('127.0.0.1', 8080)

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
res = s.connect_ex(ip_port)

while True:
    msg = input('>>: ').strip()
    if len(msg) == 0: continue
    if msg == 'quit': break

    s.send(msg.encode('utf-8'))
    response_msg_header = s.recv(100).decode("utf-8")

    response_msg_header_data = json.loads(response_msg_header)
    msg_size = response_msg_header_data['length']

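    res = b''
    while len(res) < msg_size:  # recv may return fewer bytes than requested, so loop
        chunk = s.recv(msg_size - len(res))
        if not chunk: break  # the server closed the connection
        res += chunk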
    print("received res size ",len(res))
    print(res.decode('utf-8'), end='')

Artistic youth version, with struct

A custom fixed-length header can be added to the byte stream with the help of the struct module (it is part of the standard library, not a third-party module). It is used like this:

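import json,struct
# Suppose the client uploads a.txt, a file of about 1 TB (1073741824000 bytes)

# To avoid sticky packets we must define our own header
header={'file_size':1073741824000,'file_name':'/a/b/c/d/e/a.txt','md5':'8f6fbf8347faa4924a76856701edb0f3'} # file size, file path and md5 value

# To transmit the header, serialize it and convert it to bytes
head_bytes=bytes(json.dumps(header),encoding='utf-8') # serialized and converted to bytes for transmission

# So that the receiver knows the header length, use struct to pack that number into a fixed length: 4 bytes
head_len_bytes=struct.pack('i',len(head_bytes)) # these 4 bytes hold a single number, the length of the header

# The client starts sending (conn is the client's connected socket, file_content the file's bytes; both are placeholders)
conn.send(head_len_bytes) # first the header length, 4 bytes
conn.send(head_bytes) # then the header bytes
conn.sendall(file_content) # then the bytes of the real content

# The server starts receiving (s is the server's connection to the client, also a placeholder)
head_len_bytes=s.recv(4) # first receive 4 bytes: the packed header length
x=struct.unpack('i',head_len_bytes)[0] # extract the header length

head_bytes=s.recv(x) # receive the header bytes, x bytes long
header=json.loads(head_bytes.decode('utf-8')) # extract the header

# Finally use the content of the header to receive the real data, for example
real_data_len=header['file_size']
recv_size=0
while recv_size < real_data_len:
    chunk=s.recv(1024) # in practice, write each chunk to a file here
    recv_size+=len(chunk)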
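
Why pack only the header length with struct and leave the file size inside the JSON header? A 4-byte integer field has a limited range, while a JSON number does not. The sketch below checks this; note that the '!I' format (network byte order, unsigned) is my own choice for illustration, whereas the code in this post uses 'i' (native byte order, signed), and the header length 57 is a made-up value.

```python
import struct

header_len = 57                           # hypothetical header length

native = struct.pack('i', header_len)     # native byte order and size
network = struct.pack('!I', header_len)   # network byte order, always 4 bytes
print(len(network), struct.unpack('!I', network)[0])

# A 4-byte field cannot hold a value such as the 1 TB file size
try:
    struct.pack('!I', 1073741824000)
    fits = True
except struct.error:
    fits = False
print(fits)
```

When the client and server may run on machines with different byte orders, a format with an explicit byte order such as '!I' is the safer choice, since both ends then agree on the layout of the 4 bytes.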

The implementation using the struct module is as follows

server

import socket,struct,json
import subprocess
phone=socket.socket(socket.AF_INET,socket.SOCK_STREAM)
phone.setsockopt(socket.SOL_SOCKET,socket.SO_REUSEADDR,1) # this is the one; add it before bind

phone.bind(('127.0.0.1',8080))

phone.listen(5)

while True:
    conn,addr=phone.accept()
    while True:
        cmd=conn.recv(1024)
        if not cmd:break
        print('cmd: %s' %cmd)

        res=subprocess.Popen(cmd.decode('utf-8'),
                             shell=True,
                             stdout=subprocess.PIPE,
                             stderr=subprocess.PIPE)
        err=res.stderr.read()
        print(err)
        if err:
            back_msg=err
        else:
            back_msg=res.stdout.read()

        headers={'data_size':len(back_msg)}
        head_json=json.dumps(headers)
        head_json_bytes=bytes(head_json,encoding='utf-8')

        conn.send(struct.pack('i',len(head_json_bytes))) # first send the header length
        conn.send(head_json_bytes) # then send the header
        conn.sendall(back_msg) # then send the real content

    conn.close()

client

from socket import *
import struct,json

ip_port=('127.0.0.1',8080)
client=socket(AF_INET,SOCK_STREAM)
client.connect(ip_port)

while True:
    cmd=input('>>: ')
    if not cmd:continue
    client.send(bytes(cmd,encoding='utf-8'))

    head=client.recv(4) # first receive 4 bytes, which hold the header length
    head_json_len=struct.unpack('i',head)[0] # unpack the header length
    head_json=json.loads(client.recv(head_json_len).decode('utf-8')) # get the header
    data_len=head_json['data_size'] # read the information in the header

    # start receiving the data
    recv_size=0
    recv_data=b''
    while recv_size < data_len:
        recv_data+=client.recv(1024)
        recv_size=len(recv_data)

    print(recv_data.decode('utf-8'))
    #print(recv_data.decode('gbk')) # Windows uses GBK encoding by default
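
To see the whole scheme work end to end without a remote machine, here is a compact sketch. The helper names send_msg, recv_msg and recv_exact are made up, the payloads are stand-ins for command output, socket.socketpair() replaces the real connection, and '!I' is used instead of 'i' so the byte order is explicit.

```python
import json
import socket
import struct


def recv_exact(sock, size):
    """Receive exactly size bytes, looping over short reads."""
    data = b''
    while len(data) < size:
        chunk = sock.recv(size - len(data))
        if not chunk:
            raise ConnectionError('connection closed early')
        data += chunk
    return data


def send_msg(sock, body):
    """Send header length (4 bytes) + JSON header + body in one sendall."""
    head = json.dumps({'data_size': len(body)}).encode('utf-8')
    sock.sendall(struct.pack('!I', len(head)) + head + body)


def recv_msg(sock):
    """Receive one framed message and return its body."""
    head_len = struct.unpack('!I', recv_exact(sock, 4))[0]
    head = json.loads(recv_exact(sock, head_len).decode('utf-8'))
    return recv_exact(sock, head['data_size'])


a, b = socket.socketpair()
send_msg(a, b'result of top' * 100)   # two results sent back to back
send_msg(a, b'result of df')
first = recv_msg(b)
second = recv_msg(b)
print(len(first), second)
a.close()
b.close()
```

Even though both results are already sitting in the buffer when the receiver starts, each recv_msg call returns exactly one of them. The header parts are also read with recv_exact, since recv(4) is allowed to return fewer than 4 bytes.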

Origin www.cnblogs.com/yuexijun/p/11410305.html