Python socket: solving the TCP sticky-packet problem

1. Why do sticky packets occur?

 

Let us first write a TCP-based remote command execution program and try three cases: (1) a command that fails, (2) running ls, (3) running ifconfig.

Note:

res=subprocess.Popen(cmd.decode('utf-8'),
shell=True,
stderr=subprocess.PIPE,
stdout=subprocess.PIPE)

The output is encoded in the default encoding of the system the command runs on. On Windows (with a Chinese locale), res.stdout.read() returns GBK-encoded bytes, so the receiving end must decode them with GBK.
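As a hedged illustration of the note above, the sketch below decodes command output with the encoding reported by locale.getpreferredencoding instead of a hard-coded name. It assumes the command writes its output in the system's preferred encoding, which is common but not guaranteed. The helper name run_and_decode is made up for this example.

```python
import locale
import subprocess


def run_and_decode(cmd):
    """Run a shell command and return its output as text (hypothetical helper)."""
    res = subprocess.Popen(cmd, shell=True,
                           stderr=subprocess.PIPE,
                           stdout=subprocess.PIPE)
    out, err = res.communicate()  # reads both pipes without risking a deadlock
    # Assumption: the output uses the system's preferred encoding
    # (for example GBK on a Chinese-locale Windows, usually UTF-8 on Linux).
    encoding = locale.getpreferredencoding(False)
    return (err or out).decode(encoding, errors='replace')


text = run_and_decode('echo hello')
print(text)
```

If the output still looks garbled, the command is probably using a different encoding than the one assumed here, and the name has to be set by hand.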

 

The sender may write data 1 KB at a time, while the receiving application may read it 2 KB at a time, or 3 KB or 6 KB at once, or only a few bytes per call. In other words, the application sees the data as one continuous stream, and how many bytes made up each message is invisible to it. TCP is therefore a stream-oriented protocol, and this is what makes the sticky-packet problem easy to run into. UDP, by contrast, is a message-oriented protocol: each UDP datagram is one message, and the application must read data in units of whole messages rather than an arbitrary number of bytes. This is where it differs from TCP. How is a message defined? You can treat whatever the peer writes in a single write/send call as one message. What matters is that when the peer sends a message, however the lower layers fragment it, the TCP layer puts the segments that make up the message back in order before presenting the data in the kernel buffer.

For example, when a TCP socket client uploads a file to a server, the file content is sent as a stream of bytes, one chunk after another. Looking at that stream, the receiver has no way of knowing where the file's bytes start and where they stop.

The so-called sticky-packet problem arises mainly because the receiver does not know where one message ends and the next begins, so it does not know how many bytes to read at a time.

In addition, sticky packets on the sending side are caused by TCP itself. To improve transmission efficiency, the sender often waits until it has collected enough data before sending a TCP segment. If several consecutive send calls each carry very little data, TCP's optimization algorithm usually merges them into a single segment and sends them together, so the receiver gets the data stuck together.
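The stream behaviour described above can be seen in a few lines. This is a minimal sketch, not the remote command program itself: it assumes socket.socketpair() as a stand-in for a connected client and server, so that no port or second process is needed.

```python
import socket

# Assumption: socket.socketpair() stands in for a connected client/server pair.
# It returns two connected stream sockets inside one process.
sender, receiver = socket.socketpair()

sender.sendall(b'hello')  # first "message"
sender.sendall(b'feng')   # second "message"
sender.close()            # lets the receiver see end-of-stream

first = receiver.recv(2)  # take only two bytes of the stream
rest = b''
while True:
    chunk = receiver.recv(1024)
    if not chunk:         # b'' means the sender closed the connection
        break
    rest += chunk
receiver.close()

print(first)  # b'he'
print(rest)   # b'llofeng'
```

The receiver never sees where 'hello' ended and 'feng' began. It only sees nine bytes that it can cut up however it likes.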

  1. TCP (Transmission Control Protocol) is a connection-oriented, stream-oriented, highly reliable service. The sending and receiving ends (client and server) each need a socket, and the sockets are paired one to one. To send many small packets to the receiver more efficiently, the sender uses an optimization (the Nagle algorithm) that merges several small pieces of data sent at short intervals into one larger block before sending it. The receiver then cannot tell the pieces apart and must use a well-defined unpacking mechanism. In other words, stream-oriented communication has no protected message boundaries.
  2. UDP (User Datagram Protocol) is a connectionless, message-oriented service that aims at efficiency. It does not merge blocks with such an optimization algorithm. Because UDP supports one-to-many communication, the receiver's skbuff (socket buffer) uses a chained structure to record each arriving UDP packet, and every UDP packet carries a header (with the source address and port), so the receiver can easily tell the packets apart. In other words, message-oriented communication has protected message boundaries.
  3. TCP is stream-based, so an empty message cannot be sent or received; both client and server need to handle empty input to keep the program from blocking. UDP is datagram-based, so even if you enter nothing (just press Enter), what gets sent is not an empty message, because UDP still wraps it in a message header. A small experiment confirms this.

With UDP, recvfrom blocks until a datagram arrives, and each recvfrom(x) is matched by exactly one sendto(y): once it has read x bytes the call is finished, and if y > x the remaining data is lost. So UDP never produces sticky packets, but it can lose data and is unreliable.

With TCP, data is not lost: if one recv does not take a whole message, the next recv continues from where the last one stopped, and the sending side only clears its buffer once it receives an ACK. The data is reliable, but packets can stick together.
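The small UDP experiment mentioned above might look like the following sketch. It assumes both sockets live in one process on the loopback interface, and that port 0 is used so the OS picks any free port. On loopback the datagrams normally arrive in order, though UDP itself does not promise that.

```python
import socket

# Assumption: sender and receiver share one process on 127.0.0.1,
# and port 0 asks the OS for any free port.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(('127.0.0.1', 0))
receiver.settimeout(2)  # fail instead of hanging if a datagram goes missing
addr = receiver.getsockname()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b'hello', addr)
sender.sendto(b'', addr)      # an empty datagram is still a message
sender.sendto(b'feng', addr)

# Each recvfrom returns exactly one datagram, never two merged together.
messages = [receiver.recvfrom(1024)[0] for _ in range(3)]
print(messages)

sender.close()
receiver.close()
```

Three sendto calls produce three separate results, including the empty one, which is the opposite of what the TCP sockets do with back-to-back sends.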

 

Sticky packets occur in two situations.

First, the sender waits for its buffer to fill before sending, which produces sticky packets (the sends are close together in time and the data is small, so they are merged and sent together).

#_*_coding:utf-8_*_
__author__ = 'Linhaifeng'
from socket import *
ip_port=('127.0.0.1',8080)

tcp_socket_server=socket(AF_INET,SOCK_STREAM)
tcp_socket_server.bind(ip_port)
tcp_socket_server.listen(5)


conn,addr=tcp_socket_server.accept()


data1=conn.recv(10)
data2=conn.recv(10)

print('----->',data1.decode('utf-8'))
print('----->',data2.decode('utf-8'))

conn.close()

Server
#_*_coding:utf-8_*_
__author__ = 'Linhaifeng'
import socket
BUFSIZE=1024
ip_port=('127.0.0.1',8080)

s=socket.socket(socket.AF_INET,socket.SOCK_STREAM)
res=s.connect_ex(ip_port)


s.send('hello'.encode('utf-8'))
s.send('feng'.encode('utf-8'))

Client

Second, the receiver does not take data out of its buffer in time, so a single recv ends up spanning several packets (the client sends one piece of data, the server receives only a small part of it, and on the next recv the server first takes the data left over in the buffer, which produces a sticky packet).

#_*_coding:utf-8_*_
__author__ = 'Linhaifeng'
from socket import *
ip_port=('127.0.0.1',8080)

tcp_socket_server=socket(AF_INET,SOCK_STREAM)
tcp_socket_server.bind(ip_port)
tcp_socket_server.listen(5)


conn,addr=tcp_socket_server.accept()


data1=conn.recv(2) # receives only part of the message
data2=conn.recv(10) # the next recv first takes the leftover old data, then any new data

print('----->',data1.decode('utf-8'))
print('----->',data2.decode('utf-8'))

conn.close()

Server

  

#_*_coding:utf-8_*_
__author__ = 'Linhaifeng'
import socket
BUFSIZE=1024
ip_port=('127.0.0.1',8080)

s=socket.socket(socket.AF_INET,socket.SOCK_STREAM)
res=s.connect_ex(ip_port)


s.send('hello feng'.encode('utf-8'))

Client

  

When packets are split

When the data in the sender's buffer is longer than the network card's MTU, TCP splits it into several packets before sending.

Supplementary question 1: why is TCP reliable and UDP unreliable?

For TCP-based data transmission, please refer to my other article http://www.cnblogs.com/linhaifeng/articles/5937962.html. During a TCP transfer, the sender first puts the data into its own buffer, and the protocol then sends the buffered data to the peer. When the peer acknowledges the data with an ACK, the sender clears it from its buffer; if no acknowledgement arrives, the sender retransmits the data. That is why TCP is reliable.

When UDP sends data, the peer does not return any acknowledgement, so UDP is unreliable.

Supplementary question 2: send(byte stream), recv(1024), and sendall

The 1024 passed to recv means taking at most 1024 bytes out of the buffer in one call.

send puts the byte stream into the sender's own buffer, and the protocol then sends the buffered content to the peer. If the byte stream is larger than the remaining buffer space, send may accept only part of it and returns the number of bytes it actually took; the rest is lost unless the caller sends it again. sendall calls send in a loop until everything has been sent, so no data is lost.
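To make the difference between send and sendall concrete, here is a rough sketch of the loop that sendall runs for you. The names send_all and read_until_closed are made up for this example, and socket.socketpair() is assumed as a stand-in for a real client/server connection.

```python
import socket
import threading


def send_all(sock, data):
    """Send every byte of data by calling send in a loop (hypothetical helper)."""
    total_sent = 0
    while total_sent < len(data):
        sent = sock.send(data[total_sent:])  # may accept fewer bytes than offered
        if sent == 0:
            raise RuntimeError('socket connection broken')
        total_sent += sent
    return total_sent


def read_until_closed(sock, chunks):
    """Collect everything the peer sends until it closes the connection."""
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        chunks.append(chunk)


# Assumption: socket.socketpair() stands in for a real client/server connection.
a, b = socket.socketpair()
payload = b'x' * 1000000  # larger than a typical socket buffer
chunks = []
reader = threading.Thread(target=read_until_closed, args=(b, chunks))
reader.start()

sent_total = send_all(a, payload)
a.close()
reader.join()
b.close()

received = b''.join(chunks)
print(sent_total, len(received))
```

In everyday code, calling sock.sendall(data) is the simpler choice; the loop is spelled out here only to show why a bare send is not enough for large data.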

 

The root of the problem is that the receiver does not know how long the byte stream the sender is about to transmit will be. So every solution to sticky packets comes down to this: before sending the data, the sender tells the receiver the total size of the byte stream it is about to send, and the receiver then loops on recv until it has received all of it.

A program runs far faster than the network transmits, so sending the length with its own send call before every byte stream amplifies the performance cost of network latency.

2. How to solve the sticky-packet problem

Add a custom fixed-length header to the byte stream. The header contains the length of the byte stream and is sent to the peer along with it. On receiving, the peer first takes the fixed-length header out of its buffer and then takes the real data.

struct module 

This module converts a value of a given type, such as a number, into bytes of a fixed length.

>>> struct.pack('i',1111111111111)

. . . . . . . . .

struct.error: 'i' format requires -2147483648 <= number <= 2147483647 # this is the valid range
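A short sketch of what struct does with a length value may help here. It uses only struct.pack and struct.unpack from the standard library. The 'q' format (an 8-byte integer) is shown as one possible option for sizes that do not fit into 'i'; it is not something the rest of this article relies on.

```python
import struct

packed = struct.pack('i', 1234)        # a 4-byte signed integer
print(len(packed))                     # 4
print(struct.unpack('i', packed)[0])   # 1234

try:
    struct.pack('i', 1111111111111)    # too large for 4 bytes
    overflowed = False
except struct.error as exc:
    overflowed = True
    print('struct.error:', exc)

# 'q' is an 8-byte signed integer, large enough for a 1 TB file size.
big = struct.pack('q', 1073741824000)
print(len(big))                        # 8
print(struct.unpack('q', big)[0])      # 1073741824000
```

Because the packed result always has the same length, the receiver knows exactly how many bytes to read before it can learn anything else about the message.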

import json,struct
# Suppose the client uploads a 1 TB file a.txt (1073741824000 bytes)

# To avoid sticky packets, define a custom header
header={'file_size':1073741824000,'file_name':'/a/b/c/d/e/a.txt','md5':'8f6fbf8347faa4924a76856701edb0f3'} # size of the 1 TB data, file path, and md5 value

# To send the header it must be serialized and converted to bytes
head_bytes=bytes(json.dumps(header),encoding='utf-8') # serialize and convert to bytes for transmission

# So that the peer knows the header length, use struct to turn that number into a fixed length: 4 bytes
head_len_bytes=struct.pack('i',len(head_bytes)) # these 4 bytes hold a single number, the header length

# The client starts sending
conn.send(head_len_bytes) # first send the header length, 4 bytes
conn.send(head_bytes) # then send the header bytes
# finally send the real content with conn.sendall

# The server starts receiving
head_len_bytes=s.recv(4) # first receive the 4 bytes that hold the header length
x=struct.unpack('i',head_len_bytes)[0] # extract the header length

head_bytes=s.recv(x) # receive x bytes, the header in bytes format
header=json.loads(head_bytes.decode('utf-8')) # deserialize the header

# Finally, use the header content to receive the real data, for example
real_data_len=header['file_size']
# then call s.recv in a loop until real_data_len bytes have arrived

  

We can make the header a dictionary containing the details of the real data to be sent, serialize it with json, and then use struct to pack the length of the serialized data into 4 bytes (4 bytes is enough for this).

When sending:

First send the header length

Then encode the header content and send it

Finally send the real content

When receiving:

First receive the header length and unpack it with struct

Then receive the header content according to that length, decode it, and deserialize it

Take the details of the data from the deserialized result, then receive the real data
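The send and receive steps above can be sketched as three small helpers. This is a minimal illustration rather than the article's own program: recv_exact, send_msg, and recv_msg are names made up for this example, the header key data_size is likewise an assumption, and socket.socketpair() stands in for the client/server connection.

```python
import json
import socket
import struct


def recv_exact(sock, n):
    """Read exactly n bytes, looping because recv may return fewer."""
    buf = b''
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError('peer closed before the full message arrived')
        buf += chunk
    return buf


def send_msg(sock, body, **details):
    """Send the header length, then the JSON header, then the real content."""
    header = dict(details, data_size=len(body))
    head_bytes = json.dumps(header).encode('utf-8')
    sock.sendall(struct.pack('i', len(head_bytes)))
    sock.sendall(head_bytes)
    sock.sendall(body)


def recv_msg(sock):
    """Receive one message framed by send_msg and return (header, body)."""
    head_len = struct.unpack('i', recv_exact(sock, 4))[0]
    header = json.loads(recv_exact(sock, head_len).decode('utf-8'))
    body = recv_exact(sock, header['data_size'])
    return header, body


# Assumption: socket.socketpair() stands in for the client/server connection.
client, server = socket.socketpair()
send_msg(client, b'hello', name='first')
send_msg(client, b'feng', name='second')

header1, body1 = recv_msg(server)
header2, body2 = recv_msg(server)
print(header1, body1)
print(header2, body2)

client.close()
server.close()
```

Even though the two messages are sent back to back, each recv_msg call returns exactly one of them, because the receiver always knows how many bytes belong to the current message.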

# Add a custom fixed-length header to the byte stream; the header contains the length of the byte stream and is sent to the peer along with it.
# On receiving, the peer first takes the fixed-length header out of its buffer and then takes the real data.

# struct module
# This module converts a value of a given type, such as a number, into bytes of a fixed length.



'''
We can make the header a dictionary containing the details of the real data to be sent, serialize it with json,
and then use struct to pack the length of the serialized data into 4 bytes (4 bytes is enough for this).
When sending:
first send the header length
then encode the header content and send it
finally send the real content

When receiving:
first receive the header length and unpack it with struct
then receive the header content according to that length, decode it, and deserialize it
take the details of the data from the deserialized result, then receive the real data
'''
# >>> struct.pack("i","abc")
# Traceback (most recent call last):
#   File "<pyshell#1>", line 1, in <module>
#     struct.pack("i","abc")
# struct.error: required argument is not an integer

# Server (with a slightly more complex custom header)

import socket,struct,json
import subprocess


phone=socket.socket(socket.AF_INET,socket.SOCK_STREAM)
ip_sort=("127.0.0.1",8080)
back_log=5
phone.setsockopt(socket.SOL_SOCKET,socket.SO_REUSEADDR,1)
phone.bind(ip_sort)
phone.listen(back_log)

while True:
    conn,addr=phone.accept()
    while True:
        cmd=conn.recv(1024)
        if not cmd:break
        print("cmd: %s" %cmd)
        res=subprocess.Popen(cmd.decode("utf-8"),shell=True,stdout=subprocess.PIPE,stderr=subprocess.PIPE
                             ,stdin=subprocess.PIPE)
        err=res.stderr.read()
        print(err)
        if err:
            back_msg=err
        else:
            back_msg=res.stdout.read()

        headers={'Data_Size':len(back_msg)}
        head_json=json.dumps(headers) # serialize to a string
        print(type(head_json))
        head_json_bytes=bytes(head_json,encoding="utf-8")


        # struct.pack("i", ...) packs a number into 4 bytes; the second argument must be an integer
        conn.send(struct.pack("i",len(head_json_bytes))) # first send the header length
        conn.send(head_json_bytes) # then send the header
        conn.sendall(back_msg) # finally send the real content

    conn.close()

  

# Client for the sticky-packet solution
import socket,struct,json

ip_port=("127.0.0.1",8080)

tcp_client=socket.socket(socket.AF_INET,socket.SOCK_STREAM)

tcp_client.connect_ex(ip_port)

while True:
    cmd=input(">> ")
    if not cmd:continue
    tcp_client.send(cmd.encode('utf-8'))


    data=tcp_client.recv(4) # receive the header length
    num=struct.unpack("i",data)[0] # unpack returns a tuple; take its first element
    print(num)
    header=json.loads(tcp_client.recv(num).decode("utf-8")) # receive the header using the header length
    data_len=header["Data_Size"] # length of the real content

    recv_size=0
    recv_data=b''
    while recv_size<data_len:
        recv_data+=tcp_client.recv(min(1024,data_len-recv_size)) # never read past this message
        recv_size=len(recv_data)

    print(recv_data.decode("gbk")) # Windows output is GBK-encoded by default; use "utf-8" elsewhere

 


Origin www.cnblogs.com/tangcode/p/11620151.html