1. Socket principle

These are brief notes taken while studying the following articles:
https://www.jianshu.com/p/066d99da7cbd
http://c.biancheng.net/view/2351.html

1.1 What is Socket

In computer communication, a socket is a convention, or a method, for communication between computers. Through a socket, a computer can receive data from other computers and send data to them.
In UNIX/Linux, resources are operated on in an "open -> read/write -> close" pattern.
My understanding is that a socket is an implementation of this pattern: a socket is a special file, and the socket functions are operations on it (read/write I/O, open, close).
The socket() function returns an integer socket descriptor, and subsequent operations such as connection establishment and data transmission all go through that descriptor.

Baidu Encyclopedia: A socket is an abstraction layer through which an application can send or receive data, and it can be opened, read, written, and closed like a file. Sockets allow applications to plug their I/O into the network and communicate with other applications on the network. A network socket is a combination of an IP address and a port.

1.2 sockets in UNIX/Linux

In UNIX/Linux, in order to unify operations on all kinds of hardware and simplify the interface, hardware devices are also treated as files. Operating on these files is equivalent to operating on ordinary files on disk.

You may have heard many experts say that everything in UNIX/Linux is a file. They are right.

To represent and distinguish open files, UNIX/Linux assigns each one an ID. This ID is an integer called a file descriptor. For example:
0 usually denotes the standard input file (stdin), whose corresponding hardware device is the keyboard;
1 usually denotes the standard output file (stdout), whose corresponding hardware device is the monitor.

When a UNIX/Linux program performs any form of I/O, it reads from or writes to a file descriptor. A file descriptor is just an integer associated with an open file, and behind it may be an ordinary file on the hard disk, a FIFO, a pipe, a terminal, a keyboard, a monitor, or even a network connection.

Note that a network connection is also a file, and it has file descriptors too! You must understand this sentence.

We can use the socket() function to create a network connection, or open a network file, and the return value of socket() is the file descriptor. With the file descriptor, we can use common file operation functions to transfer data, for example:
use read() to read data from a remote computer;
use write() to write data to a remote computer.

You see, as long as you use socket() to create a connection, the rest is file operations. Network programming is so simple!
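As a minimal sketch of the "a socket is a file" idea, the snippet below reads and writes a socket through its raw file descriptor using Python's generic os.read() and os.write(). It assumes a UNIX/Linux system, and it uses socket.socketpair() as a stand-in for a real network connection so that no server is needed; neither choice comes from the original notes.

```python
import os
import socket

# socketpair() returns two already-connected sockets; it stands in here
# for a real network connection so the example needs no server.
left, right = socket.socketpair()

fd = left.fileno()              # the integer file descriptor behind the socket
print(type(fd), fd >= 0)

# On UNIX/Linux the generic file functions accept a socket descriptor.
os.write(left.fileno(), b"hello")
data = os.read(right.fileno(), 1024)
print(data)

left.close()
right.close()
```

On Windows the same calls would not be expected to work, which matches the distinction described in the next section.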

1.3 sockets in Windows

Windows also has a concept similar to the file descriptor, but it is usually called a "file handle". This tutorial therefore says "handle" when referring to Windows and "descriptor" when referring to Linux.

Unlike UNIX/Linux, Windows distinguishes between sockets and files. It treats a socket as a network connection, so data must be transferred with functions designed specifically for sockets; the input and output functions for ordinary files do not work on them.

2. How do processes communicate in the network

2.1 Local inter-process communication

a. Message passing (pipe, FIFO, message queue)
b. Synchronization (mutex, condition variable, read-write lock, file and record locks, semaphore)
c. Shared memory (anonymous and named, e.g. channel)
d. Remote procedure call (RPC)

2.2 How do processes communicate in the network

To understand how processes communicate over a network, we have to solve two problems:
a. How to identify a host, that is, how to determine which host the target process runs on.
b. How to identify a unique process. Locally a process is identified by its PID; how should it be identified on the network?
Solution:
a. The TCP/IP protocol family solves this for us. The IP address of the network layer uniquely identifies a host on the network.
b. The "protocol + port" of the transport layer uniquely identifies an application (process) on that host.
Therefore, the triplet (IP address, protocol, port) identifies a process on the network, and processes can use this identifier to communicate with one another.
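The triplet can be read back from a bound socket. The sketch below is only an illustration: the helper name describe() is hypothetical, the loopback address 127.0.0.1 is an assumption, and port 0 asks the operating system to pick any free port.

```python
import socket

def describe(sock):
    """Return the (ip address, protocol, port) triplet of a bound IPv4 socket."""
    ip, port = sock.getsockname()
    protocol = "TCP" if sock.type == socket.SOCK_STREAM else "UDP"
    return ip, protocol, port

tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp.bind(("127.0.0.1", 0))      # port 0: let the OS choose a free port

udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp.bind(("127.0.0.1", 0))

tcp_id = describe(tcp)
udp_id = describe(udp)
print(tcp_id)
print(udp_id)

tcp.close()
udp.close()
```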

3. Socket communication classification

Processes on a network can communicate using the triplet [IP address, protocol, port], and a socket is the tool that uses this triplet to carry out network communication. Almost all networked applications use sockets today. Socket communication has two commonly used data transmission methods:
a. SOCK_STREAM: corresponds to the TCP protocol, a connection-oriented transmission method. Data reaches the other computer without errors; anything damaged or lost is resent, at the cost of some speed. The common HTTP protocol uses SOCK_STREAM, because the data must arrive correctly or the web page cannot be parsed.
b. SOCK_DGRAM: corresponds to the UDP protocol, a connectionless transmission method. The computer only sends the data and does not confirm delivery. If data is damaged in transit or never reaches the other computer, the protocol does nothing to remedy it; in other words, lost or damaged data is not retransmitted. Because SOCK_DGRAM does less verification work, it is more efficient than SOCK_STREAM.
For example, QQ video chat and voice chat use SOCK_DGRAM, because efficiency comes first and delay must be minimized, while correctness is secondary. Even if a small part of the data is lost, the video and audio can still be decoded normally, with at most some glitches or noise, and communication quality is not substantially affected.
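In Python the two transmission methods differ only in the second argument to socket.socket(). The following sketch sends a single UDP datagram over the loopback interface; the address 127.0.0.1, the payload, and the 2-second timeout are illustrative assumptions.

```python
import socket

# Connection-oriented (TCP) and connectionless (UDP) sockets differ only
# in the second argument passed to socket.socket().
stream_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
dgram_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# A UDP receiver only binds; there is no listen()/accept() step.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))     # port 0: let the OS choose a free port
receiver.settimeout(2.0)            # do not wait forever if the datagram is lost

# No connection is established; the destination is given on every send.
dgram_sock.sendto(b"frame-1", receiver.getsockname())
payload, sender = receiver.recvfrom(1024)
print(payload, sender)

for s in (stream_sock, dgram_sock, receiver):
    s.close()
```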
　　
Socket programming is based on TCP and UDP protocols, and their hierarchical relationship is shown in the figure below:
[Figure: hierarchical relationship between sockets and the TCP/UDP protocols]

[Figure: diagram of the socket() function]

4. Three-way handshake for Socket (TCP) connection establishment

Socket functionality boils down to three things: establishing a connection, sending data, and receiving data. The following link describes the connection establishment process:
http://c.biancheng.net/view/2351.html
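The three operations can be sketched on a single machine. In the snippet below both ends of a TCP connection live in one process on the loopback interface, which is an assumption made to keep the example self-contained; the operating system carries out the three-way handshake during connect(), so no thread is needed before accept().

```python
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))     # port 0: let the OS choose a free port
listener.listen(1)

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(listener.getsockname())   # establish connection (handshake happens here)
conn, peer = listener.accept()           # take the established connection off the queue

client.sendall(b"ping")                  # send data
request = conn.recv(1024)                # receive data
conn.sendall(b"pong")
reply = client.recv(1024)
print(request, reply)

for s in (client, conn, listener):
    s.close()
```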

Socket buffer

After each socket is created, two buffers are allocated, an input buffer and an output buffer.

write()/send() does not transmit data to the network immediately. It first writes the data into the output buffer, and the TCP protocol then sends the data from the buffer to the target machine. Once the data is in the buffer, the function can return successfully, regardless of whether the data has reached the target machine or when it is put on the network; those are matters that the TCP protocol takes care of.

The TCP protocol is independent of the write()/send() function. Data may be sent to the network as soon as it is written into the buffer, or it may accumulate in the buffer so that data from several writes is sent together. This depends on many factors, such as the network conditions at the time and whether the current thread is idle, and is not controlled by the programmer.

The same goes for the read()/recv() functions, which read data from the input buffer rather than directly from the network.

These I/O buffer properties can be summarized as follows:

Each TCP socket has its own I/O buffers;
The I/O buffers are created automatically when the socket is created;
Data left in the output buffer continues to be transmitted even after the socket is closed;
Closing the socket discards the data in the input buffer.
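The third property can be observed directly. In this sketch, which assumes a loopback TCP connection inside one process, the sender closes its socket before the receiver has read anything, and the buffered bytes still arrive.

```python
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

sender = socket.create_connection(listener.getsockname())
conn, _ = listener.accept()

sent = sender.send(b"bye")   # returns once the bytes are in the output buffer
sender.close()               # close before the receiver has read anything

first = conn.recv(1024)      # the buffered data is still delivered
second = conn.recv(1024)     # after that, recv() reports the end of the stream
print(sent, first, second)

conn.close()
listener.close()
```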

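Under normal circumstances the default buffer size does not matter, but it can be inspected and changed with the following code:

import socket
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)          # get the receive buffer size (SO_SNDBUF for the send buffer)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 65536)   # change the buffer size (65536 is only an example value)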

Send and receive function characteristics:

recv features:

If the other end of an established connection disconnects, recv returns an empty result immediately (b'' in Python 3)
recv takes content from the receive buffer and blocks while the buffer is empty
If one recv call cannot take everything in the buffer, the remainder is returned by the next call

send features:

If the other end no longer exists, send raises a broken pipe exception (BrokenPipeError in Python 3)
send places content in the send buffer and blocks while the buffer is full
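These behaviours can be checked with a short experiment. The sketch uses socket.socketpair() as a stand-in for an established TCP connection, which is an assumption for convenience; with a real TCP connection the send error may only appear on a later send, and may be reported as a connection reset instead.

```python
import socket

a, b = socket.socketpair()

a.sendall(b"hello world")
part1 = b.recv(5)        # asks for at most 5 bytes
part2 = b.recv(1024)     # the remainder is still waiting in the receive buffer

a.close()
eof = b.recv(1024)       # the peer has closed: returns b"" without blocking
b.close()

c, d = socket.socketpair()
d.close()                # the receiving end no longer exists
try:
    c.send(b"x")
    send_failed = False
except (BrokenPipeError, ConnectionResetError):
    send_failed = True
c.close()

print(part1, part2, eof, send_failed)
```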

This is the blocking mode of TCP sockets. Blocking means that while the current operation has not completed, the next one is suspended until it does, which keeps the two sides synchronized. TCP sockets are in blocking mode by default, and it is also the most commonly used mode.
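One way to observe blocking without hanging the program is to put a timeout on the socket. The sketch below is only an illustration: the 0.2-second timeout is arbitrary, and socket.socketpair() again stands in for a TCP connection.

```python
import socket

a, b = socket.socketpair()

b.settimeout(0.2)            # give up after 0.2 s instead of blocking forever
try:
    b.recv(1024)             # nothing has been sent, so this call waits
    timed_out = False
except socket.timeout:       # the wait ended because of the timeout
    timed_out = True

a.sendall(b"now")
data = b.recv(1024)          # data is in the buffer, so recv() returns at once
print(timed_out, data)

a.close()
b.close()
```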

Sticky packet problem of TCP protocol

From the way data moves through the socket buffers, it can be seen that receiving and sending are independent of each other. read()/recv() receives as much data as it can, no matter how many separate sends produced it. In other words, the number of read()/recv() calls and the number of write()/send() calls may differ.
For example, if write()/send() is called three times, sending the string "abc" each time, then read()/recv() on the target machine may receive three times, getting "abc" each time; or twice, getting "abcab" and then "cabc"; or once, getting "abcabcabc".

Suppose we want the client to send one student ID at a time and the server to return that student's name, address, grades and other information. A problem can arise here: the server cannot tell the student IDs apart. For example, if 1 is sent first and 3 is sent second, the server may treat them as 13, and the information it returns is obviously wrong.

This is the "sticky packet" problem: several packets sent by the client are received as a single packet. It is also described as data having no boundaries. read()/recv() does not know where a packet starts or ends (in fact, there are no start or end markers) and simply treats everything as one continuous data stream.
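A common remedy, which is not taken from the original notes, is to add boundaries yourself, for example by prefixing every message with its length. The helper names send_msg, recv_exact and recv_msg below are hypothetical, and the 4-byte big-endian length header is one possible choice among many.

```python
import socket
import struct

def send_msg(sock, payload):
    """Send one message as a 4-byte big-endian length followed by the payload."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_exact(sock, n):
    """Read exactly n bytes, calling recv() as many times as needed."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed before the message was complete")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)

def recv_msg(sock):
    """Read the length header, then read exactly that many payload bytes."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)

# socketpair() stands in for a client/server TCP connection.
client, server = socket.socketpair()
send_msg(client, b"1")
send_msg(client, b"3")
ids = [recv_msg(server), recv_msg(server)]
print(ids)

client.close()
server.close()
```

With this framing the server reads the two student IDs as separate messages even if they arrive in the buffer together.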

Really close the socket

The close method releases the resources associated with a connection, but does not necessarily close the connection immediately. To close the connection promptly, call the shutdown method before close. shutdown takes one of three modes: SHUT_RD closes the receiving channel, SHUT_WR closes the sending channel, and SHUT_RDWR closes both.
In other words, to close a connection, first shut down its channels and then release the connection. The three constants correspond to the integers 0, 1, and 2.

self.tcpClient.shutdown(2)      # close both the receive and send channels (2 == SHUT_RDWR)
self.tcpClient.close()          # close the socket connection
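The difference between the modes shows up in a half-close. In this sketch, which again uses socket.socketpair() in place of a real TCP connection, one side shuts down only its sending channel and can still receive the reply.

```python
import socket

client, server = socket.socketpair()

client.sendall(b"request")
client.shutdown(socket.SHUT_WR)    # stop sending; receiving still works

request = server.recv(1024)
end = server.recv(1024)            # b"": the client has finished sending
server.sendall(b"response")
response = client.recv(1024)       # the receive channel is still open

server.shutdown(socket.SHUT_RDWR)  # close both channels, then release the socket
server.close()
client.close()

modes = (socket.SHUT_RD, socket.SHUT_WR, socket.SHUT_RDWR)
print(request, end, response, modes)
```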

Python socket programming example 1 - text transmission

client:

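import socket

# port = int(input('please input sever port:'))
host = '192.168.2.107'    # IP address of the server the client connects to
port = 5270               # server port
sever_address = (host, port)     # (host, port) tuple passed to socket.connect()
text_client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)     # create a TCP socket object named text_client

text_client.connect(sever_address)       # connect to the server
succeed_flag = 'cok'                     # flag for a successfully sent message (not used below)
text = 'connect succeed'
try:
    while True:
        text_client.sendall(text.encode())   # send the text; encode() converts str to bytes
        text = input('please input the message')
        receive_text = text_client.recv(1024).decode()   # read the server's acknowledgement
        if not receive_text:                 # empty result: the server closed the connection
            break
        print(receive_text)
finally:
    text_client.close()                      # always release the socket, even on error or Ctrl+C
    print('send over!')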

server:

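import socket

# Enter the local IP address manually; if the machine has several network
# interfaces, use the IP of the interface the server should receive on.
# hostname = socket.gethostname()          # alternatively, look the host IP up automatically
# host = socket.gethostbyname(hostname)
host = '192.168.2.107'    # IP address the server binds to (the address clients connect to)
port = 5270               # port
sever_address = (host, port)     # (host, port) tuple passed to socket.bind()

text_sever = socket.socket(socket.AF_INET, socket.SOCK_STREAM)     # create a TCP socket object named text_sever to act as the server
text_sever.bind(sever_address)    # bind() ties the socket to the address; the server listens on this port
text_sever.listen(4)              # start listening; up to 4 pending connections may wait in the queue

succeed_flag = 'sok'
try:
    while True:
        print(host)
        print('waiting connect')
        text_client_socket, text_client_address = text_sever.accept()   # accept() blocks until a client connects
        print(text_client_address[0] + ' is connected!')
        try:
            while True:
                receive_text = text_client_socket.recv(1024).decode()   # receive at most 1024 bytes from the client
                if not receive_text:      # empty result: the client closed the connection
                    break
                print(receive_text)
                text_client_socket.sendall(succeed_flag.encode())       # reply to confirm the data was received
        finally:
            text_client_socket.close()    # release this client's socket before accepting the next one
finally:
    text_sever.close()
    print('work over!')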

Python socket programming example 2 - video transmission