036 Python: large file downloads with a progress bar

Review

1. The sticky packet problem

Causes of the sticky packet problem:

  • Protocol characteristics: TCP is stream-oriented and, to guarantee reliable transmission, applies many optimization mechanisms.
  • No boundaries: once a connection is established, there are no boundaries between the pieces of data sent over it.
  • The number of send calls and the number of recv calls do not have to match.
  • Buffering: a message may still sit in the sender's buffer without having been sent, or sit in the receiver's buffer without having been read.

Solutions:

  • Define a custom application-layer protocol.
    Solution one: sending a single piece of data
  • First send a fixed-length header that holds the length in bytes of the data to follow; the receiver first reads that fixed-length header.
  • Then send the data itself; the receiver reads exactly as many bytes as the header announced.
    Solution two: sending several pieces of information
  • First send a fixed-length header that holds the length in bytes of a dictionary; the receiver first reads that fixed-length header.
  • Then send the dictionary (it stores the file information); the receiver reads it according to the announced length.
  • Finally send the file; the receiver reads the content according to the information in the dictionary.

So how do you choose between these two solutions?
If your custom protocol has to send more than one variable before the data itself, use the dictionary.
Otherwise you need several separate sends, for example: the length of the file name, the file name, the file size, the content length, the file content ...

Every extra send over the network wastes time.
If the five variables above are put into one dictionary, it only has to be sent once.

If there is only one variable, use the first solution.
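The dictionary-based solution can be sketched as follows. This is a minimal illustration, not the code used later in this post: the helper names recv_exact, send_header and recv_header are made up for the example, the header uses network byte order ('!i') rather than the native 'i' used below, and socket.socketpair() stands in for a real client/server connection.

```python
import json
import socket
import struct

HEADER = struct.Struct('!i')  # 4-byte length prefix (assumption: network byte order)

def recv_exact(sock, n):
    """Read exactly n bytes; recv() alone may return fewer than requested."""
    chunks = []
    remaining = n
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError('peer closed before sending all data')
        chunks.append(chunk)
        remaining -= len(chunk)
    return b''.join(chunks)

def send_header(sock, info):
    """Send the fixed-length prefix followed by the JSON-encoded dictionary."""
    body = json.dumps(info).encode('utf-8')
    sock.sendall(HEADER.pack(len(body)) + body)

def recv_header(sock):
    """Read the fixed-length prefix, then exactly that many bytes of JSON."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return json.loads(recv_exact(sock, length).decode('utf-8'))

# Demo over a connected pair of sockets instead of a real network connection
a, b = socket.socketpair()
send_header(a, {'filename': 'demo.bin', 'filesize': 3})
a.sendall(b'abc')                            # the "file" content
info = recv_header(b)
payload = recv_exact(b, info['filesize'])    # read only what the header announced
a.close()
b.close()
print(info, payload)
```

Because the receiver always knows how many bytes to expect next, the header and the content cannot run into each other even though both travel over the same stream.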

Large file downloads and progress bar display

Large file transfers:
server.py


import os
import json
import socket
import struct
filepath = r'E:\BaiduYunDownload\[电影天堂www.dy2018.com]移动迷宫3:死亡解药BD国英双语中英双字.mp4'

sk = socket.socket()
sk.bind(('127.0.0.1',9000))
sk.listen()

conn,addr = sk.accept()
filename = os.path.basename(filepath)
filesize = os.path.getsize(filepath)
dic = {'filename':filename,'filesize':filesize}
str_dic = json.dumps(dic).encode('utf-8')
len_dic = len(str_dic)
length = struct.pack('i',len_dic)
conn.sendall(length)   # length of the dictionary
conn.sendall(str_dic)  # the dictionary itself
with open(filepath,'rb') as f:  # the file
    while filesize:
        content = f.read(4096)
        if not content:  # the file ended early; stop instead of looping forever
            break
        conn.sendall(content)  # sendall keeps sending until the whole chunk is out
        # Do not subtract a fixed 4096 here: the last chunk of the file
        # may be only 3 bytes, so subtract len(content), the amount actually read.
        filesize -= len(content)
conn.close()
sk.close()

client.py


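import json
import struct
import socket

sk = socket.socket()
sk.connect(('127.0.0.1',9000))

dic_len = sk.recv(4)
dic_len = struct.unpack('i',dic_len)[0]
dic = sk.recv(dic_len)
str_dic = dic.decode('utf-8')
dic = json.loads(str_dic)
with open(dic['filename'],'wb') as f:  # 'wb' is stricter, although 'ab' would also work
    while dic['filesize']:
        # The 4096 does not have to match the server; 1024 would work too.
        # min() makes sure we never ask for more than what is left of the file.
        content = sk.recv(min(4096, dic['filesize']))
        if not content:  # the server closed the connection early
            break
        dic['filesize'] -= len(content)
        f.write(content)
sk.close()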

Run server.py first, then client.py. After 30 seconds or more, a video file appears in the current directory; open it to confirm that it plays.

The user experience is poor: while receiving, you cannot tell whether the program is stuck or how long the download has taken. Let's upgrade it by adding the download time and a progress bar.

The main change is in client.py; the code is as follows:


import json
import struct
import socket
import sys
import time

def processBar(num, total):  # progress bar
    rate = num / total
    rate_num = int(rate * 100)
    if rate_num == 100:
        r = '\r%s>%d%%\n' % ('=' * rate_num, rate_num,)
    else:
        r = '\r%s>%d%%' % ('=' * rate_num, rate_num,)
    sys.stdout.write(r)
    sys.stdout.flush()  # flush has to be called; a bare sys.stdout.flush does nothing

start_time = time.time()  # start time

sk = socket.socket()
sk.connect(('127.0.0.1',9000))

dic_len = sk.recv(4)
dic_len = struct.unpack('i',dic_len)[0]
dic = sk.recv(dic_len)
str_dic = dic.decode('utf-8')
dic = json.loads(str_dic)
with open(dic['filename'],'wb') as f:  # 'wb' is stricter, although 'ab' would also work
    content_size = 0
    while True:
        content = sk.recv(4096)
        if not content:break  # connection closed early; avoid looping forever
        f.write(content)  # write to the file
        content_size += len(content)  # bytes received so far
        processBar(content_size,dic['filesize'])  # update the progress bar
        if content_size == dic['filesize']:break  # stop once the total received equals the file size

sk.close()  # close the connection

end_time = time.time()  # end time
print('This download took {} seconds'.format(end_time - start_time))

The result is as follows:

[Figure: execution result]

The output above prints 100 equals signs, which is too long, so scale it down to 1/3.
Modify the progress bar function:


def processBar(num, total):  # progress bar
    rate = num / total
    rate_num = int(rate * 100)
    if rate_num == 100:
        r = '\r%s>%d%%\n' % ('=' * int(rate_num / 3), rate_num,)  # divide by 3 so only 1/3 as many equals signs are printed
    else:
        r = '\r%s>%d%%' % ('=' * int(rate_num / 3), rate_num,)
    sys.stdout.write(r)
    sys.stdout.flush()

Run it again:

[Figure: execution result]
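Dividing the number of equals signs by 3 is one fixed scaling factor. As a rough sketch of the same idea with a configurable width, the function below builds the bar as a string instead of printing it; the name render_bar and its width and fill parameters are invented for this example and are not part of the original client.

```python
def render_bar(num, total, width=33, fill='='):
    """Return a text progress bar such as '=====>50%'.

    width is the number of fill characters shown at 100%.
    """
    rate = num / total
    percent = int(rate * 100)
    filled = int(rate * width)  # scale the bar to the requested width
    return '{}>{}%'.format(fill * filled, percent)

print(render_bar(50, 100, width=20))
```

To use it the way processBar does, write '\r' + render_bar(received, total) to sys.stdout and call sys.stdout.flush() after each chunk.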

A fancier version: a green airplane.
The code is as follows:


def processBar(num, total):  # progress bar
    rate = num / total
    rate_num = int(rate * 100)
    pretty = '✈'
    if rate_num == 100:
        r = '\r\033[32m{}\033[0m{}%\n'.format(pretty * int(rate_num / 5), rate_num,)
    else:
        r = '\r\033[32m{}\033[0m{}%'.format(pretty * int(rate_num / 5), rate_num,)
    sys.stdout.write(r)
    sys.stdout.flush()

The result is as follows:

[Figure: execution result]

Next, change the color on every update. For this, a class that picks a random color is introduced. Save it as Prompt.py, because client.py imports it with "from Prompt import Prompt".


import random


class Prompt(object):  # colored prompt messages
    colour_dic = {
        'red': 31,
        'green': 32,
        'yellow': 33,
        'blue': 34,
        'purple_red': 35,
        'bluish_blue': 36,
        'white': 37,
    }

    def __init__(self):
        pass

    @staticmethod
    def display(msg, colour='white'):
        choice = Prompt.colour_dic.get(colour)
        # print(choice)
        if choice:
            info = "\033[1;{};1m{}\033[1;0m".format(choice, msg)
            return info
        else:
            return False

    @staticmethod
    def random_color(msg):  # pick a random color
        colour_list = []
        for i in Prompt.colour_dic:
            colour_list.append(i)

        length = len(colour_list) - 1  # largest valid index
        index = random.randint(0, length)  # random index

        ret = Prompt.display(msg, colour_list[index])  # message in a random color
        return ret

Modify client.py


from Prompt import Prompt

def processBar(num, total):  # progress bar
    rate = num / total
    rate_num = int(rate * 100)
    pretty = Prompt.random_color('✈')  # random color on every call
    if rate_num == 100:
        r = '\r{}{}%\n'.format(pretty * int(rate_num / 5), rate_num,)
    else:
        r = '\r{}{}%'.format(pretty * int(rate_num / 5), rate_num,)
    sys.stdout.write(r)
    sys.stdout.flush()

The result is as follows:

[Figure: execution result]

1. Add an MD5 checksum

server.py


import os
import json
import socket
import struct
import hashlib

sk = socket.socket()
sk.bind(('127.0.0.1', 9000))
sk.listen()

conn, addr = sk.accept()
filename = r'[电影天堂www.dy2018.com]移动迷宫3:死亡解药BD国英双语中英双字.mp4'  # file name
absolute_path = os.path.join(r'E:\BaiduYunDownload',filename)  # absolute path of the file
buffer_size = 1024*1024  # buffer size, 1 MB here

md5obj = hashlib.md5()
with open(absolute_path, 'rb') as f:
    while True:
        content = f.read(buffer_size)  # read buffer_size bytes at a time
        if content:
            md5obj.update(content)
        else:
            break  # stop when there is nothing left to read

md5 = md5obj.hexdigest()
print(md5)  # print the MD5 value

dic = {'filename':filename,
       'filename_md5':str(md5),'buffer_size':buffer_size,
       'filesize':os.path.getsize(absolute_path)}
str_dic = json.dumps(dic).encode('utf-8')
len_dic = len(str_dic)
length = struct.pack('i', len_dic)
conn.sendall(length)  # length of the dictionary
conn.sendall(str_dic)  # the dictionary itself
with open(absolute_path, 'rb') as f:  # the file
    while dic['filesize']:
        content = f.read(dic['buffer_size'])
        if not content:  # the file ended early; stop instead of looping forever
            break
        conn.sendall(content)
        # Do not subtract a fixed buffer size here: the last chunk may be
        # only 3 bytes, so subtract len(content), the amount actually read.
        dic['filesize'] -= len(content)
conn.close()

client.py


import json
import struct
import socket
import sys
import time
import hashlib
import os
from Prompt import Prompt

def processBar(num, total):  # progress bar
    rate = num / total
    rate_num = int(rate * 100)
    pretty = Prompt.random_color('✈')
    if rate_num == 100:
        r = '\r{}{}%\n'.format(pretty * int(rate_num / 5), rate_num,)
    else:
        r = '\r{}{}%'.format(pretty * int(rate_num / 5), rate_num,)
    sys.stdout.write(r)
    sys.stdout.flush()

start_time = time.time()  # start time

sk = socket.socket()
sk.connect(('127.0.0.1',9000))

dic_len = sk.recv(4)
dic_len = struct.unpack('i',dic_len)[0]
dic = sk.recv(dic_len)
str_dic = dic.decode('utf-8')
dic = json.loads(str_dic)

md5 = hashlib.md5()
with open(dic['filename'],'wb') as f:  # 'wb' is stricter, although 'ab' would also work
    content_size = 0
    while True:
        content = sk.recv(dic['buffer_size'])  # receive up to buffer_size bytes
        if not content:break  # connection closed early; avoid looping forever
        f.write(content)  # write to the file
        content_size += len(content)  # bytes received so far
        md5.update(content)  # update the digest

        processBar(content_size,dic['filesize'])  # update the progress bar
        if content_size == dic['filesize']:break  # stop once the total received equals the file size

# Verify after the with block, so the file is already closed if it has to be removed
md5 = md5.hexdigest()
print(md5)  # print the MD5 value
if dic['filename_md5'] == str(md5):
    print(Prompt.display('MD5 check passed -- download succeeded','green'))
else:
    print(Prompt.display('File verification failed', 'red'))
    os.remove(dic['filename'])  # delete the file

sk.close()  # close the connection

end_time = time.time()  # end time
print('This download took {} seconds'.format(end_time - start_time))

Output:

[Figure: execution result]
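Both sides feed the file to MD5 piece by piece, with different piece sizes, and still expect the same digest. The sketch below illustrates why that works: hashing a file in chunks gives the same result as hashing all of its bytes at once. The helper name file_md5 and the temporary test file are assumptions made for this example only.

```python
import hashlib
import os
import tempfile

def file_md5(path, buffer_size=1024 * 1024):
    """Hash a file chunk by chunk so it never has to fit in memory."""
    md5obj = hashlib.md5()
    with open(path, 'rb') as f:
        # iter() with a sentinel keeps calling f.read until it returns b''
        for chunk in iter(lambda: f.read(buffer_size), b''):
            md5obj.update(chunk)
    return md5obj.hexdigest()

# Write some random data to a temporary file and compare both ways of hashing
data = os.urandom(10000)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
digest_chunked = file_md5(tmp.name, buffer_size=4096)
digest_one_shot = hashlib.md5(data).hexdigest()
os.remove(tmp.name)
print(digest_chunked == digest_one_shot)
```

This is why the client may call recv with any buffer size: only the order of the bytes matters for the digest, not how they are split into update() calls.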

More socket methods

1. More methods

Server socket functions
s.bind()  bind an address (host, port) to the socket
s.listen()  start listening for TCP connections
s.accept()  passively accept a TCP client connection, blocking until one arrives

Client socket functions
s.connect()  actively initiate a connection to a TCP server
s.connect_ex()  extended version of connect() that returns an error code on failure instead of raising an exception

General-purpose socket functions
s.recv()  receive TCP data
s.send()  send TCP data
s.sendall()  send all of the given TCP data
s.recvfrom()  receive UDP data
s.sendto()  send UDP data
s.getpeername()  return the remote address the socket is connected to
s.getsockname()  return the socket's own address
s.getsockopt()  return the value of a socket option
s.setsockopt()  set the value of a socket option
s.close()  close the socket

Blocking-related socket methods
s.setblocking()  set the socket to blocking or non-blocking mode
s.settimeout()  set the timeout for blocking socket operations
s.gettimeout()  get the timeout for blocking socket operations

File-oriented socket functions
s.fileno()  return the socket's file descriptor
s.makefile()  create a file object associated with the socket
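A few of the less familiar methods from the list can be tried out in one short script. This is only an illustrative sketch: it runs server and client in the same process, and it binds to port 0 so that the operating system picks a free port (the examples in this post use port 9000 instead).

```python
import socket

server = socket.socket()
server.bind(('127.0.0.1', 0))      # port 0: let the OS choose a free port
server.listen()
host, port = server.getsockname()  # the address the server actually got

client = socket.socket()
client.settimeout(2.0)             # blocking operations give up after 2 seconds
timeout = client.gettimeout()
err = client.connect_ex((host, port))  # returns 0 on success, an errno otherwise

conn, addr = server.accept()
peer = conn.getpeername()          # remote end as seen by the server
local = client.getsockname()       # the client's own address
fd = client.fileno()               # underlying file descriptor

for s in (conn, client, server):
    s.close()

print(err, timeout, peer == local)
```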

2. The send and sendall methods

The official documentation of the socket module explains socket.send() and socket.sendall() as follows:

socket.send(string[, flags])
Send data to the socket. The socket must be connected to a remote socket. The optional flags argument has the same meaning as for recv() above. Returns the number of bytes sent. Applications are responsible for checking that all data has been sent; if only some of the data was transmitted, the application needs to attempt delivery of the remaining data.

The return value of send() is the number of bytes sent. It may be smaller than the length of the data, which means that not all of the data was necessarily sent. If an error occurs, an exception is raised.

socket.sendall(string[, flags])
Send data to the socket. The socket must be connected to a remote socket. The optional flags argument has the same meaning as for recv() above. Unlike send(), this method continues to send data from string until either all data has been sent or an error occurs. None is returned on success. On error, an exception is raised, and there is no way to determine how much data, if any, was successfully sent.

sendall() tries to send all of the data. It returns None on success and raises an exception on failure.

Therefore, the following two pieces of code are equivalent (sock is an already connected socket):

sock.sendall(b'Hello world\n')

buffer = b'Hello world\n'
while buffer:
    sent = sock.send(buffer)  # number of bytes actually sent
    buffer = buffer[sent:]    # keep only what has not been sent yet

Verifying that a client connection is legitimate

Using hashlib.md5

[Figure: verifying the legitimacy of the client]

Why use a random string? To prevent the client's data from being captured and reused. A freshly generated random bytes value cannot be predicted or worked out in advance.


import os
print(os.urandom(32))

Output (different on every run):
b'PO\xca8\xc8\xf3\xa0\xb5,\xdd\xb8K\xa8D\x9cN"\x82\x03\x86g\x18e\xa7\x97\xa77\xb9\xa5VA'

server.py


import os
import socket
import hashlib

secret_key = '老衲洗头用飘柔'  # shared secret key

sk = socket.socket()
sk.bind(('127.0.0.1',9000))
sk.listen()
while True:
    try:
        conn,addr = sk.accept()
        random_bytes = os.urandom(32)  # generate 32 random bytes (returns bytes)
        conn.send(random_bytes)  # send the random challenge
        md5 = hashlib.md5(secret_key.encode('utf-8'))  # use secret_key as the salt
        md5.update(random_bytes)  # compute the MD5 message digest
        ret = md5.hexdigest()  # digest as a hexadecimal string, 32 characters long
        msg = conn.recv(1024).decode('utf-8')  # decode the received message
        if msg == ret:print('Legitimate client')  # the received digest matches the local one, so the client is legitimate
        else:conn.close()  # close the connection
    finally:  # runs no matter what
        sk.close()  # close the listening socket
        break

client.py


import socket
import hashlib
secret_key = '老衲洗头用飘柔'  # shared secret key
sk = socket.socket()
sk.connect(('127.0.0.1',9000))

urandom = sk.recv(32)  # receive 32 bytes, the value returned by os.urandom on the server
md5_obj = hashlib.md5(secret_key.encode('utf-8'))  # use the secret key as the salt
md5_obj.update(urandom)
sk.send(md5_obj.hexdigest().encode('utf-8'))  # send the MD5 digest
print('-----')
sk.close()  # close the connection

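Run server.py first, then client.py.
Client output: -----
Server output: Legitimate client

What if 100 clients connect? They all share the same secret key.
Where is this typically used?
For example inside a company, where one machine fetches data from 100 servers.

Suppose an attacker gets into the internal network and learns a server's IP address. They can then run a port scan: a machine's ports range from 0 to 65535, so a little over 60,000 probes are enough to find the open ones.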

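1. Using hmac

hmac is made for exactly this kind of client verification.

import hmac
obj = hmac.new(key=b'secret_key',msg=b'100212002155',digestmod='md5')  # digestmod is required since Python 3.8; MD5 was the old default
print(obj.hexdigest())

Output:
27111d37764a2fe5bc79d297e7b54c35
The client computes the same hmac, and the server only has to compare the two values.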
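The comparison step can be sketched without any sockets. The helper names make_digest and verify below are hypothetical, and the use of hmac.compare_digest, which compares in constant time, is a suggestion of this example rather than something the code in this post does (it compares with ==).

```python
import hmac
import os

def make_digest(secret_key, challenge):
    """What each side computes from the shared key and the random challenge."""
    return hmac.new(secret_key, challenge, digestmod='md5').hexdigest()

def verify(secret_key, challenge, response):
    """Server-side check of the digest sent back by the client."""
    expected = make_digest(secret_key, challenge)
    # compare_digest avoids leaking, through timing, how many characters matched
    return hmac.compare_digest(expected, response)

challenge = os.urandom(32)                      # sent by the server
good = make_digest(b'secret_key', challenge)    # client that knows the key
bad = make_digest(b'wrong_key', challenge)      # client that does not
ok = verify(b'secret_key', challenge, good)
rejected = verify(b'secret_key', challenge, bad)
print(ok, rejected)
```

Because the challenge is random for every connection, a digest captured from one session is useless for the next one.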

Rework the server and the client

server.py


import os
import socket
import hmac

secret_key = '老衲洗头用飘柔'.encode('utf-8')
sk = socket.socket()
sk.bind(('127.0.0.1',9000))
sk.listen()
while True:
    try:
        conn,addr = sk.accept()
        random_bytes = os.urandom(32)
        conn.send(random_bytes)
        obj = hmac.new(key=secret_key,msg=random_bytes,digestmod='md5')  # digestmod is required since Python 3.8
        ret = obj.hexdigest()
        msg = conn.recv(1024).decode('utf-8')
        if msg == ret:print('Legitimate client')
        else:conn.close()
    finally:
        sk.close()
        break

client.py


import socket
import hmac

secret_key = '老衲洗头用飘柔'.encode('utf-8')
sk = socket.socket()
sk.connect(('127.0.0.1', 9000))

urandom = sk.recv(32)
hmac_obj = hmac.new(key=secret_key, msg=urandom, digestmod='md5')  # digestmod is required since Python 3.8
sk.send(hmac_obj.hexdigest().encode('utf-8'))
print('-----')
sk.close()

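socketserver

Internally, socketserver uses I/O multiplexing together with multithreading and multiprocessing to implement a socket server that handles requests from many clients concurrently. That is, whenever a client connects, the server creates a thread or a process dedicated to handling all requests from that client.

[Figure: socketserver]

It lets multiple clients connect at the same time, and it is built on top of socket.

A server built with ThreadingTCPServer creates one thread per client, and that thread handles the interaction with the client.

To use ThreadingTCPServer:
Create a class that inherits from socketserver.BaseRequestHandler (the inheritance is required).
Define a method named handle in that class (it must be overridden).

In the source of BaseRequestHandler, the handle method is empty: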


def handle(self):
    pass

You have to implement it yourself.
server.py


import socketserver
class MyServer(socketserver.BaseRequestHandler):
    def handle(self):
        print(self.request)
 
server = socketserver.ThreadingTCPServer(('127.0.0.1',9000),MyServer)
server.serve_forever()

client.py


import socket
sk = socket.socket()
sk.connect(('127.0.0.1',9000))
sk.close()

Run server.py first, then client.py.

Server output:
<socket.socket fd=540, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 9000), raddr=('127.0.0.1', 62243)>

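Every time a client connects, handle is triggered and prints the request. Listening, waiting for connections, and accepting them are all done for you by socketserver; you only write handle.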

server.py


import socketserver
class MyServer(socketserver.BaseRequestHandler):
    def handle(self):
        print(self.request)
        self.request.send(b'hello')  # greet every client
        print(self.request.recv(1024))  # receive the client's message
 
server = socketserver.ThreadingTCPServer(('127.0.0.1',9000),MyServer)
server.serve_forever()

client.py


import socket
sk = socket.socket()
sk.connect(('127.0.0.1',9000))
print(sk.recv(1024))
inp = input('>>>').encode('utf-8')
sk.send(inp)
sk.close()

Run server.py first, then client.py.

Client output:
b'hello'
>>>hi

Server output:
<socket.socket fd=316, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 9000), raddr=('127.0.0.1', 49176)>
b'hi'

It also works with several clients open at once.

[Figure: multiple clients]

The server can communicate with several clients at the same time.

[Figure: communication with multiple clients]

1. Continuous transmission

Client that sends continuously:


import socket
sk = socket.socket()
sk.connect(('127.0.0.1',9000))
while True:
    print(sk.recv(1024))
    #inp = input('>>>').encode('utf-8')
    sk.send(b'hahaha')
sk.close()

Server that receives continuously:


import socketserver
class MyServer(socketserver.BaseRequestHandler):
    def handle(self):
        while True:
            print(self.request)
            self.request.send(b'hello')  # greet every client
            print(self.request.recv(1024))  # receive the client's message
 
server = socketserver.ThreadingTCPServer(('127.0.0.1',9000),MyServer)
server.serve_forever()

The result is as follows:

[Figure: execution result]

If restarting the server fails because the port is still in use, add the following code:


# set allow_reuse_address so that the server may reuse the address
socketserver.TCPServer.allow_reuse_address = True

The complete code is as follows:


import socketserver
class MyServer(socketserver.BaseRequestHandler):
    def handle(self):
        while True:
            print(self.request)  # do not call input() here, or the handler will block
            self.request.send(b'hello')  # greet every client
            print(self.request.recv(1024))  # receive the client's message
if __name__ == '__main__':
    socketserver.TCPServer.allow_reuse_address = True
    server = socketserver.ThreadingTCPServer(('127.0.0.1',9000),MyServer)
    server.serve_forever()
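To see that several clients really are served at the same time without opening several terminals, the server can be started in a background thread and exercised from the same script. This is a sketch under a few assumptions: the handler EchoHandler and the helper read_n are made up for the example, the handler sends back the upper-cased message instead of printing it, and port 0 is used so the OS picks a free port.

```python
import socket
import socketserver
import threading

class EchoHandler(socketserver.BaseRequestHandler):
    def handle(self):
        self.request.sendall(b'hello')      # greet the client
        while True:
            data = self.request.recv(1024)
            if not data:                    # the client closed the connection
                break
            self.request.sendall(data.upper())

def read_n(sock, n):
    """Read exactly n bytes from sock."""
    buf = b''
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError('closed early')
        buf += chunk
    return buf

socketserver.ThreadingTCPServer.allow_reuse_address = True
server = socketserver.ThreadingTCPServer(('127.0.0.1', 0), EchoHandler)
server.daemon_threads = True                # handler threads must not block exit
threading.Thread(target=server.serve_forever, daemon=True).start()

# Three clients connected at once, each served by its own handler thread
clients = [socket.create_connection(server.server_address) for _ in range(3)]
replies = []
for i, c in enumerate(clients):
    greeting = read_n(c, 5)
    msg = 'client{}'.format(i).encode('utf-8')
    c.sendall(msg)
    replies.append((greeting, read_n(c, len(msg))))
for c in clients:
    c.close()

server.shutdown()       # stop serve_forever
server.server_close()   # release the listening socket
print(replies)
```

Unlike the handlers in this post, this one leaves its loop when recv returns an empty bytes object, which is how a closed connection shows up on the server side.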

Code to write from memory tomorrow:

import socketserver
class MyServer(socketserver.BaseRequestHandler):
   def handle(self):
       self.request.send(b'hello')
       msg = self.request.recv(1024)
       print(msg)

if __name__ == '__main__':
   socketserver.TCPServer.allow_reuse_address = True
   server = socketserver.ThreadingTCPServer(('127.0.0.1',9000),MyServer)
   server.serve_forever()

Origin blog.csdn.net/weixin_34148508/article/details/90883490