How to Build a Simple Web Server with Sockets

Background

Network applications are everywhere today. Whether we are browsing the Web, sending e-mail, or playing online games, we depend on them, so network programming is becoming increasingly important.

Goals

Understand the core idea of a web server, then build a tiny web server of our own that can serve simple static pages.

Final result

The complete code example can be viewed here

Demo

How to Run

python3 index.py

Note

We assume you are already familiar with Python system I/O, network programming, and the HTTP protocol. If you are not, you can click here for a Python tutorial and click here to learn about the HTTP protocol. This article is written for Python 3.7.2.

TinyWeb implementation

First, here is the basic structure of the TinyWebServer:

import socket

# Create the socket
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Bind the address and port
server.bind(("127.0.0.1", 3000))
server.listen(5)

while True:
    # Wait for a client request
    client, addr = server.accept()
    # Handle the request
    process_request(client, addr)

The code above is the core logic: the server socket waits for client requests, and each request is handled as soon as it is received.
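
To see the accept loop in action without a browser, here is a minimal sketch that plays both roles in one script. The names serve_once and demo are hypothetical, the reply is hard-coded, and port 0 (any free port) is an assumption made for the demo; the article itself uses port 3000.

```python
import socket
import threading

def serve_once(server):
    """Accept a single connection, send a fixed reply, and close it."""
    client, addr = server.accept()
    with client:
        client.recv(1024)  # read (and ignore) whatever the client sent
        client.sendall(b"HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nhi")

def demo():
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow quick restarts without "Address already in use" errors
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Port 0 asks the OS for any free port (demo assumption)
    server.bind(("127.0.0.1", 0))
    server.listen(5)
    port = server.getsockname()[1]

    worker = threading.Thread(target=serve_once, args=(server,))
    worker.start()

    # Act as the client: connect, send a request, read the whole reply
    with socket.create_connection(("127.0.0.1", port)) as conn:
        conn.sendall(b"GET / HTTP/1.0\r\n\r\n")
        reply = b""
        while True:
            chunk = conn.recv(1024)
            if not chunk:
                break
            reply += chunk

    worker.join()
    server.close()
    return reply
```

Calling demo() returns the raw bytes the client received, which makes it easy to inspect exactly what a server puts on the wire.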

Next, our main job is to implement the process_request function. From the HTTP protocol we know that an HTTP request consists of four main parts: a request line, request headers, a blank line, and a request body. We can therefore abstract process_request into the following steps:

Read request line ---> read request headers ---> read request body ---> process request ---> close connection
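
To make the four parts concrete, the sketch below splits an in-memory request into its pieces. The helper split_request and the sample request are hypothetical and only illustrate the layout; the server in this article reads from the socket line by line instead.

```python
def split_request(raw):
    """Split a raw HTTP request into (request_line, header_lines, body)."""
    # The first blank line (\r\n\r\n) separates the head from the body
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    return lines[0], lines[1:], body

raw = (b"POST /submit HTTP/1.1\r\n"
       b"Host: 127.0.0.1:3000\r\n"
       b"Content-Length: 5\r\n"
       b"\r\n"
       b"hello")
request_line, header_lines, body = split_request(raw)
```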

The Python code for process_request is as follows:

def process_request(client, addr):
    try:
        # Read the request line
        request_line = read_request_line(client)
        # Read the request headers
        request_headers = read_request_headers(client)
        # Read the request body
        request_body = read_request_body(
            client, request_headers[b"content-length"])
        # Handle the client request
        do_it(client, request_line, request_headers, request_body)
    except BaseException as error:
        # Error handling
        handle_error(client, error)
    finally:
        # Close the client connection
        client.close()

Why don't we parse the blank line separately? Because the blank line only marks the end of the HTTP request head and has no other meaning for us. As for how to parse an HTTP message, let's first look at the HTTP message structure:

[Figure: HTTP message structure]

As the message structure above shows, the key step in parsing an HTTP message is reading lines from the socket. We keep reading until we encounter \r\n, which gives us one complete line from the socket:

def read_line(socket):
    recv_buffer = b''
    while True:
        recv_buffer += recv(socket, 1)
        if recv_buffer.endswith(b"\r\n"):
            break
    return recv_buffer
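
Reading one byte per recv call is easy to follow, but it costs one system call per byte. As a possible alternative (not what this article's code does), the standard library's socket.makefile wraps a socket in a buffered file object whose readline does the same job. The sketch below uses socket.socketpair as a stand-in for a real client connection.

```python
import socket

# socketpair() gives two connected sockets, which stands in for a real
# client/server connection in this demo
client_side, server_side = socket.socketpair()
client_side.sendall(b"GET /index.html HTTP/1.1\r\nHost: localhost\r\n\r\n")

# makefile("rb") wraps the socket in a buffered binary file object
reader = server_side.makefile("rb")
first_line = reader.readline()   # the request line, including \r\n
second_line = reader.readline()  # the first header line
blank_line = reader.readline()   # the bare \r\n that ends the headers

reader.close()
client_side.close()
server_side.close()
```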

The recv used above is just a thin wrapper around socket.recv. The code is as follows:

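def recv(socket, count):
    if count > 0:
        recv_buffer = socket.recv(count)
        # An empty result means the peer has closed the connection
        if recv_buffer == b"":
            raise TinyWebException("socket.recv failed!")
        return recv_buffer
    # Nothing to read when count is 0 or negative
    return b""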

In this wrapper we mainly handle two cases: socket.recv returning an empty result, which we treat as an error, and count not being greater than 0. We also define our own TinyWebException to represent our errors. The TinyWebException code is as follows:

# Derive from Exception (not BaseException), as recommended for
# application-defined errors
class TinyWebException(Exception):
    pass
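
The empty-bytes check in the recv wrapper relies on how stream sockets signal a closed connection. The short sketch below, which uses socket.socketpair purely as a stand-in for a client connection, shows that data sent before the close is still delivered and that the next recv returns b"".

```python
import socket

a, b = socket.socketpair()
a.sendall(b"ping")
a.close()                 # the peer closes its end of the connection

first = b.recv(4)         # data sent before the close is still delivered
second = b.recv(4)        # afterwards recv returns b"" to signal end of stream
b.close()
```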

Parsing the request line:

From the structure above, parsing the request line only requires reading the first line of the request data and splitting it on spaces. The code is as follows:

def read_request_line(socket):
    """
    Read the HTTP request line.
    """
    # Read a line, strip the trailing \r\n, and split on spaces
    values = read_line(socket).replace(b"\r\n", b"").split(b" ")
    return {
        # Request method
        b'method': values[0],
        # Request path
        b'path': values[1],
        # Protocol version
        b'protocol': values[2]
    }
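
The function above assumes a well-formed request line. As a small sketch of how the split could be validated, the hypothetical helper parse_request_line below works on bytes that have already been read and raises ValueError (this sketch's choice of exception) when the line does not have exactly three parts.

```python
def parse_request_line(line):
    """Split a request line into its method, path, and protocol parts."""
    parts = line.rstrip(b"\r\n").split(b" ")
    if len(parts) != 3:
        # ValueError is this sketch's choice; indexing a short list
        # directly would fail with an IndexError instead
        raise ValueError("malformed request line: %r" % line)
    method, path, protocol = parts
    return {b"method": method, b"path": path, b"protocol": protocol}

result = parse_request_line(b"GET /index.html HTTP/1.1\r\n")
```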

Parsing the request headers:

Parsing the request headers is slightly more complicated: we keep reading lines until we encounter a line that contains only \r\n. The code is as follows:

def read_request_headers(socket):
    """
    Read the HTTP request headers.
    """
    headers = dict()
    line = read_line(socket)
    while line != b"\r\n":
        # Split on the first colon only, because values may contain colons
        key, _, value = line.replace(b"\r\n", b"").partition(b":")
        # Normalize header names to lowercase so later lookups are easier
        key = key.strip().lower()
        value = value.strip()
        if key == b"content-length":
            # Convert content-length to an integer so it can be used to read the body
            headers[key] = bytesToInt(value)
        else:
            headers[key] = value
        line = read_line(socket)
    # If the headers contain no content-length, set it to 0 explicitly
    if b"content-length" not in headers:
        headers[b"content-length"] = 0
    return headers
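
To show what the parsed result looks like, here is a sketch that applies the same idea to header lines already held in memory. The helper parse_header_line and the sample lines are hypothetical; it splits on the first colon only, so a value such as a host with a port stays intact.

```python
def parse_header_line(line):
    """Parse one header line into a (lower-cased name, value) pair."""
    # Split on the first colon only, since values may contain colons too
    name, _, value = line.rstrip(b"\r\n").partition(b":")
    return name.strip().lower(), value.strip()

lines = [b"Host: 127.0.0.1:3000\r\n", b"Content-Length:12\r\n"]
headers = dict(parse_header_line(line) for line in lines)
# Fall back to 0 when the header is absent, as the article's code does
content_length = int(headers.get(b"content-length", b"0"))
```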

Parsing the request body:

Reading the request body is relatively simple: we just keep reading until content-length bytes have been received.

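def read_request_body(socket, content_length):
    """
    Read the HTTP request body.
    """
    # socket.recv may return fewer bytes than requested, so keep
    # reading until content_length bytes have arrived
    body = b""
    while len(body) < content_length:
        body += recv(socket, content_length - len(body))
    return body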
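
One detail worth illustrating: socket.recv(n) returns at most n bytes, so a body that arrives in several pieces needs a loop. The sketch below uses a hypothetical helper recv_exact, with socket.socketpair standing in for a client that sends the body in two pieces.

```python
import socket

def recv_exact(sock, count):
    """Read exactly count bytes, looping because recv may return fewer."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(remaining)
        if chunk == b"":
            raise ConnectionError("connection closed before body was complete")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

sender, receiver = socket.socketpair()
sender.sendall(b"hello ")   # the body is sent in two separate pieces
sender.sendall(b"world")
body = recv_exact(receiver, 11)
sender.close()
receiver.close()
```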

With HTTP parsing complete, we need to implement the core function do_it, which processes the request based on the parsed data. As mentioned above, the tiny web server serves static resources. To read a resource we first have to locate it, and the location is derived mainly from the request path. To parse the path we use the urlparse function from the urllib.parse module. Once we have resolved the path to a specific resource, we write the response directly to the browser. Before showing the code, a brief word on the format of an HTTP response message. An HTTP response also consists of four parts: a status line, response headers, a blank line, and a response body. A simple example:
[Figure: HTTP response]
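
The four parts of a response can also be shown in code. The sketch below assembles a response as bytes; build_response is a hypothetical helper for illustration, not part of the server in this article.

```python
def build_response(status_code, reason, headers, body):
    """Assemble status line, headers, blank line, and body into bytes."""
    lines = [b"HTTP/1.0 %d %s" % (status_code, reason)]
    for name, value in headers:
        lines.append(name + b": " + value)
    # The empty line after the headers marks where the body begins
    head = b"\r\n".join(lines) + b"\r\n\r\n"
    return head + body

body = b"<h1>Hello</h1>"
response = build_response(200, b"OK", [
    (b"Content-Type", b"text/html; charset=UTF-8"),
    (b"Content-Length", b"%d" % len(body)),
], body)
```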

import os
from urllib.parse import urlparse

def do_it(socket, request_line, request_headers, request_body):
    """
    Handle the HTTP request.
    """
    # Build the target path of the static resource; all static files
    # live in the static directory
    parse_result = urlparse(request_line[b"path"])
    current_dir = os.path.dirname(os.path.realpath(__file__))
    file_path = os.path.join(current_dir, "static" +
                             parse_result.path.decode(encoding="utf-8"))

    # If the static resource exists, serve the file to the client
    if os.path.exists(file_path):
        serve_static(socket, file_path)
    else:
        # The static file does not exist, so show the client the 404 page
        serve_static(socket, os.path.join(current_dir, "static/404.html"))
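
Because the file path is built from the request path, a path containing ".." could point outside the static directory. The sketch below shows one possible guard; resolve_static_path is a hypothetical helper, and the check shown is an assumption about how one might harden the lookup, not something the code above does.

```python
import os

def resolve_static_path(static_root, url_path):
    """Map a URL path to a file under static_root, or None if it escapes."""
    root = os.path.realpath(static_root)
    # lstrip("/") keeps os.path.join from treating the path as absolute
    candidate = os.path.realpath(os.path.join(root, url_path.lstrip("/")))
    # Accept only paths that are still inside the static directory
    if os.path.commonpath([root, candidate]) != root:
        return None
    return candidate

static_root = os.path.join(os.getcwd(), "static")
inside = resolve_static_path(static_root, "/index.html")
outside = resolve_static_path(static_root, "/../secret.txt")  # None
```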

The core of do_it is serve_static, which reads the static file and returns it to the client in HTTP response format. The main code of serve_static is as follows:

def serve_static(socket, path):
    # Check that we have read permission and that path refers to a file
    if os.access(path, os.R_OK) and os.path.isfile(path):
        # File type
        content_type = static_type(path)
        # File size
        content_length = os.stat(path).st_size
        # Assemble the HTTP response
        response_headers = b"HTTP/1.0 200 OK\r\n"
        response_headers += b"Server: Tiny Web Server\r\n"
        response_headers += b"Connection: close\r\n"
        response_headers += b"Content-Type: " + content_type + b"\r\n"
        response_headers += b"Content-Length: %d\r\n" % content_length
        response_headers += b"\r\n"
        # Send the HTTP response headers; sendall keeps sending until
        # everything is written, whereas send may stop part-way
        socket.sendall(response_headers)
        # Read the file in binary mode
        with open(path, "rb") as f:
            # Send the HTTP message body
            socket.sendall(f.read())
    else:
        raise TinyWebException("access denied")
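
Reading the whole file with f.read() is fine for small pages, but it loads the entire file into memory. As a sketch of one alternative, the hypothetical helper send_file below streams the file in fixed-size chunks; the chunk size is an arbitrary assumption, and the demo uses a temporary file and socket.socketpair in place of a real client.

```python
import os
import socket
import tempfile

def send_file(sock, path, chunk_size=64 * 1024):
    """Send a file over the socket in fixed-size chunks."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # sendall keeps sending until the whole chunk is written
            sock.sendall(chunk)

# Demo with a small temporary file and a connected socket pair
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"<h1>Hello</h1>")
sender, receiver = socket.socketpair()
send_file(sender, tmp.name)
sender.close()
received = receiver.recv(1024)
receiver.close()
os.remove(tmp.name)
```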

In serve_static, we first check that we have read permission for the file and that the specified resource is a file rather than a directory. If it is not a valid file, we report that access is denied. Next we need to know the file format, because the client uses it to decide how to handle the resource, and the file size, which determines Content-Length. We infer the Content-Type simply from the file extension, using a separate static_type function, and obtain the file size from Python's os.stat. Finally, we assemble all this information into an HTTP response.

def static_type(path):
    # Map the file extension to a Content-Type; a charset only applies
    # to text formats, so image types carry none
    if path.endswith(".html"):
        return b"text/html; charset=UTF-8"
    elif path.endswith(".png"):
        return b"image/png"
    elif path.endswith(".jpg"):
        # The registered MIME type for .jpg files is image/jpeg
        return b"image/jpeg"
    elif path.endswith(".jpeg"):
        return b"image/jpeg"
    elif path.endswith(".gif"):
        return b"image/gif"
    elif path.endswith(".js"):
        return b"application/javascript; charset=UTF-8"
    elif path.endswith(".css"):
        return b"text/css; charset=UTF-8"
    else:
        return b"text/plain; charset=UTF-8"
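
A hand-written extension table works for a handful of types. As a possible alternative, Python's standard mimetypes module can do the lookup. The sketch below is an assumption about how it could be wired in: guess_content_type is a hypothetical helper, and its fallback of application/octet-stream differs from the text/plain default used above.

```python
import mimetypes

def guess_content_type(path, default="application/octet-stream"):
    """Guess a Content-Type header value from the file extension."""
    content_type, _encoding = mimetypes.guess_type(path)
    if content_type is None:
        content_type = default
    # A charset only makes sense for text formats
    if content_type.startswith("text/"):
        content_type += "; charset=UTF-8"
    return content_type.encode("ascii")
```

For example, guess_content_type("index.html") yields a text/html value with a charset, while an image path yields the bare image type.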

Complete tiny web server code


#!/usr/bin/env python3
# -*- coding: UTF-8 -*-

import socket
from urllib.parse import urlparse
import os

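# Derive from Exception (not BaseException), as recommended for
# application-defined errors
class TinyWebException(Exception):
    pass

def recv(socket, count):
    if count > 0:
        recv_buffer = socket.recv(count)
        # An empty result means the peer has closed the connection
        if recv_buffer == b"":
            raise TinyWebException("socket.recv failed!")
        return recv_buffer
    # Nothing to read when count is 0 or negative
    return b""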

def read_line(socket):
    recv_buffer = b''
    while True:
        recv_buffer += recv(socket, 1)
        if recv_buffer.endswith(b"\r\n"):
            break
    return recv_buffer

def read_request_line(socket):
    """
    Read the HTTP request line.
    """
    # Read a line, strip the trailing \r\n, and split on spaces
    values = read_line(socket).replace(b"\r\n", b"").split(b" ")
    return {
        # Request method
        b'method': values[0],
        # Request path
        b'path': values[1],
        # Protocol version
        b'protocol': values[2]
    }

def bytesToInt(bs):
    """
    Convert bytes to int.
    """
    return int(bs.decode(encoding="utf-8"))

def read_request_headers(socket):
    """
    Read the HTTP request headers.
    """
    headers = dict()
    line = read_line(socket)
    while line != b"\r\n":
        # Split on the first colon only, because values may contain colons
        key, _, value = line.replace(b"\r\n", b"").partition(b":")
        # Normalize header names to lowercase so later lookups are easier
        key = key.strip().lower()
        value = value.strip()
        if key == b"content-length":
            # Convert content-length to an integer so it can be used to read the body
            headers[key] = bytesToInt(value)
        else:
            headers[key] = value
        line = read_line(socket)
    # If the headers contain no content-length, set it to 0 explicitly
    if b"content-length" not in headers:
        headers[b"content-length"] = 0
    return headers

def read_request_body(socket, content_length):
    """
    Read the HTTP request body.
    """
    # socket.recv may return fewer bytes than requested, so keep
    # reading until content_length bytes have arrived
    body = b""
    while len(body) < content_length:
        body += recv(socket, content_length - len(body))
    return body

def send_response():
    print("send response")

def static_type(path):
    # Map the file extension to a Content-Type; a charset only applies
    # to text formats, so image types carry none
    if path.endswith(".html"):
        return b"text/html; charset=UTF-8"
    elif path.endswith(".png"):
        return b"image/png"
    elif path.endswith(".jpg"):
        # The registered MIME type for .jpg files is image/jpeg
        return b"image/jpeg"
    elif path.endswith(".jpeg"):
        return b"image/jpeg"
    elif path.endswith(".gif"):
        return b"image/gif"
    elif path.endswith(".js"):
        return b"application/javascript; charset=UTF-8"
    elif path.endswith(".css"):
        return b"text/css; charset=UTF-8"
    else:
        return b"text/plain; charset=UTF-8"

def serve_static(socket, path):
    # Check that we have read permission and that path refers to a file
    if os.access(path, os.R_OK) and os.path.isfile(path):
        # File type
        content_type = static_type(path)
        # File size
        content_length = os.stat(path).st_size
        # Assemble the HTTP response
        response_headers = b"HTTP/1.0 200 OK\r\n"
        response_headers += b"Server: Tiny Web Server\r\n"
        response_headers += b"Connection: close\r\n"
        response_headers += b"Content-Type: " + content_type + b"\r\n"
        response_headers += b"Content-Length: %d\r\n" % content_length
        response_headers += b"\r\n"
        # Send the HTTP response headers; sendall keeps sending until
        # everything is written, whereas send may stop part-way
        socket.sendall(response_headers)
        # Read the file in binary mode
        with open(path, "rb") as f:
            # Send the HTTP message body
            socket.sendall(f.read())
    else:
        raise TinyWebException("access denied")

def do_it(socket, request_line, request_headers, request_body):
    """
    Handle the HTTP request.
    """

    # Build the target path of the static resource; all static files
    # live in the static directory
    parse_result = urlparse(request_line[b"path"])
    current_dir = os.path.dirname(os.path.realpath(__file__))
    file_path = os.path.join(current_dir, "static" +
                             parse_result.path.decode(encoding="utf-8"))

    # If the static resource exists, serve the file to the client
    if os.path.exists(file_path):
        serve_static(socket, file_path)
    else:
        # The static file does not exist, so show the client the 404 page
        serve_static(socket, os.path.join(current_dir, "static/404.html"))

def handle_error(socket, error):
    print(error)
    error_message = str(error).encode("utf-8")
    # "Internal Server Error" is the standard reason phrase for status 500
    response = b"HTTP/1.0 500 Internal Server Error\r\n"
    response += b"Server: Tiny Web Server\r\n"
    response += b"Connection: close\r\n"
    response += b"Content-Type: text/html; charset=UTF-8\r\n"
    response += b"Content-Length: %d\r\n" % len(error_message)
    response += b"\r\n"
    response += error_message
    socket.sendall(response)

def process_request(client, addr):
    try:
        # Read the request line
        request_line = read_request_line(client)
        # Read the request headers
        request_headers = read_request_headers(client)
        # Read the request body
        request_body = read_request_body(
            client, request_headers[b"content-length"])
        # Handle the client request
        do_it(client, request_line, request_headers, request_body)
    except BaseException as error:
        # Report the error to the client
        handle_error(client, error)
    finally:
        # Close the client connection
        client.close()

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 3000))
server.listen(5)

print("Starting tiny web server, port = 3000")

while True:
    client, addr = server.accept()
    print("Request from: %s" % str(addr))
    # Handle the request
    process_request(client, addr)

Final remarks

The tiny web server above implements only very simple functionality. Real-world web servers are far more complex; this one merely illustrates the core idea of a web server.

Origin blog.51cto.com/14378833/2404932