Flask Source Walkthrough: From Request to Response

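I have been learning Flask for a short while and have long wanted to understand its source code. I recently read a very thorough blog post by an expert and, to my surprise, was able to follow it, so here is my own walkthrough.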

1. Before Flask, a word about WSGI:

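Once you understand the HTTP protocol and HTML documents, the essence of a web application becomes clear:

The browser sends an HTTP request;
The server receives the request and generates an HTML document;
The server sends the HTML document to the browser as the body of the HTTP response;
The browser receives the HTTP response, extracts the HTML document from the body, and displays it.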

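So the simplest web application saves the HTML in files ahead of time and uses an off-the-shelf HTTP server to accept requests, read the HTML from a file, and return it. Common static servers such as Apache, Nginx, and Lighttpd do exactly this.

To generate HTML dynamically, we would have to implement those steps ourselves. But accepting HTTP requests, parsing them, and sending HTTP responses is grunt work; if we wrote that low-level code ourselves, we would spend a month or so reading the HTTP specification before writing any dynamic HTML.

The right approach is to let dedicated server software handle the low-level code while we focus on generating HTML in Python. Since we do not want to deal with TCP connections or the raw HTTP request and response formats, we need a uniform interface that lets us concentrate on writing web business logic in Python.

That interface is WSGI: Web Server Gateway Interface.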

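2. What WSGI does

WSGI acts as the interface layer: on one side it talks to the server, on the other to the application code.

The WSGI interface is very simple: the web developer only has to implement one function to respond to HTTP requests. Here is the simplest web version of "Hello, web!":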

def application(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/html')])
    # WSGI expects an iterable of bytes (a plain str only worked on Python 2)
    return [b'<h1>Hello, web!</h1>']

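The application() function above is an HTTP handler that conforms to the WSGI standard. It takes two parameters:

environ: a dict object containing all the HTTP request information;
start_response: a function that starts the HTTP response by sending the status and headers.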
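To make the two parameters concrete, here is a minimal sketch that plays the server's role by hand. The helper call_app is a hypothetical name introduced only for this illustration; setup_testing_defaults comes from the standard library's wsgiref package and fills in the environ keys a real server would provide.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Minimal WSGI app: send status and headers, then return the body."""
    start_response('200 OK', [('Content-Type', 'text/html')])
    return [b'<h1>Hello, web!</h1>']


def call_app(app, path='/'):
    """Hypothetical helper: act as the server and capture the response."""
    environ = {'PATH_INFO': path}
    setup_testing_defaults(environ)  # fill in the remaining required WSGI keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        # A real server would write these to the socket; we just record them.
        captured['status'] = status
        captured['headers'] = headers

    body = b''.join(app(environ, start_response))
    return captured['status'], captured['headers'], body


status, headers, body = call_app(application)
print(status, headers, body)
```

A real deployment would hand application to a WSGI server instead of calling it directly; the point here is only that the server supplies both arguments.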

3. Flask and WSGI

from flask import Flask

app = Flask(__name__)

@app.route('/')
def index():
    return 'Hello World!'

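That is all it takes to create a Flask instance.

When the WSGI server calls app, it is actually invoking Flask's __call__ method; this is where the app starts working.

The source of Flask's __call__ is as follows: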

class Flask(_PackageBoundObject):

    # ... omitted ...

    def __call__(self, environ, start_response):  # __call__ method of the Flask instance
        """Shortcut for :attr:`wsgi_app`."""
        return self.wsgi_app(environ, start_response)

So when a request comes in, __call__ is invoked, but as you can see it simply delegates to the wsgi_app method, passing along environ and start_response.

Here is how wsgi_app is defined:

    def wsgi_app(self, environ, start_response):
        """The actual WSGI application.  This is not implemented in
        `__call__` so that middlewares can be applied without losing a
        reference to the class.  So instead of doing this::

            app = MyMiddleware(app)

        It's a better idea to do this instead::

            app.wsgi_app = MyMiddleware(app.wsgi_app)

        Then you still have the original application object around and
        can continue to call methods on it.

        .. versionchanged:: 0.7
           The behavior of the before and after request callbacks was changed
           under error conditions and a new callback was added that will
           always execute at the end of the request, independent on if an
           error occurred or not.  See :ref:`callbacks-and-errors`.

        :param environ: a WSGI environment
        :param start_response: a callable accepting a status code,
                               a list of headers and an optional
                               exception context to start the response
        """
        ctx = self.request_context(environ)
        ctx.push()
        error = None
        try:
            try:
                response = self.full_dispatch_request()
            except Exception as e:
                error = e
                response = self.handle_exception(e)
            return response(environ, start_response)
        finally:
            if self.should_ignore_error(error):
                error = None
            ctx.auto_pop(error)
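The docstring's advice about middleware can be illustrated with a toy model. TinyApp and HeaderMiddleware below are hypothetical classes written for this post, not Flask code; the sketch only assumes what the source above shows, namely that __call__ forwards to self.wsgi_app.

```python
class TinyApp:
    """Toy stand-in for Flask: __call__ only forwards to wsgi_app."""

    def __call__(self, environ, start_response):
        return self.wsgi_app(environ, start_response)

    def wsgi_app(self, environ, start_response):
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [b'hello']


class HeaderMiddleware:
    """Hypothetical middleware that appends one response header."""

    def __init__(self, wrapped):
        self.wrapped = wrapped

    def __call__(self, environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            headers = list(headers) + [('X-Demo', 'yes')]
            return start_response(status, headers, exc_info)
        return self.wrapped(environ, patched_start_response)


app = TinyApp()
# Wrap only the WSGI entry point; `app` itself is still a TinyApp.
app.wsgi_app = HeaderMiddleware(app.wsgi_app)

seen = {}


def record(status, headers, exc_info=None):
    seen['status'], seen['headers'] = status, headers


body = b''.join(app({}, record))
print(type(app).__name__, seen['headers'], body)
```

Had we written app = HeaderMiddleware(app) instead, the name app would point at the middleware, and methods of the original application object would no longer be reachable through it.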

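Step 1: create the request object and the request context:

The line ctx = self.request_context(environ) brings in the notions of request context and application context. Both are kept on stacks and behave last-in, first-out.

Put simply, this creates a request object together with a request_context that wraps the request information; ctx.push() then makes it the current context.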
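The stack idea can be sketched in a few lines. This is a simplified model, not Flask's implementation: ContextStack and RequestContext are hypothetical names, and the per-thread storage via threading.local is an assumption about how such a stack might be isolated between threads.

```python
import threading


class ContextStack:
    """Toy per-thread stack, loosely modelled on the idea of a context stack."""

    def __init__(self):
        self._local = threading.local()

    def push(self, ctx):
        if not hasattr(self._local, 'stack'):
            self._local.stack = []
        self._local.stack.append(ctx)

    def pop(self):
        stack = getattr(self._local, 'stack', None)
        return stack.pop() if stack else None

    @property
    def top(self):
        stack = getattr(self._local, 'stack', None)
        return stack[-1] if stack else None


class RequestContext:
    """Hypothetical context holding a very small 'request' object."""

    def __init__(self, environ):
        self.request = {'path': environ.get('PATH_INFO', '/')}


stack = ContextStack()
ctx = RequestContext({'PATH_INFO': '/hello'})
stack.push(ctx)                            # like ctx.push() in wsgi_app
current_path = stack.top.request['path']   # code deeper in the call chain reads the top
stack.pop()                                # like ctx.auto_pop(error) in the finally block
```

The design benefit is that code far down the call chain can find the current request without having it passed through every function signature.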

Step 2: request preprocessing, error handling, and turning the request into a response:

response = self.full_dispatch_request()

response is assigned the return value of full_dispatch_request(), so let's look at that method:

    def full_dispatch_request(self):
        """Dispatches the request and on top of that performs request
        pre and postprocessing as well as HTTP exception catching and
        error handling.

        .. versionadded:: 0.7
        """
        self.try_trigger_before_first_request_functions()  # run the before_first_request functions (once)
        try:
            request_started.send(self)  # emit the request_started signal
            rv = self.preprocess_request()  # run the before_request hooks
            if rv is None:
                rv = self.dispatch_request()
        except Exception as e:
            rv = self.handle_user_exception(e)
        return self.finalize_request(rv)

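Start with try_trigger_before_first_request_functions(). It runs the functions registered in before_first_request_funcs exactly once and then sets _got_first_request to True; on later requests the flag is already True, so the method returns immediately and request handling proceeds.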

    def try_trigger_before_first_request_functions(self):
        """Called before each request and will ensure that it triggers
        the :attr:`before_first_request_funcs` and only exactly once per
        application instance (which means process usually).

        :internal:
        """
        if self._got_first_request:
            return
        with self._before_request_lock:
            if self._got_first_request:
                return
            for func in self.before_first_request_funcs:
                func()
            self._got_first_request = True
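The check, lock, check-again shape of this method is a general run-once pattern. Below is a small standalone sketch of it; RunOnce and its attribute names are hypothetical and exist only for this example.

```python
import threading


class RunOnce:
    """Run registered functions exactly once, even with concurrent callers."""

    def __init__(self):
        self._done = False
        self._lock = threading.Lock()
        self.funcs = []

    def trigger(self):
        if self._done:        # fast path: no lock once setup has happened
            return
        with self._lock:
            if self._done:    # another thread finished while we waited for the lock
                return
            for func in self.funcs:
                func()
            self._done = True


calls = []
once = RunOnce()
once.funcs.append(lambda: calls.append('setup'))

threads = [threading.Thread(target=once.trigger) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The first check keeps the common case cheap, and the second check inside the lock is what prevents two threads from both running the setup functions.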

got_first_request is defined as a read-only property (not a static method); its docstring says it is set to True if the application started handling the first request.

class Flask(_PackageBoundObject):
    # ... omitted ...
 
    @property
    def got_first_request(self):
        """This attribute is set to ``True`` if the application started
        handling the first request.

        .. versionadded:: 0.8
        """
        return self._got_first_request

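Back in full_dispatch_request(), look at preprocess_request(). This is where Flask's hooks run; they work much like middleware and implement the before_request feature. If a hook returns anything other than None, that value is used as the response and dispatch_request() is skipped.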

        try:
            request_started.send(self)
            rv = self.preprocess_request()
            if rv is None:
                rv = self.dispatch_request()
        except Exception as e:
            rv = self.handle_user_exception(e)
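The short-circuit behaviour of the hooks can be modelled without Flask. This is a sketch under simplifying assumptions: the request is a plain dict passed explicitly (Flask's own hooks take no arguments and read the current request from the context), and before_request, preprocess_request, and handle are toy functions defined here.

```python
before_request_funcs = []


def before_request(func):
    """Register a hook, in the spirit of @app.before_request."""
    before_request_funcs.append(func)
    return func


def preprocess_request(request):
    """Run hooks in order; the first non-None return value wins."""
    for func in before_request_funcs:
        rv = func(request)
        if rv is not None:
            return rv
    return None


def handle(request, view):
    rv = preprocess_request(request)
    if rv is None:            # no hook intervened, so call the view
        rv = view(request)
    return rv


@before_request
def require_token(request):
    # Hypothetical check: reject requests without the expected token.
    if request.get('token') != 'secret':
        return 'forbidden'


def index(request):
    return 'Hello World!'


allowed = handle({'token': 'secret'}, index)
denied = handle({}, index)
```

This is why a before_request hook is a convenient place for checks such as authentication: returning a value is enough to stop the view from running.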

dispatch_request() does the dispatching: when a request arrives for some URL, how does the app know how to respond? Through this method.

Step 3: request dispatching with dispatch_request:

    def dispatch_request(self):
        """Does the request dispatching.  Matches the URL and returns the
        return value of the view or error handler.  This does not have to
        be a response object.  In order to convert the return value to a
        proper response object, call :func:`make_response`.

        .. versionchanged:: 0.7
           This no longer does the exception handling, this code was
           moved to the new :meth:`full_dispatch_request`.
        """
        req = _request_ctx_stack.top.request  # take the request from the top of the context stack
        if req.routing_exception is not None:
            self.raise_routing_exception(req)
        rule = req.url_rule
        # if we provide automatic options for this URL and the
        # request came with the OPTIONS method, reply automatically
        if getattr(rule, 'provide_automatic_options', False) \
           and req.method == 'OPTIONS':
            return self.make_default_options_response()
        # otherwise dispatch to the handler for that endpoint
        return self.view_functions[rule.endpoint](**req.view_args)

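This step is what ties the '/' in @app.route('/') to the index function: the matched rule carries an endpoint name, and view_functions maps that endpoint to the view. How the URL matching itself works is fairly involved, and I have not fully worked through it myself.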
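The two-level lookup (URL to endpoint, endpoint to view) can be sketched with plain dicts. This is a deliberately simplified model: the real URL map is built from rule objects and supports variable parts and HTTP methods, which this toy ignores. The names url_map, view_functions, and route are local to the example.

```python
url_map = {}         # URL path -> endpoint name
view_functions = {}  # endpoint name -> view function


def route(path):
    """Toy version of @app.route: register path and view under one endpoint."""
    def decorator(func):
        endpoint = func.__name__      # the function name serves as the endpoint
        url_map[path] = endpoint
        view_functions[endpoint] = func
        return func
    return decorator


def dispatch_request(path, **view_args):
    endpoint = url_map[path]          # a KeyError here plays the role of a 404
    return view_functions[endpoint](**view_args)


@route('/')
def index():
    return 'Hello World!'


result = dispatch_request('/')
```

The indirection through an endpoint name is what lets several URLs point at one view, and lets URLs be looked up by name rather than hard-coded.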

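Next, full_dispatch_request() passes rv to finalize_request(), which calls make_response() to turn rv into a response object; that object is what ends up assigned to response in wsgi_app.

So how does make_response() do that? Here is the source: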

    def make_response(self, rv):
        """Converts the return value from a view function to a real
        response object that is an instance of :attr:`response_class`.

        The following types are allowed for `rv`:

        .. tabularcolumns:: |p{3.5cm}|p{9.5cm}|

        ======================= ===========================================
        :attr:`response_class`  the object is returned unchanged
        :class:`str`            a response object is created with the
                                string as body
        :class:`unicode`        a response object is created with the
                                string encoded to utf-8 as body
        a WSGI function         the function is called as WSGI application
                                and buffered as response object
        :class:`tuple`          A tuple in the form ``(response, status,
                                headers)`` or ``(response, headers)``
                                where `response` is any of the
                                types defined here, `status` is a string
                                or an integer and `headers` is a list or
                                a dictionary with header values.
        ======================= ===========================================

        :param rv: the return value from the view function

        .. versionchanged:: 0.9
           Previously a tuple was interpreted as the arguments for the
           response object.
        """
        status_or_headers = headers = None
        if isinstance(rv, tuple):
            rv, status_or_headers, headers = rv + (None,) * (3 - len(rv))

        if rv is None:
            raise ValueError('View function did not return a response')

        if isinstance(status_or_headers, (dict, list)):
            headers, status_or_headers = status_or_headers, None

        if not isinstance(rv, self.response_class):
            # When we create a response object directly, we let the constructor
            # set the headers and status.  We do this because there can be
            # some extra logic involved when creating these objects with
            # specific values (like default content type selection).
            if isinstance(rv, (text_type, bytes, bytearray)):
                rv = self.response_class(rv, headers=headers,
                                         status=status_or_headers)
                headers = status_or_headers = None
            else:
                rv = self.response_class.force_type(rv, request.environ)

        if status_or_headers is not None:
            if isinstance(status_or_headers, string_types):
                rv.status = status_or_headers
            else:
                rv.status_code = status_or_headers
        if headers:
            rv.headers.extend(headers)

        return rv
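The tuple handling at the top of make_response is compact enough to be worth isolating. The sketch below copies just those lines into a standalone function; unpack_view_result is a hypothetical name, and the real method goes on to build the response object.

```python
def unpack_view_result(rv):
    """Normalise a view's return value to (body, status, headers)."""
    status_or_headers = headers = None
    if isinstance(rv, tuple):
        # Pad short tuples with None so the unpacking always sees three items.
        rv, status_or_headers, headers = rv + (None,) * (3 - len(rv))
    if isinstance(status_or_headers, (dict, list)):
        # Two-item form (response, headers): the second slot was headers.
        headers, status_or_headers = status_or_headers, None
    return rv, status_or_headers, headers


plain = unpack_view_result('ok')
with_status = unpack_view_result(('ok', 404))
with_headers = unpack_view_result(('ok', {'X-A': '1'}))
full = unpack_view_result(('ok', 201, [('X-A', '1')]))
```

This is why a view can return 'ok', ('ok', 404), ('ok', {'X-A': '1'}), or all three pieces, and each form ends up in the same three slots.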

Step 4: back inside wsgi_app:

    def wsgi_app(self, environ, start_response):
        """The actual WSGI application.  This is not implemented in
        `__call__` so that middlewares can be applied without losing a
        reference to the class.  So instead of doing this::

            app = MyMiddleware(app)

        It's a better idea to do this instead::

            app.wsgi_app = MyMiddleware(app.wsgi_app)

        Then you still have the original application object around and
        can continue to call methods on it.

        .. versionchanged:: 0.7
           The behavior of the before and after request callbacks was changed
           under error conditions and a new callback was added that will
           always execute at the end of the request, independent on if an
           error occurred or not.  See :ref:`callbacks-and-errors`.

        :param environ: a WSGI environment
        :param start_response: a callable accepting a status code,
                               a list of headers and an optional
                               exception context to start the response
        """
        ctx = self.request_context(environ)
        ctx.push()
        error = None
        try:
            try:
                response = self.full_dispatch_request()
            except Exception as e:
                error = e
                response = self.handle_exception(e)
            return response(environ, start_response)
        finally:
            if self.should_ignore_error(error):
                error = None
            ctx.auto_pop(error)

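Once response comes back from full_dispatch_request(), it is called with environ and start_response, and the result is returned to the WSGI server (Gunicorn, for example).

That completes the path from HTTP request to response.

To recap the flow: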

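Client -----> WSGI server -----> __call__ invokes wsgi_app, which creates the request object and the context -----> full_dispatch_request -----> dispatch_request routes the URL to a view function and gets its return value -----> make_response converts the view function's return value into a response_class object -----> the response object is called with environ and start_response, and the final response goes back to the server.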
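The whole chain can be condensed into a toy framework that mirrors the method names above. MiniApp and Response are hypothetical classes written for this recap, not Flask's code; contexts, hooks, and signals are left out, and routing is a plain dict lookup.

```python
class Response:
    """Toy response object that is itself a WSGI callable."""

    def __init__(self, body, status='200 OK'):
        self.body = body.encode('utf-8')
        self.status = status

    def __call__(self, environ, start_response):
        start_response(self.status, [('Content-Type', 'text/html; charset=utf-8')])
        return [self.body]


class MiniApp:
    def __init__(self):
        self.view_functions = {}

    def route(self, path):
        def decorator(func):
            self.view_functions[path] = func
            return func
        return decorator

    def __call__(self, environ, start_response):
        return self.wsgi_app(environ, start_response)

    def wsgi_app(self, environ, start_response):
        try:
            response = self.full_dispatch_request(environ)
        except Exception:
            response = Response('Internal Server Error', '500 INTERNAL SERVER ERROR')
        return response(environ, start_response)

    def full_dispatch_request(self, environ):
        rv = self.dispatch_request(environ)
        return self.make_response(rv)

    def dispatch_request(self, environ):
        view = self.view_functions.get(environ.get('PATH_INFO', '/'))
        if view is None:
            return Response('Not Found', '404 NOT FOUND')
        return view()

    def make_response(self, rv):
        return rv if isinstance(rv, Response) else Response(rv)


app = MiniApp()


@app.route('/')
def index():
    return 'Hello World!'


def get(path):
    """Hypothetical helper standing in for the WSGI server."""
    seen = {}

    def start_response(status, headers, exc_info=None):
        seen['status'] = status

    body = b''.join(app({'PATH_INFO': path}, start_response))
    return seen['status'], body


ok = get('/')
missing = get('/missing')
```

Each method here does one step of the recap, which makes it easy to see where the real Flask adds contexts, hooks, and error handling around the same skeleton.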

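Reading source code on your own really is not easy, and it takes some background to get through it, but doesn't sticking with it leave you with a deeper understanding? Keep going; this one is for every programmer who read this far.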
