Gunicorn, the Green Unicorn Every Programmer Should Get to Know

Introduction

Continuing from the previous article, today I will talk about the first step in Python back-end traffic processing: receiving data packets. Receiving packets is generally the job of a web server. Common choices for Python are nginx and httpd, and a WSGI container can also serve HTTP on its own. Taking scalability and availability into account, production deployments commonly use Nginx + Gunicorn (or uWSGI), and the two usually communicate over a Unix socket.

Today's focus is the WSGI container Gunicorn.

Background knowledge

pre-forking

  • "Pre" means that multiple worker child processes are forked before any requests arrive;

  • The workers process the requests, and the master manages the workers' life cycle.
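
To make the two points above concrete, here is a minimal pre-fork sketch using only the standard library. It assumes a Unix-like system where `os.fork()` is available. The names `prefork_serve` and `ask` are made up for this illustration, and this is not how Gunicorn itself is written. The master opens the listening socket, forks the workers before any client connects, and each worker then calls `accept()` on the shared socket.

```python
import os
import socket


def prefork_serve(num_workers=2, requests_per_worker=1):
    """Fork the workers up front; each one accepts on the shared listening socket."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(16)
    port = listener.getsockname()[1]

    worker_pids = []
    for _ in range(num_workers):
        pid = os.fork()
        if pid == 0:  # child process: this is a worker
            try:
                for _ in range(requests_per_worker):
                    conn, _addr = listener.accept()
                    conn.recv(1024)
                    conn.sendall(f"handled by worker {os.getpid()}".encode())
                    conn.close()
            finally:
                os._exit(0)  # never fall back into the master's code
        worker_pids.append(pid)  # parent process: the master only tracks workers

    listener.close()  # in this sketch the master never accepts connections itself
    return port, worker_pids


def ask(port):
    """A tiny client: connect, send one message, return the reply."""
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(b"ping")
        return client.recv(1024).decode()
```

Calling `prefork_serve()` and then `ask(port)` once per worker returns replies that carry the PIDs of the workers, not of the master. The master should call `os.waitpid()` on each worker afterwards so that no zombie processes are left behind.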

WSGI

WSGI stands for Python Web Server Gateway Interface. It specifies a standard interface between a web server and a Python web application or framework, so that web applications are portable across web servers.

From this definition, a few points follow:

  • WSGI is an interface standard (a protocol/specification);

  • It sits between the web server and the Python web application;

  • Its purpose is to standardize the interface so that different web servers can talk to different Python programs, for example nginx + gunicorn + flask or nginx + uwsgi + django. Any component that implements WSGI can be swapped for another.
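
As a rough illustration of that interface, the sketch below shows a minimal WSGI application and a helper that plays the server's role. The names `app` and `call_app` are invented for this example. `setup_testing_defaults` comes from the standard library's `wsgiref.util` and only fills in placeholder values for the environ.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """A minimal WSGI application: a callable taking environ and start_response."""
    body = f"Hello from {environ['PATH_INFO']}".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]  # WSGI requires an iterable of bytes


def call_app(application, path):
    """Play the server's role: build environ, call the app, collect the response."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining required keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(application(environ, start_response))
    return captured["status"], captured["headers"], body
```

Because `app` only depends on the WSGI calling convention, the same callable could be served by Gunicorn, uWSGI, or the reference server in `wsgiref` without changes.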

socket

Conceptually, a socket is similar to a file descriptor: it is an abstract handle to a resource, in this case the port a process uses for communication.

1. After a program starts, a PID file identifies (and locks) the running program.
2. After a file is opened, the process holds an fd that identifies it.
3. When a process occupies a port and communicates over the associated protocol, that resource is identified by a socket.

So a socket is a resource identifier: a triple (protocol, port number, IP address).

Socket programming means wrapping the sending and receiving of data around that resource (the triple). It is like working with a file object after opening a file, for example calling f.close().

The essential functions for socket programming are:

· socket(): create a socket.

· bind(): bind an address, that is, specify which resource the socket is responsible for.

· listen(): start listening.

· accept(): accept a connection.

· connect(): establish a connection.

· recv()/send(): receive and send data.

Client: create socket -> connect() to establish the connection -> close().

Server: create socket -> bind() -> listen() -> accept() to receive the request -> close().
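
The client and server sequences above can be sketched with Python's standard `socket` module. The helper names `run_server` and `echo_once` are made up for this example. The server runs in a thread only so that both sides fit in one script, and the upper-casing is just a visible stand-in for "processing the request".

```python
import socket
import threading


def run_server(listener):
    """Server side: accept() one connection, reply in upper case, then close()."""
    conn, _addr = listener.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def echo_once(message):
    # Server: socket() -> bind() -> listen()
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=run_server, args=(listener,))
    server.start()

    # Client: socket() -> connect() -> send()/recv() -> close()
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(("127.0.0.1", port))
    client.sendall(message)
    reply = client.recv(1024)
    client.close()

    server.join()
    listener.close()
    return reply
```

With this sketch, `echo_once(b"hello")` returns `b"HELLO"`.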

What is Gunicorn

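Gunicorn ("Green Unicorn") is a Python WSGI HTTP server for UNIX. It uses a pre-fork worker model ported from Ruby's Unicorn project. The Gunicorn server is broadly compatible with various web frameworks, simply implemented, and light on server resources.

Gunicorn implements the WSGI protocol, so it can act as the link between the web server and the application framework, converting HTTP messages into the format that WSGI specifies.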

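How to use it in a project

  • Command-line mode

gunicorn -h lists many parameters, and interested readers can dig into them. This article mainly covers how Gunicorn works internally.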

  • Configuration file mode (a supervisor program entry):
[program:gunicorn_demo]
process_name=%(program_name)s
numprocs=1
priority=901
directory = /opt/gunicorn_demo/
command = /opt/virtualenv/bin/python /opt/virtualenv/bin/gunicorn -c gunicorn_demo.py gunicorn_demo:app
autostart = true
startsecs = 20
autorestart = true
startretries = 3
user = root
redirect_stderr = true
stdout_logfile_maxbytes = 20MB
stdout_logfile_backups = 10
stdout_logfile = /dev/null
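
The command above loads its settings with `-c gunicorn_demo.py`, but the article does not show that file. As a sketch under stated assumptions, it might look like the following. `bind`, `workers`, `worker_class` and `timeout` are real Gunicorn settings, while the values and the socket path are hypothetical.

```python
# gunicorn_demo.py (hypothetical contents; the article does not show this file)
import multiprocessing

# Listen on a Unix socket that Nginx can proxy to (the path is an assumption).
bind = "unix:/tmp/gunicorn_demo.sock"

# A commonly cited starting point for the worker count: (2 x CPU cores) + 1.
workers = multiprocessing.cpu_count() * 2 + 1

# Worker model: "sync" is the default; "gevent" would select an async worker.
worker_class = "sync"

# Seconds a worker may stay silent before the master kills and restarts it.
timeout = 30
```

A Gunicorn config file is plain Python, so values such as the worker count can be computed when the file is loaded.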

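How it works

The figure below shows Gunicorn's runtime logic. From it we can see that Gunicorn mainly does the following:

  • Starts the framework application;

  • Receives data packets, converts them into WSGI format, and passes them to the framework;

  • Provides the concurrency models and manages processes.

[Figure: Gunicorn runtime logic diagram]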
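
To give a feel for the second point, the sketch below turns raw HTTP request bytes into a simplified WSGI `environ` and serializes the application's response back into HTTP. It is a teaching sketch, not Gunicorn's parser. `build_environ` and `handle` are made-up names, and a real server fills in more keys (for example `SERVER_NAME`, `SERVER_PORT` and `wsgi.errors`) and handles many edge cases that are ignored here.

```python
import io


def build_environ(raw_request):
    """Turn raw HTTP request bytes into a (simplified) WSGI environ dict."""
    head, _, body = raw_request.partition(b"\r\n\r\n")
    request_line, *header_lines = head.decode("latin-1").split("\r\n")
    method, target, protocol = request_line.split(" ")
    path, _, query = target.partition("?")

    environ = {
        "REQUEST_METHOD": method,
        "PATH_INFO": path,
        "QUERY_STRING": query,
        "SERVER_PROTOCOL": protocol,
        "wsgi.version": (1, 0),
        "wsgi.url_scheme": "http",
        "wsgi.input": io.BytesIO(body),
    }
    for line in header_lines:
        name, _, value = line.partition(":")
        key = name.strip().upper().replace("-", "_")
        # WSGI keeps these two headers unprefixed; all others get HTTP_.
        if key not in ("CONTENT_TYPE", "CONTENT_LENGTH"):
            key = "HTTP_" + key
        environ[key] = value.strip()
    return environ


def handle(raw_request, app):
    """Call the WSGI app and serialize its response back into HTTP bytes."""
    response = {}

    def start_response(status, headers, exc_info=None):
        response["status"], response["headers"] = status, headers

    body = b"".join(app(build_environ(raw_request), start_response))
    head = f"HTTP/1.1 {response['status']}\r\n"
    head += "".join(f"{name}: {value}\r\n" for name, value in response["headers"])
    return head.encode("latin-1") + b"\r\n" + body
```

The framework only ever sees the `environ` dict and the `start_response` callable, which is why the server in front of it can be replaced.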

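In Gunicorn's code, the most important logic lives in the Arbiter class. In computing, an arbiter is a bus controller. As the name suggests, this class is central: Gunicorn's main functionality revolves around it.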
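
The core idea of such a master process can be sketched as follows: keep a fixed number of workers alive, notice when one dies, and spawn a replacement. This is a toy model under stated assumptions, not Gunicorn's actual Arbiter. `MiniArbiter` and its methods are hypothetical names, the workers only sleep instead of serving requests, and a Unix-like system with `os.fork()` is assumed.

```python
import os
import signal
import time


class MiniArbiter:
    """A toy master process that keeps a fixed number of workers alive."""

    def __init__(self, num_workers):
        self.num_workers = num_workers
        self.workers = set()  # PIDs of the workers believed to be alive

    def spawn_worker(self):
        pid = os.fork()
        if pid == 0:  # child process: pretend to serve requests until killed
            try:
                signal.signal(signal.SIGTERM, signal.SIG_DFL)
                while True:
                    time.sleep(1)
            finally:
                os._exit(0)  # never fall back into the master's code
        self.workers.add(pid)

    def reap_workers(self):
        """Forget workers that have exited, without blocking the master."""
        for pid in list(self.workers):
            exited_pid, _status = os.waitpid(pid, os.WNOHANG)
            if exited_pid == pid:
                self.workers.discard(pid)

    def manage_workers(self):
        """Spawn workers until the configured count is reached again."""
        while len(self.workers) < self.num_workers:
            self.spawn_worker()

    def stop(self):
        """Terminate all workers and wait for them to exit."""
        for pid in list(self.workers):
            os.kill(pid, signal.SIGTERM)
            os.waitpid(pid, 0)
            self.workers.discard(pid)
```

A master loop would call `reap_workers()` and `manage_workers()` repeatedly, so that a crashed worker is replaced shortly after it dies.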

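Concurrency models provided by Gunicorn (source path: gunicorn/workers)

Sync workers (sync.py)

The figure below is the flowchart of a sync worker. When a connection is accepted, it is set to blocking mode.

[Figure: sync worker flowchart]

The concurrency model is as follows:

[Figure: sync worker concurrency model]

A sync worker is one process with one thread. The thread handles requests in order, so later requests have to wait.

Sync workers suit scenarios where traffic is low and the work is CPU-bound rather than I/O-bound.

The benefit is that if a worker process crashes, only one request is affected.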
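
The "later requests have to wait" behaviour can be shown with a small simulation. This is only a model of the idea, not Gunicorn's sync.py. `handle` and `sync_worker` are made-up names, and `time.sleep` stands in for whatever blocking work a request does.

```python
import time


def handle(request_id, work_seconds):
    """Pretend to serve one request; the sleep stands in for blocking work."""
    time.sleep(work_seconds)
    return f"response {request_id}"


def sync_worker(requests):
    """Serve queued requests strictly in arrival order, one at a time."""
    served = []
    start = time.monotonic()
    for request_id, work_seconds in requests:
        response = handle(request_id, work_seconds)
        # Record when each response was ready, measured from the start.
        served.append((response, time.monotonic() - start))
    return served
```

With three requests of 0.05 seconds each, the third response is ready only after roughly 0.15 seconds, because it waited for the two requests ahead of it.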

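Async workers

ggevent.py, geventlet.py

The async concurrency model is shown below. An async worker also has only one thread, but it can handle different requests concurrently without blocking the requests that come later.

[Figure: async worker concurrency model]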

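How does an async worker achieve concurrency, so that a single worker can handle many requests at the same time?

Take Gevent as an example. Each request's connection is a greenlet coroutine. Gevent has only one thread and can execute only one request at any given instant, but when that request's asynchronous event is not ready and it enters an I/O wait, it voluntarily yields control and lets other coroutines run instead of blocking them. When its own I/O is ready, the event loop resumes it from the point where it yielded.

In this way, Gevent keeps switching between requests and thereby achieves concurrency, making full use of the CPU and reducing time lost to I/O waits. Moreover, what gets switched is a greenlet micro-thread, which operates at the level of functions rather than threads or processes, so the cost of switching back and forth is comparatively small. Generally speaking, most web apps are bound by external I/O (they constantly access a database, third-party services, and so on), so using one of Gunicorn's async workers (such as Gevent) makes good sense. If your web app is CPU-bound, or you want requests not to affect one another, you can choose Gunicorn's sync worker instead.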
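
The switching described above can be imitated with plain Python generators. To be clear, this is a stand-in and not gevent's API: generators play the role of greenlets, `handle_request` and `event_loop` are made-up names, and the toy loop simply resumes tasks round-robin, whereas a real event loop resumes a greenlet only when its I/O is ready.

```python
from collections import deque


def handle_request(name, io_waits, log):
    """A request handler that yields whenever it would wait on I/O."""
    for step in range(io_waits):
        log.append(f"{name}: start I/O {step}")
        yield  # give up control instead of blocking the thread
    log.append(f"{name}: done")


def event_loop(tasks):
    """Round-robin over the tasks, resuming each where it last yielded."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # run the task until its next yield
            ready.append(task)  # it is waiting on I/O: requeue it
        except StopIteration:
            pass                # the task has finished
```

Running request A (two I/O waits) together with request B (one I/O wait) interleaves them on a single thread: A starts its first I/O, B starts its I/O, A starts its second I/O, B finishes, then A finishes.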

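Three concepts come up here: greenlet, eventlet, and gevent.

greenlet is a coroutine implementation for Python. You can think of it as a micro-thread.

Gevent is a Python networking library that implements an asynchronous model using greenlet coroutines plus the fast libev event loop. Readers who are interested can look into gevent's monkey patching.

eventlet is another networking library built on greenlet.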

Tornado Workers

These workers are used together with Tornado. Tornado is a Python web framework and networking library that provides an asynchronous, non-blocking I/O model for handling long-latency requests.

Tornado is also a well-known framework, and the author also plans to write a series of Tornado source code articles.

 

Multithreaded Workers

Each request is handled by a thread drawn from a thread pool. I will not go into detail here. A later series of articles will cover thread pools and process pools.
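
As a rough sketch of the thread pool idea (not Gunicorn's threaded worker), the standard library's `concurrent.futures.ThreadPoolExecutor` can hand each request to a pooled thread. `handle` and `threaded_worker` are made-up names, and the sleep stands in for I/O.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def handle(request_id):
    """Pretend to serve one request and report which thread handled it."""
    time.sleep(0.05)  # stands in for I/O such as a database call
    return request_id, threading.current_thread().name


def threaded_worker(request_ids, num_threads=4):
    """Serve requests with a fixed-size pool of threads inside one process."""
    with ThreadPoolExecutor(max_workers=num_threads,
                            thread_name_prefix="worker") as pool:
        # map() returns results in request order, whichever thread ran them.
        return list(pool.map(handle, request_ids))
```

With eight requests and four threads, no more than four distinct thread names appear in the results, because threads are reused instead of being created per request.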

Origin blog.51cto.com/13176208/2677775