Network protocols: common TCP / UDP / HTTP interview questions
What happens between entering a URL in the browser and the page being displayed
What steps take place in between?
Which network protocols are involved?
- What does each protocol do?
- DNS query -> TCP handshake -> HTTP request -> reverse proxy (Nginx) -> uWSGI / Gunicorn -> web app response -> TCP four-way wave
TCP three-way handshake
TCP three-way handshake and its state transitions. A Wireshark capture makes the process more intuitive.
TCP four-way wave (connection teardown)
TCP four-way wave and the state transitions on each side
[Figure: client states (C) and server states (S) during the four-way wave]
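The state transitions for the handshake and the teardown can be traced with a small table. This is only a sketch of the normal path (the client opens and closes first, no simultaneous close, no lost segments). The tables and the `trace` function are illustrative names, not part of any library.

```python
# Each step: (segment sent, sender, sender's new state, receiver's new state).
# A state of None means "unchanged".
HANDSHAKE = [
    ("SYN",     "client", "SYN_SENT", "SYN_RCVD"),
    ("SYN+ACK", "server", None,       "ESTABLISHED"),
    ("ACK",     "client", None,       "ESTABLISHED"),
]
TEARDOWN = [
    ("FIN", "client", "FIN_WAIT_1", "CLOSE_WAIT"),
    ("ACK", "server", None,         "FIN_WAIT_2"),
    ("FIN", "server", "LAST_ACK",   "TIME_WAIT"),
    ("ACK", "client", None,         "CLOSED"),
]


def trace(steps, client, server):
    """Apply each step and return a list of (segment, client_state, server_state)."""
    states = {"client": client, "server": server}
    history = []
    for segment, sender, sender_state, receiver_state in steps:
        receiver = "server" if sender == "client" else "client"
        if sender_state:
            states[sender] = sender_state
        if receiver_state:
            states[receiver] = receiver_state
        history.append((segment, states["client"], states["server"]))
    return history


for name, steps, start in (("handshake", HANDSHAKE, ("CLOSED", "LISTEN")),
                           ("teardown", TEARDOWN, ("ESTABLISHED", "ESTABLISHED"))):
    print(name)
    for segment, c, s in trace(steps, *start):
        print(f"  {segment:8} client={c:12} server={s}")
```

After the last ACK the client stays in TIME_WAIT for a while (2 x MSL) before it finally reaches CLOSED; the table stops before that timer expires.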
Differences between TCP and UDP
TCP vs UDP
TCP: connection-oriented, reliable, byte-stream-based
UDP: connectionless, unreliable, message-oriented (datagrams)
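A minimal sketch of the "connectionless, message-oriented" side, using two UDP sockets on the loopback interface. The port is chosen by the OS (port 0), and the variable names are only for illustration. Delivery on loopback is almost always successful, but UDP itself gives no such guarantee.

```python
import socket

# Receiver: bind a UDP socket to an ephemeral loopback port (port 0 = let the OS choose).
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)
address = receiver.getsockname()

# Sender: no connect() and no handshake is needed before sending.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"first", address)
sender.sendto(b"second", address)

# Each recvfrom() returns exactly one datagram, so message boundaries are preserved.
messages = [receiver.recvfrom(1024)[0] for _ in range(2)]
print(messages)

sender.close()
receiver.close()
```

With TCP (SOCK_STREAM) the same two sends could arrive merged into a single recv() result, because TCP delivers a byte stream rather than separate messages.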
To sum up
To retain the material: spaced repetition, retrieval practice, and interleaved practice
Feynman technique: teach the topic to someone else in simple language to test whether you truly understand it
If the explanation gets stuck, go back and relearn that knowledge point (divide and conquer)
Common HTTP interview questions
The composition of an HTTP request
What parts does an HTTP request consist of?
Request line
Request headers
Message body
The composition of an HTTP response
What parts does an HTTP response consist of?
Status line
Response headers
Response body
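The three parts of a request or response can be seen by splitting a raw message by hand. This is a simplified sketch: `parse_http_message` is a hypothetical helper, the sample messages are made up, and repeated headers simply overwrite each other.

```python
def parse_http_message(raw: bytes):
    """Split a raw HTTP/1.x message into (start_line, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")   # a blank line ends the headers
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]                        # request line or status line
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


request = (b"POST /login HTTP/1.1\r\n"
           b"Host: example.com\r\n"
           b"Content-Type: application/x-www-form-urlencoded\r\n"
           b"Content-Length: 9\r\n"
           b"\r\n"
           b"user=demo")
response = (b"HTTP/1.1 200 OK\r\n"
            b"Content-Type: text/plain\r\n"
            b"Content-Length: 2\r\n"
            b"\r\n"
            b"ok")

req_line, req_headers, req_body = parse_http_message(request)
status_line, resp_headers, resp_body = parse_http_message(response)
print(req_line, req_headers, req_body)
print(status_line, resp_headers, resp_body)
```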
Common HTTP status codes
1xx informational: the server has received the request and the client should continue
2xx success: the request was received and processed successfully
3xx redirection: further action is needed to complete the request
4xx client error: the request has a syntax error or cannot be fulfilled
5xx server error: the server failed while processing the request
Memorize the meaning of the common status codes (200, 301, 302, 400, 403, 500, etc.)
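One way to review the common codes is the standard library's `http.HTTPStatus` enum. The `CLASSES` table and the `describe` helper below are illustrative names added for this sketch.

```python
from http import HTTPStatus

# First digit of the status code -> class of response.
CLASSES = {1: "informational", 2: "success", 3: "redirection",
           4: "client error", 5: "server error"}


def describe(code: int) -> str:
    status = HTTPStatus(code)   # raises ValueError for codes the enum does not know
    return f"{code} {status.phrase} ({CLASSES[code // 100]})"


for code in (200, 301, 302, 400, 403, 404, 500):
    print(describe(code))
```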
What is the difference between HTTP GET and POST?
Common HTTP methods: GET / POST / PUT / DELETE
GET retrieves, POST creates, PUT updates, DELETE deletes
In RESTful semantics, GET retrieves a resource and POST creates one
GET is idempotent; POST is not
GET parameters go in the URL (visible in plain text, with a practical length limit); POST parameters go in the request body, which keeps them out of the URL but is not encrypted without HTTPS
What is idempotence?
Which HTTP methods are idempotent? (GET, HEAD, PUT, DELETE and OPTIONS are; POST is not)
An idempotent method is an HTTP method that produces the same result no matter how many times it is invoked
e.g. a = 4 is idempotent, but a += 4 is not
A client can safely retransmit a request that uses an idempotent method
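The `a = 4` versus `a += 4` idea maps onto PUT versus POST. The sketch below uses plain in-memory structures; `put_item` and `post_order` are hypothetical stand-ins for request handlers, not a real framework API.

```python
store = {}     # resources addressed by key
orders = []    # resources created one after another


def put_item(key, value):
    """PUT-like: set the resource to a given state (like a = 4)."""
    store[key] = value


def post_order(item):
    """POST-like: create a new resource on every call (like a += 4)."""
    orders.append(item)
    return len(orders)   # hypothetical id of the new order


for _ in range(3):       # simulate a client retrying the same request three times
    put_item("a", 4)
    post_order("book")

print(store)    # the retried PUT leaves a single resource
print(orders)   # the retried POST created one order per attempt
```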
What is an HTTP long (persistent) connection?
HTTP is an application-layer protocol that runs on top of TCP
HTTP persistent connections are the default in HTTP/1.1
Short connection: establish the connection ... transfer data ... close the connection (establishing and closing connections is expensive)
Long connection: with Connection: keep-alive, the TCP connection stays open and is reused
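How are different HTTP requests told apart on one connection?
Content-Length | Transfer-Encoding: chunked
- How does the client tell the server how long an HTTP request is?
- The Content-Length header tells the receiver the size of the message body; with Transfer-Encoding: chunked, each chunk carries its own size and a zero-size chunk ends the body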
What is the difference between a cookie and a session?
HTTP is stateless, so how is a user identified?
The session is generally created by the server and its ID is handed to the client (via a URL parameter or a cookie)
Cookies are one mechanism for implementing sessions, carried in the HTTP Cookie header
The server stores the session data and identifies the user by session ID; the cookie is stored on the client
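The relationship between the two can be sketched with the standard library's `http.cookies`. The in-memory `SESSIONS` dict, the cookie name `sessionid`, and the `login` / `current_user` helpers are assumptions made for this illustration; a real application would use its framework's session support.

```python
import secrets
from http.cookies import SimpleCookie

SESSIONS = {}   # server-side store: session id -> user data (in memory, illustration only)


def login(username):
    """Create a session and return the value for a Set-Cookie response header."""
    session_id = secrets.token_hex(16)
    SESSIONS[session_id] = {"user": username}
    cookie = SimpleCookie()
    cookie["sessionid"] = session_id
    cookie["sessionid"]["httponly"] = True
    return cookie["sessionid"].OutputString()


def current_user(cookie_header):
    """Look up the user from the Cookie header the client sends back."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("sessionid")
    session = SESSIONS.get(morsel.value) if morsel else None
    return session["user"] if session else None


set_cookie = login("alice")                 # looks like "sessionid=<hex>; HttpOnly"
cookie_header = set_cookie.split(";")[0]    # the browser sends back only name=value
print(current_user(cookie_header))
print(current_user("sessionid=forged"))     # unknown id: no session on the server
```

Only the session ID travels to the client; the user data stays in the server-side store.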
To sum up:
Composition of requests and responses
Common HTTP methods and idempotence
Long connections, sessions and cookies
Common network programming interview questions
Socket programming matters for understanding the principles behind frameworks
TCP / UDP socket programming, HTTP programming
Understand the principles of TCP programming
Understand the principles of UDP programming
Learn how to send an HTTP request
What are the principles of TCP socket programming?
How to use the socket module
How to create a TCP socket client and server
Communication between the client and the server
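# !< server.py
import socket
import time

s = socket.socket()
s.bind(('', 8888))  # bind() takes a single (host, port) tuple
s.listen()
while True:
    client, addr = s.accept()  # returns (conn, addr)
    print(client)
    client.recv(1024)  # read the client's message before replying
    timestr = time.ctime(time.time()) + '\r\n'
    client.send(timestr.encode())  # send() needs bytes, so encode the str
    client.close()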
# !< client.py
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('127.0.0.1', 8888))
s.sendall(b'Hello World')
data = s.recv(1024)
print(data.decode())
s.close()
Sending an HTTP request with a socket
How to use a socket to send an HTTP request
Send the request through the socket interface
HTTP is built on top of TCP
HTTP is a text-based protocol
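import socket

s = socket.socket()
s.connect(('www.baidu.com', 80))
# Note the space between the path "/" and the version "HTTP/1.1"
http = b"GET / HTTP/1.1\r\nHost: www.baidu.com\r\n\r\n"
s.sendall(http)
buf = s.recv(1024)  # reads at most 1024 bytes, not necessarily the whole response
print(buf)
s.close()
# TODO: receive the complete response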
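One simple way to receive the complete response is to keep calling recv() until it returns an empty bytes object, which means the peer has closed the connection. The sketch below assumes the server closes after responding (for example because the request carried "Connection: close"); on a keep-alive connection the Content-Length header would have to be used instead. `recv_all` is a hypothetical helper, and a local socket pair stands in for the real connection so the example runs without network access.

```python
import socket


def recv_all(sock, bufsize=4096):
    """Read from sock until the peer closes the connection."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:            # b"" means the peer has closed its side
            break
        chunks.append(chunk)
    return b"".join(chunks)


# A connected socket pair plays the roles of server and client.
server_side, client_side = socket.socketpair()
server_side.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 5000\r\n\r\n" + b"x" * 5000)
server_side.close()              # the "server" closes after sending its response

response = recv_all(client_side, bufsize=1024)
client_side.close()
print(len(response))
```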
Common I/O multiplexing interview questions
The five I/O models
The five I/O models described in Unix Network Programming
Blocking I/O
Non-blocking I/O
I/O multiplexing
Signal-driven I/O
Asynchronous I/O
The last two are uncommon; I/O multiplexing is the one used most in practice
How can a server's concurrency be improved?
- Multi-threading model: create a new thread to handle each request
- Multi-process model: create a new process to handle each request
Creating threads / processes has a relatively high overhead, which a thread pool or process pool can reduce
- Threads and processes are relatively resource-intensive, so it is hard to create many at once
I/O multiplexing: a single process handles requests on multiple sockets
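The thread-pool idea can be sketched with `concurrent.futures.ThreadPoolExecutor`: a fixed number of worker threads is reused for all connections instead of creating one thread per request. In this sketch, socket pairs stand in for connections returned by accept(), and the `handle` function is a made-up handler.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def handle(conn):
    """Handle one connection: read a request and send back an upper-cased reply."""
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


# Socket pairs stand in for accepted client connections.
pairs = [socket.socketpair() for _ in range(5)]

with ThreadPoolExecutor(max_workers=2) as pool:   # only two worker threads are created
    for server_side, _ in pairs:
        pool.submit(handle, server_side)          # in a real server: after each accept()
    replies = []
    for i, (_, client_side) in enumerate(pairs):
        client_side.sendall(f"request {i}".encode())
        replies.append(client_side.recv(1024))
        client_side.close()

print(replies)
```

Five connections are served by two threads; a worker picks up the next queued connection as soon as it finishes the current one.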
What is I/O multiplexing?
A mechanism provided by the operating system for monitoring multiple sockets at once
High concurrency requires a mechanism for handling many sockets concurrently
Common on Linux: select / poll / epoll
A single thread in a single process can then handle multiple sockets
# sel is a selector (e.g. selectors.DefaultSelector()) with sockets already registered
while True:
    events = sel.select()
    for key, mask in events:
        callback = key.data
        callback(key.fileobj, mask)
How does Python implement I/O multiplexing?
Python wraps the operating system's I/O multiplexing
Python's I/O multiplexing is built on the operating system's implementation (select / poll / epoll)
Python 2: select module
Python 3: selectors module
The selectors module
Event types: EVENT_READ, EVENT_WRITE
DefaultSelector: automatically picks the most suitable I/O multiplexing implementation for the platform
register(fileobj, events, data=None)
unregister(fileobj)
modify(fileobj, events, data=None)
select(timeout=None): returns [(key, events)]
close()
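The selectors API listed above can be put together into a tiny single-threaded echo server. This is a sketch, not a production server: the callbacks `accept` and `echo` and the helper `run_once` are illustrative names, the port is chosen by the OS, and the client lives in the same thread so the example finishes on its own.

```python
import selectors
import socket

sel = selectors.DefaultSelector()


def accept(listener, mask):
    """Callback for the listening socket: register each new connection."""
    conn, _ = listener.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, data=echo)


def echo(conn, mask):
    """Callback for a client connection: echo data back, clean up on close."""
    data = conn.recv(1024)
    if data:
        conn.sendall(data)   # acceptable for tiny messages in a sketch
    else:
        sel.unregister(conn)
        conn.close()


def run_once(timeout=1.0):
    """One iteration of the event loop: dispatch every ready socket to its callback."""
    for key, mask in sel.select(timeout):
        key.data(key.fileobj, mask)


listener = socket.socket()
listener.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
listener.listen()
listener.setblocking(False)
sel.register(listener, selectors.EVENT_READ, data=accept)

client = socket.create_connection(listener.getsockname())
client.settimeout(2.0)
client.sendall(b"ping")

run_once()                               # listener is readable -> accept()
run_once()                               # connection is readable -> echo()
reply = client.recv(1024)
print(reply)

client.close()
run_once()                               # echo() sees b"" and closes the connection
sel.unregister(listener)
listener.close()
sel.close()
```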
Common questions on Python concurrent network libraries
Tornado / Gevent
Which concurrent network libraries have you used?
Tornado vs Gevent vs Asyncio
Tornado: a concurrent network library that is also a micro web framework
Gevent: concurrency based on green threads (greenlet); monkey-patches the built-in socket
Asyncio: the concurrent network library built into Python 3, based on native coroutines
The Tornado framework
Tornado suits microservices and RESTful interfaces
Built on Linux I/O multiplexing underneath
Asynchronous code can be written with callbacks or coroutines
The surrounding asynchronous ecosystem, such as ORMs, is still immature
import tornado.ioloop
import tornado.web
from tornado.httpclient import AsyncHTTPClient


class APIHandler(tornado.web.RequestHandler):
    async def get(self):
        url = 'http://httpbin.org/get'
        http_client = AsyncHTTPClient()
        resp = await http_client.fetch(url)  # fetch() returns a Future, so await it
        print(resp.body)
        self.write(resp.body)  # write the body to the client instead of returning it


def make_app():
    return tornado.web.Application([
        (r"/api", APIHandler)
    ])


if __name__ == "__main__":
    app = make_app()
    app.listen(8888)
    tornado.ioloop.IOLoop.current().start()
Gevent
A high-performance concurrent network library
Lightweight green-thread (greenlet) concurrency
Mind the monkey patch: gevent replaces the built-in socket with a non-blocking version
Gevent can be combined with gunicorn and deployed as a WSGI server
import gevent.monkey
gevent.monkey.patch_all()  # patch blocking standard-library modules to be non-blocking

import gevent
import requests


def fetch(i):
    url = 'http://httpbin.org/get'
    resp = requests.get(url)
    print(len(resp.text), i)


def asynchronous():
    threads = []
    for i in range(1, 10):
        threads.append(gevent.spawn(fetch, i))
    gevent.joinall(threads)


print('Asynchronous: ')
asynchronous()
Asyncio
A built-in concurrent network library implemented with coroutines
Part of the Python 3 standard library: coroutines + event loop
Small services can be built on top of aiohttp
Concurrent requests with aiohttp
import asyncio
from aiohttp import ClientSession  # pip install aiohttp


async def fetch(url, session):
    async with session.get(url) as response:
        return await response.read()


async def run(r=10):
    url = "http://httpbin.org/get"
    tasks = []
    async with ClientSession() as session:
        for i in range(r):
            task = asyncio.ensure_future(fetch(url, session))
            tasks.append(task)
        responses = await asyncio.gather(*tasks)
        for resp_body in responses:
            print(len(resp_body))


asyncio.run(run())  # creates the event loop, runs run() to completion, then closes the loop
Writing an asynchronous crawler
Write an asynchronous crawler class using gevent or asyncio
The class should accept a list of URLs to crawl
Subclasses can supply the response-handling method through inheritance
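One possible shape for such a class, sketched with asyncio only. All names here (`AsyncCrawler`, `fetch`, `handle_response`, `crawl`, `FakeCrawler`) are assumptions made for the sketch. The default `fetch` runs a blocking urllib call in a worker thread via `asyncio.to_thread` (Python 3.9+); a real crawler would more likely use aiohttp. The demonstration subclass replaces `fetch` with canned data so the example runs offline.

```python
import asyncio
import urllib.request


class AsyncCrawler:
    """Fetch a list of URLs concurrently; subclasses override handle_response()."""

    def __init__(self, urls, concurrency=5):
        self.urls = list(urls)
        self.concurrency = concurrency
        self.results = {}

    async def fetch(self, url):
        # Assumption: a blocking urllib request moved to a worker thread.
        def blocking_get():
            with urllib.request.urlopen(url, timeout=10) as resp:
                return resp.read()
        return await asyncio.to_thread(blocking_get)

    def handle_response(self, url, body):
        """Default handler: record the body length. Override in subclasses."""
        self.results[url] = len(body)

    async def _crawl_one(self, url, semaphore):
        async with semaphore:            # limit the number of in-flight requests
            body = await self.fetch(url)
        self.handle_response(url, body)

    async def crawl(self):
        semaphore = asyncio.Semaphore(self.concurrency)
        await asyncio.gather(*(self._crawl_one(u, semaphore) for u in self.urls))
        return self.results


class FakeCrawler(AsyncCrawler):
    """Offline stand-in: overrides both fetch() and handle_response()."""

    async def fetch(self, url):
        await asyncio.sleep(0)           # yield to the event loop like real I/O would
        return f"<title>{url}</title>".encode()

    def handle_response(self, url, body):
        self.results[url] = body.decode()


results = asyncio.run(FakeCrawler(["a", "b"]).crawl())
print(results)
```

To crawl real pages, subclass `AsyncCrawler`, keep the default `fetch`, and override only `handle_response`.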