Async/Await Programming Basics in Python, by Example

Source: Redislabs

Author: Loris Cro

Translation: Kevin (WeChat public account: Middleware little brother)

 

In recent years, many programming languages have made efforts to improve their concurrency primitives. Go has goroutines, Ruby has fibers, and of course Node.js helped popularize async/await, which is today the most widely used concurrency construct. In this article I will discuss the basics of async/await, using Python as the example. I chose Python because this feature is relatively recent in Python 3 (it was introduced in Python 3.5), and many users may not yet be familiar with it.

The main reason to use async/await is to improve a program's throughput by reducing the amount of idle time spent on I/O. Programs that use it implicitly rely on an abstraction called an event loop to juggle multiple execution paths at the same time. In some ways, event loops resemble multi-threaded programming, but an event loop normally lives in a single thread, so it cannot perform more than one computation at a time. Because of this, an event loop alone cannot improve the performance of compute-intensive applications. However, for programs that do a lot of network communication, such as applications connected to a Redis database, it can dramatically improve performance.

Each time the program sends a command to Redis, it waits for Redis's reply, and if Redis is deployed on another machine, there is network latency. A single-threaded application that does not use an event loop sits idle while waiting for the reply, wasting a lot of CPU cycles. Keep in mind that network latency is measured in milliseconds, while the CPU executes instructions in nanoseconds, a difference of six orders of magnitude.

As an example, the code sample below tracks a leaderboard of wins in a game. Each stream entry contains the winner's name, and our program updates a Redis Sorted Set that serves as the leaderboard. Our main concern here is the performance of blocking versus non-blocking code.

import redis

# The operation to perform for each event
def add_new_win(conn, winner):
    conn.zincrby('wins_counter', 1, winner)
    conn.incr('total_games_played')

def main():
    # Connect to Redis
    conn = redis.Redis()
    # Tail the event stream
    last_id = '$'
    while True:
        events = conn.xread({'wins_stream': last_id}, block=0, count=10)
        # Process each event by calling `add_new_win`
        for _, e in events:
            winner = e['winner']
            add_new_win(conn, winner)
            last_id = e['id']

if __name__ == '__main__':
    main()

  

We can achieve the same effect with an asynchronous version of this code using aio-libs/aioredis. The aio-libs community is rewriting many Python network libraries to include support for asyncio, the Python standard library's implementation of an event loop. Here is the non-blocking version of the code above:

import asyncio
import aioredis

async def add_new_win(pool, winner):
    await pool.zincrby('wins_counter', 1, winner)
    await pool.incr('total_games_played')

async def main():
    # Connect to Redis
    pool = await aioredis.create_redis_pool('redis://localhost', encoding='utf8')
    # Tail the event stream
    last_id = '$'
    while True:
        events = await pool.xread(['wins_stream'], latest_ids=[last_id], timeout=0, count=10)
        # Process each event by calling `add_new_win`
        for _, e_id, e in events:
            winner = e['winner']
            await add_new_win(pool, winner)
            last_id = e_id

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
 

This code is almost identical to the previous version, aside from a few extra await keywords. The biggest difference is in the last two lines: in Node.js, the environment loads an event loop by default, but in Python you must start it explicitly.
After the rewrite, we might think the performance has already improved. Unfortunately, the non-blocking version of our code does not yet improve performance. The problem lies in the details of how we wrote the code, not in the general idea of using async/await.

Restrictions on using await

The main problem with our rewritten code is that we overuse await. When we prefix an asynchronous call with await, we do two things:

1. Schedule it for execution

2. Wait for it to complete
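To see both effects concretely, here is a minimal sketch (no Redis involved; `asyncio.sleep` stands in for an I/O-bound call such as a Redis command) showing that each await schedules the operation and then blocks until it finishes, so two awaited operations run back to back rather than overlapping:

```python
import asyncio
import time

async def op(delay):
    # Stand-in for an I/O-bound call such as a Redis command
    await asyncio.sleep(delay)

async def main():
    start = time.monotonic()
    # Each `await` schedules the operation AND waits for it to
    # complete, so the two delays add up instead of overlapping.
    await op(0.1)
    await op(0.1)
    elapsed = time.monotonic() - start
    print("sequential awaits took at least 0.2s:", elapsed >= 0.19)

asyncio.run(main())
```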

Sometimes this is the right thing to do. For example, we cannot iterate over the events until the read from the stream (the `xread` call) has completed, so the await keyword there is meaningful. But look at the `add_new_win` method:

async def add_new_win(pool, winner):
    await pool.zincrby('wins_counter', 1, winner)
    await pool.incr('total_games_played')

 

In this function, the second operation does not depend on the first. We could send the second command along with the first, but await blocks the flow of execution as soon as we send the first command. What we really want is a way to perform both operations at once. For this, we need a different synchronization primitive.

async def add_new_win(pool, winner):
    task1 = pool.zincrby('wins_counter', 1, winner)
    task2 = pool.incr('total_games_played')
    await asyncio.gather(task1, task2)

 

First of all, calling an asynchronous function does not execute any of its code; instead, it instantiates a "task". Depending on the language, these may be called coroutines, promises, futures, and so on. For us, a task is an object representing a value that will be available only after using await or another synchronization primitive such as asyncio.gather. In Python's official documentation you can find more information about asyncio.gather. In short, it lets us perform multiple tasks at the same time. We need to await its result because it creates a new task that completes once all of its input tasks have completed. Python's asyncio.gather is equivalent to JavaScript's Promise.all, C#'s Task.WhenAll, Kotlin's awaitAll, and so on.
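This behavior can be illustrated in isolation (again with `asyncio.sleep` simulating network round trips, since no Redis server is assumed here): calling the async functions only creates the tasks, and asyncio.gather then runs them concurrently, returning their results in input order.

```python
import asyncio
import time

async def fetch(delay, value):
    # Stand-in for one network round trip (e.g. a Redis command)
    await asyncio.sleep(delay)
    return value

async def main():
    start = time.monotonic()
    # Calling the coroutines does not run them yet; `gather`
    # schedules both at once and waits until both complete.
    results = await asyncio.gather(fetch(0.1, 'a'), fetch(0.1, 'b'))
    elapsed = time.monotonic() - start
    print(results)                        # results keep the input order
    print("delays overlapped:", elapsed < 0.19)

asyncio.run(main())
```

Because both sleeps overlap, the whole call takes about as long as the single slowest task instead of their sum.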

Improving our main loop

What we did in `add_new_win` can also be applied to the main event-processing loop. This is the code I mean:

last_id = '$'
while True:
    events = await pool.xread(['wins_stream'], latest_ids=[last_id], timeout=0, count=10)
    for _, e_id, e in events:
        winner = e['winner']
        await add_new_win(pool, winner)
        last_id = e_id

 

You will notice that we still process each event sequentially, because the await on `add_new_win` both sends the commands and waits for them to complete. Sometimes this is exactly what you want, because the program logic would break if events were applied out of order. In our case, ordering does not really matter, since we are just updating counters.

last_id = '$'
while True:
    events = await pool.xread(['wins_stream'], latest_ids=[last_id], timeout=0, count=10)
    tasks = []
    for _, e_id, e in events:
        winner = e['winner']
        tasks.append(add_new_win(pool, winner))
        last_id = e_id
    await asyncio.gather(*tasks)

 

Now we process each batch of events concurrently, and the change to the code was minimal. One last thing to keep in mind: sometimes a program can be high-performance even without asyncio.gather. In particular, when you write code for a web server and use an asynchronous framework like Sanic, the framework calls your request handlers concurrently, ensuring huge throughput even if you await every asynchronous function call.
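The difference the batched loop makes can be sketched in isolation (no Redis involved; `asyncio.sleep` simulates the round trip of each `add_new_win` call, and the event count and delay are arbitrary):

```python
import asyncio
import time

async def update(n):
    # Stand-in for `add_new_win`: one simulated round trip per event
    await asyncio.sleep(0.05)
    return n

async def main():
    events = range(5)

    start = time.monotonic()
    for e in events:                     # one round trip at a time
        await update(e)
    sequential = time.monotonic() - start

    start = time.monotonic()
    tasks = [update(e) for e in events]  # prepare tasks, run nothing yet
    await asyncio.gather(*tasks)         # all round trips overlap
    batched = time.monotonic() - start

    print("batched processing is faster:", batched < sequential)

asyncio.run(main())
```

The sequential pass pays the full delay once per event, while the batched pass pays it roughly once per batch.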

To sum up

Here is the complete example from above, after the two changes we made:

import asyncio
import aioredis

async def add_new_win(pool, winner):
    # Creating tasks doesn't schedule them
    # so you can create multiple and then
    # schedule them all in one go using `gather`
    task1 = pool.zincrby('wins_counter', 1, winner)
    task2 = pool.incr('total_games_played')
    await asyncio.gather(task1, task2)

async def main():
    # Connect to Redis
    pool = await aioredis.create_redis_pool('redis://localhost', encoding='utf8')
    # Tail the event stream
    last_id = '$'
    while True:
        events = await pool.xread(['wins_stream'], latest_ids=[last_id], timeout=0, count=10)
        tasks = []
        for _, e_id, e in events:
            winner = e['winner']
            # Again we don't actually schedule any task,
            # and instead just prepare them
            tasks.append(add_new_win(pool, winner))
            last_id = e_id
        # Notice the spread operator (`*tasks`), it
        # allows using a single list as multiple arguments
        # to a function call.
        await asyncio.gather(*tasks)

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())

 

To make use of non-blocking I/O, you need to rethink how you handle network operations. The good news is that it is not very difficult; you just need to know when ordering matters and when it does not. Try experimenting with aioredis or an equivalent asynchronous Redis client, and see how much you can improve your application's throughput.

For more high-quality middleware articles, originals, translations, and resources, follow the "Middleware little brother" public account!


Origin www.cnblogs.com/middleware/p/11996731.html