`yield from` generator vs `yield from` list performance

pt12lol:
Python 3.6.8 (default, Oct  7 2019, 12:59:55) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.9.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: def yield_from_generator():
   ...:     yield from (i for i in range(10000))
   ...:

In [2]: def yield_from_list():
   ...:     yield from [i for i in range(10000)]
   ...:

In [3]: import timeit

In [4]: timeit.timeit(lambda: list(yield_from_generator()), number=10000)
Out[4]: 5.3820097140014695

In [5]: timeit.timeit(lambda: list(yield_from_list()), number=10000)
Out[5]: 4.333915593000711

I ran the yield-from-generator and yield-from-list versions many times, and the list version always performs better, while my intuition suggested the opposite: building the list requires, for example, allocating memory up front. Why do we see such a performance difference?

Sam Mason:

the short answer is that the surface syntax makes them look more similar than they are

I'll break down a series of functions in more detail (the dis module is helpful for this), separating things out into a setup cost and a cost per yielded value. we start with:

def yield_from_generator():
    yield from (i for i in range(10000))

the costs are:

  • setup: create the range object and invoke the embedded generator expression
  • per-yield: resume the genexpr, which in turn calls next on the range iterator. note that there are two context switches per value (see the dis sketch after this list)
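
if you want to see that extra generator layer for yourself, dis shows it; a minimal sketch (I won't paste the output since the exact op-codes vary between Python versions, and on 3.7+ dis also recurses into the nested genexpr/comprehension code objects):

import dis

def yield_from_generator():
    yield from (i for i in range(10000))

def yield_from_list():
    yield from [i for i in range(10000)]

# the interesting difference is in the nested code objects: the genexpr yields
# values back one at a time, while the list comprehension uses LIST_APPEND to
# build the whole list before the outer yield from ever starts
dis.dis(yield_from_generator)
dis.dis(yield_from_list)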

next we look at:

def yield_from_list():
    yield from [i for i in range(10000)]

costs are:

  • setup: create a new list and populate it using a list comprehension. this uses special list op-codes so will be fast
  • per-yield: just resumes the list's iterator so is fast (the rough expansion after this list shows why)
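
for intuition, `yield from` over a plain iterable behaves roughly like a for loop that re-yields each value (this ignores the send()/throw()/close() forwarding that PEP 380 adds, so treat it as a sketch rather than an exact expansion):

def yield_from_list_expanded():
    # roughly what `yield from [i for i in range(10000)]` does:
    values = [i for i in range(10000)]  # setup: built eagerly via the list comprehension op-codes
    for v in values:                    # per-yield: just advance the (C-level) list iterator
        yield v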

next we look at a similar function:

def yield_from_list2():
    yield from list(i for i in range(10000))

this doesn't use the special list op-codes and still pays for the extra generator-expression layer over the range iterator, so it is slow again. costs are:

  • setup: create a new generator expression and pass it to the list constructor, this will iterate over the generator expression that iterates over the range object
  • per-yield: uses the list's iterator so is fast again
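
if you want to convince yourself that the gap versus yield_from_list really is all in the setup, you can time the two setup strategies on their own (I've left the numbers out since they depend on your machine):

import timeit

# setup cost in isolation, no yield from involved:
#  - the list comprehension runs on the special LIST_APPEND op-code path
#  - list(genexpr) resumes a Python-level generator once per element
print(timeit.timeit(lambda: [i for i in range(10000)], number=10000))
print(timeit.timeit(lambda: list(i for i in range(10000)), number=10000))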

and finally a fast version just stressing yield from:

def yield_from_generator2():
    yield from range(10000)

costs are:

  • setup: create a range object
  • per-yield: resume the range iterator directly (see the quick check after this list)
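
to make that concrete, yield from calls iter() on its operand once and then keeps resuming whatever that returns; for a range that's a C-level iterator (a minimal, CPython-specific check):

it = iter(range(10000))
print(type(it))            # <class 'range_iterator'> on CPython, implemented in C
print(next(it), next(it))  # 0 1 -- each resume is a cheap C call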

timings of all of these on my laptop are:

yield_from_generator  639 µs
yield_from_list       536 µs
yield_from_list2      689 µs
yield_from_generator2 354 µs
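
for reference, numbers like these can be gathered with a small timeit harness along the following lines (assuming the four functions above are already defined; the exact figures will of course differ by machine and Python version):

import timeit

for fn in (yield_from_generator, yield_from_list,
           yield_from_list2, yield_from_generator2):
    # consume the generator fully each run and report the average per-call time
    per_call = timeit.timeit(lambda: list(fn()), number=1000) / 1000
    print(f"{fn.__name__:22} {per_call * 1e6:8.0f} µs")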

hopefully it's a bit clearer now. another version is:

def yield_from_list3():
    yield from list(range(10000))

that runs in 401 µs, but hopefully it's now more obvious why this one sits in the middle, performance-wise
