Mastering Python's Built-in zip() in One Article

zip() is one of the most useful built-ins in Python. It takes multiple iterables as arguments and returns an iterator that combines their elements.

When I wrote my earlier series on iterators, I introduced zip() briefly in "Advanced Python: The Iterator Pattern of Design Patterns". A few days ago I also translated PEP 618, which has been accepted for Python 3.10, and described the change it will bring.

However, many readers still do not know zip() or are not fluent in using it, so this article gives a more detailed review.

The content is mainly divided into three parts:

  • Usage: basic usage, advanced usage, and a few clever tricks
  • Internals: how it is implemented, with attention to several implementation details
  • Going further: its shortcomings and how to fix them

1. The many uses of zip()

Basic usage: zip() combines multiple iterables like a zipper. You can then retrieve the combined elements one by one with a for loop, or store them all at once in a container such as a list, tuple, or dictionary.

The result is an iterator that yields tuples: the i-th tuple contains the i-th element of each iterable argument, as shown in the figure above.
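As a minimal sketch of this basic usage (the sample lists and variable names are invented for illustration), the result can be consumed in a loop or materialized in one step:

```python
# Sample data invented for illustration.
letters = ["a", "b", "c"]
numbers = [1, 2, 3]

# zip() returns a lazy iterator, not a list.
pairs = zip(letters, numbers)

# Consume it with a for loop; each item is a tuple.
collected = []
for pair in pairs:
    collected.append(pair)
print(collected)  # [('a', 1), ('b', 2), ('c', 3)]

# Or materialize the result in one step. A fresh zip() is needed each time,
# because an iterator can only be consumed once.
as_list = list(zip(letters, numbers))
as_tuple = tuple(zip(letters, numbers))
as_dict = dict(zip(letters, numbers))
print(as_dict)  # {'a': 1, 'b': 2, 'c': 3}
```

Note that `pairs` is empty after the loop, which is why the later conversions each call zip() again.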

In addition, the for loop can unpack each tuple into separate variables as it goes, which is very convenient:
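A small sketch of that unpacking, using made-up names and ages:

```python
# Sample data invented for illustration.
names = ["Alice", "Bob", "Carol"]
ages = [30, 25, 41]

lines = []
# Each tuple produced by zip() is unpacked into name and age.
for name, age in zip(names, ages):
    lines.append(f"{name} is {age}")

print(lines)  # ['Alice is 30', 'Bob is 25', 'Carol is 41']
```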

Its arguments do not have to be iterables of the same type, so many combinations are possible, such as:
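For instance, the following sketch (with arbitrary sample values) mixes a list, a tuple, a string, a range, and a generator:

```python
# A list, a tuple, a string, and a range zipped together.
mixed = list(zip([1, 2, 3], ("x", "y", "z"), "abc", range(3)))
print(mixed)  # [(1, 'x', 'a', 0), (2, 'y', 'b', 1), (3, 'z', 'c', 2)]

# Generators work too, since zip() only needs something iterable.
squares = (n * n for n in range(1, 4))
with_generator = list(zip("abc", squares))
print(with_generator)  # [('a', 1), ('b', 4), ('c', 9)]
```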

But what if a dictionary is passed as an argument to zip()? A dictionary holds key-value pairs, unlike single-element structures such as lists.

An experiment shows that, by default, zip() traverses only the dictionary's keys:
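A quick sketch of such an experiment, with an invented dictionary:

```python
# Sample data invented for illustration.
scores = {"math": 90, "art": 85}
labels = ["first", "second"]

# Iterating over a dict yields its keys, so zip() sees only the keys.
keys_only = list(zip(labels, scores))
print(keys_only)  # [('first', 'math'), ('second', 'art')]
```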

If you want the dictionary's values or its key-value pairs instead, use the dictionary's own traversal methods values() and items():
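Continuing with the same kind of invented data, a sketch of both variants:

```python
# Sample data invented for illustration.
scores = {"math": 90, "art": 85}
labels = ["first", "second"]

# values() feeds the dictionary's values to zip().
by_value = list(zip(labels, scores.values()))
print(by_value)  # [('first', 90), ('second', 85)]

# items() feeds (key, value) tuples, which end up nested in the result.
by_item = list(zip(labels, scores.items()))
print(by_item)  # [('first', ('math', 90)), ('second', ('art', 85))]
```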

With zip(), you can also easily transpose a two-dimensional list, swapping its rows and columns:
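A minimal sketch, assuming a 3x3 list named my_list with arbitrary values:

```python
# A 3x3 two-dimensional list; the values are arbitrary.
my_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# zip(*my_list) is the same as zip([1, 2, 3], [4, 5, 6], [7, 8, 9]).
transposed = list(zip(*my_list))
print(transposed)  # [(1, 4, 7), (2, 5, 8), (3, 6, 9)]
```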

In the example above, the asterisk (*) operator performs unpacking: the elements of my_list (each of which is itself a list) are passed to zip() as separate arguments, and zip() then recombines those 3 lists.

The unpacking operator also works on zip objects. Because zip() is itself a row-column transposition, unpacking a zip object as the arguments to another zip() transposes a second time and returns to the starting point (except that the resulting rows are tuples rather than lists):
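A short sketch of the round trip, again assuming an arbitrary 3x3 list named my_list:

```python
my_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Transpose once, then unpack the zip object into zip() to transpose back.
round_trip = list(zip(*zip(*my_list)))
print(round_trip)  # [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

# Same values as the original, but each row is now a tuple.
rows_match = [list(row) for row in round_trip] == my_list
print(rows_match)  # True
```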

Finally, one more trick: creating an n*n square matrix in which every element of a row is the same number.
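One way to do this, which may differ from the approach originally shown, is to repeat the same range n times and zip the copies together:

```python
n = 3

# [range(n)] * n is a list of n identical ranges; zipping them pairs
# the i-th element of every range, so row i is filled with the number i.
matrix = list(zip(*[range(n)] * n))
print(matrix)  # [(0, 0, 0), (1, 1, 1), (2, 2, 2)]
```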

2. Principle analysis of zip()

The official documentation gives Python code that is roughly equivalent to zip() (this is not the interpreter's actual built-in implementation; it only illustrates the basic logic):

def zip(*iterables):
    # zip('ABCD', 'xy') --> Ax By
    sentinel = object()
    iterators = [iter(it) for it in iterables]
    while iterators:
        result = []
        for it in iterators:
            elem = next(it, sentinel)
            if elem is sentinel:
                return
            result.append(elem)
        yield tuple(result)

Several key points can be drawn from this short piece of code:

  • zip accepts a variable number of iterable arguments, each of which is turned into an iterator by iter(). Corollary: if any argument is not iterable, an error is raised at this point
  • The while loop tests whether the list is empty, and the list holds the iterators created from the arguments. Corollary: given at least one valid iterable, the while condition is always true; given no input at all, the loop never runs and nothing is produced
  • next() reads the next element of an iterator, and its second argument is the value returned once that iterator is exhausted. Corollary: each round takes one element from every iterator in turn, and as soon as any iterator is exhausted the function returns, ending the otherwise infinite loop, so whatever remains in the iterators that are not exhausted is simply discarded
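These corollaries can be checked against the built-in zip() with a small sketch (the sample values are invented). The last part shows a detail that follows from the order of the next() calls: an element already fetched from an earlier iterator is lost when a later iterator runs out.

```python
# 1. A non-iterable argument raises TypeError.
try:
    list(zip(1, 2))
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__
print(error_name)  # TypeError

# 2. With no input, nothing is produced.
no_input = list(zip())
print(no_input)  # []

# 3. The shortest input ends the iteration.
first = iter([1, 2, 3])
second = iter(["x"])
pairs = list(zip(first, second))
print(pairs)  # [(1, 'x')]

# zip() had already fetched 2 from `first` before finding `second` empty,
# so 2 is gone and only 3 is left.
leftover = list(first)
print(leftover)  # [3]
```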

3. Problems and solutions of zip()

The most obvious problem with zip() is that it discards the remaining elements of any iterator that is not exhausted:
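A minimal sketch with two invented lists of different lengths:

```python
# Sample data invented for illustration.
letters = ["a", "b", "c", "d"]
numbers = [1, 2]

# "c" and "d" have no partner and are silently dropped.
truncated = list(zip(letters, numbers))
print(truncated)  # [('a', 1), ('b', 2)]
```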

This is the barrel effect: how much a barrel holds is determined by its shortest plank, and likewise the final result here is determined by the shortest input.

One solution is to go by the longest plank and pad out the shorter ones (with None by default), which is what zip_longest in itertools does:
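A short sketch with the same kind of invented data; fillvalue is the keyword argument zip_longest provides for choosing a filler other than None:

```python
from itertools import zip_longest

# Sample data invented for illustration.
letters = ["a", "b", "c", "d"]
numbers = [1, 2]

# Missing positions are filled with None by default.
padded = list(zip_longest(letters, numbers))
print(padded)  # [('a', 1), ('b', 2), ('c', None), ('d', None)]

# fillvalue chooses a different filler.
padded_zero = list(zip_longest(letters, numbers, fillvalue=0))
print(padded_zero)  # [('a', 1), ('b', 2), ('c', 0), ('d', 0)]
```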

The result is padded with filler data while preserving as much of the original data as possible.

But what if we do not want filler data and simply need every input to line up at the same length?

Python's core developers recently accepted PEP 618, which addresses this issue by adding an optional strict argument to zip(). With strict=True, when the iterables have different lengths, zip() compromises with neither the short plank nor the long plank and instead raises a ValueError. It treats the input values as wrong; in other words, it strictly enforces the integrity of the input data.
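The sketch below shows the strict behaviour. The helper name zip_strict is hypothetical and exists only so the example also runs on interpreters older than 3.10, where it falls back to an emulation built on zip_longest; on Python 3.10 and later it simply calls zip(..., strict=True).

```python
import sys
from itertools import zip_longest


def zip_strict(*iterables):
    """Hypothetical helper: like zip(), but raise ValueError on a length mismatch."""
    if sys.version_info >= (3, 10):
        # The strict keyword from PEP 618 is available from Python 3.10 on.
        return zip(*iterables, strict=True)

    # Fallback emulation for older interpreters (an assumption of this sketch).
    sentinel = object()

    def generate():
        for combo in zip_longest(*iterables, fillvalue=sentinel):
            if any(item is sentinel for item in combo):
                raise ValueError("zip_strict() arguments have different lengths")
            yield combo

    return generate()


# Equal lengths behave exactly like the ordinary zip().
same_length = list(zip_strict("ab", [1, 2]))
print(same_length)  # [('a', 1), ('b', 2)]

# Unequal lengths raise ValueError instead of truncating or padding.
try:
    list(zip_strict("abc", [1, 2]))
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
print(mismatch_detected)  # True
```

The error is raised lazily, only when iteration reaches the point where one input has run out, so the pairs produced before that point have already been yielded.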

This PEP is scheduled to ship in Python 3.10, about a year from now. For more details, see this translation of PEP 618.


WeChat official account: Python Cat

Toutiao account: Python Cat

Zhihu: The Cat Under the Pea Flowers

Juejin: The Cat Under the Pea Flowers
