Python 3.9 beta 2 released: what are the 7 new PEPs?

Original: Jake Edge

Translator: Cat under the Pea Flower @Python Cat

English: https://lwn.net/Articles/819853/

With the release of Python 3.9.0b1, the first of four beta releases planned for this development cycle, Python 3.9 is now feature-complete. Plenty of testing and stabilization work remains before the final release in October.

(Translator's note: beta 1 was released on May 18 and the author's article was written on May 20. By coincidence, beta 2 was released on June 9, the same day this translation was published.)

The release notes list the seven Python Enhancement Proposals (PEPs) accepted for 3.9. We have already looked at some of these PEPs, and there are updates on those. Now seems like a good time to fill in what else Python 3.9 will bring.

1. String manipulation

Sometimes the simplest-seeming things are the hardest, or at least provoke an outsized discussion. Most of the controversy was about naming (what else?), but the idea of adding methods to the standard string objects to remove prefixes and suffixes was uncontroversial.

Whether those affixes (a collective term for prefixes and suffixes) could be specified as a sequence, so that several affixes could be handled in one call, was less clear-cut. That feature was eventually removed from the proposal, waiting for someone else to push the change through.

At the end of March, Dennis Sweeney asked on the python-dev mailing list for a core developer to sponsor PEP 616 ("String methods to remove prefixes and suffixes"). He pointed to the python-ideas discussions on the topic going back to March 2019. Eric V. Smith agreed to sponsor the PEP, which led Sweeney to post it and start the discussion.

In the original version, he used cutprefix() and cutsuffix() as the names of the methods to add to string objects. Four types of Python objects would get the new methods: str (Unicode strings), bytes (binary sequences), bytearray (mutable binary sequences), and collections.UserString (a wrapper around string objects).

They would be used as follows:

'abcdef'.cutprefix('abc') # returns 'def'
'abcdef'.cutsuffix('ef') # returns 'abcd'

As for the naming, a whole raft of suggestions came up. Few people liked "cut", so "strip", "trim", and "remove" were all proposed, and each gained some support.

stripprefix() and stripsuffix() were rejected at least in part for a reason laid out in the PEP: the existing "strip" methods are confusing, so reuse of that name should be avoided.

The str.lstrip() and str.rstrip() methods are also used to remove leading and trailing characters, but they are a source of confusion for programmers who are really looking for the cutprefix() functionality.

The *strip() methods take a string argument, but treat it as a set of characters to strip from the beginning or end of the string:

'abcdef'.lstrip('abc') # returns 'def', as expected
'abcbadefed'.lstrip('abc') # returns 'defed', not at all what was expected
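The difference between stripping a set of characters and removing a prefix can be seen in a minimal sketch. The `remove_prefix` helper below is a hypothetical stand-in written for illustration only (it is not the method the PEP adds); it uses nothing but `str.startswith()` and slicing:

```python
def remove_prefix(text: str, prefix: str) -> str:
    """Hypothetical helper: drop `prefix` once if `text` starts with it."""
    if prefix and text.startswith(prefix):
        return text[len(prefix):]
    return text


# lstrip() treats its argument as a set of characters to strip.
stripped = "abcbadefed".lstrip("abc")

# The helper removes the exact substring "abc", and only once.
removed = remove_prefix("abcbadefed", "abc")

print(stripped)  # defed
print(removed)   # badefed
```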

In the end, removeprefix() and removesuffix() gained the upper hand, and those are the names Sweeney settled on. Guido van Rossum also supported these names.

Eric Fahlgren humorously summed up the naming debate this way:

> I think the name choice is easier if you write the documentation first:
>
> cutprefix - removes the specified prefix.
> trimprefix - removes the specified prefix.
> stripprefix - removes the specified prefix.
> removeprefix - removes the specified prefix. Duh. :)

Sweeney updated the PEP to address many of the comments, and also added the ability to pass a tuple of strings as the affixes (that version can be seen in the PEPs GitHub repository).

But Steven D'Aprano was not sure that it made sense to do so. He pointed out that the only string operations that take a tuple argument are str.startswith() and str.endswith(), which do not return a string (just a boolean). He was skeptical of adding a method that accepts a tuple argument but returns a string, because whatever rule is chosen for handling the tuple will be the "wrong" choice for some people.

For example:

> The difficulty here is that the notion of "cut one of these prefixes" is ambiguous if two or more of the prefixes match. It makes no difference for startswith:
>
> "extraordinary".startswith(('ex', 'extra'))
>
> since the result is True whether you match left to right, shortest to longest, or even in random order. But for cutprefix, which prefix should be removed?

As he noted, the proposed rule was to process the tuple from left to right and use the first string that matches, but some people may want the longest match or the last match; it all depends on the context of use. He suggested giving the feature more "soak time" before committing to such behavior: "Before adding support for multiple prefixes/suffixes, we should first get some practical experience with the simple case."
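The ambiguity can be illustrated with a small sketch. Both functions below are hypothetical helpers written for this example, not part of any proposal; one applies the left-to-right, first-match rule and the other a longest-match rule:

```python
def remove_first_match(text: str, prefixes: tuple) -> str:
    """Hypothetical rule: remove the first prefix in the tuple that matches."""
    for prefix in prefixes:
        if text.startswith(prefix):
            return text[len(prefix):]
    return text


def remove_longest_match(text: str, prefixes: tuple) -> str:
    """Hypothetical rule: remove the longest prefix that matches."""
    matches = [p for p in prefixes if text.startswith(p)]
    if not matches:
        return text
    return text[len(max(matches, key=len)):]


word = "extraordinary"
prefixes = ("ex", "extra")

# startswith() only answers yes or no, so the order does not matter.
print(word.startswith(prefixes))             # True

# Removal has to pick one prefix, and the two rules disagree.
print(remove_first_match(word, prefixes))    # traordinary
print(remove_longest_match(word, prefixes))  # ordinary
```

With the first-match rule, the result also depends on the order of the tuple: passing `("extra", "ex")` instead would give `ordinary`.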

Ethan Furman agreed with D'Aprano. Victor Stinner, however, was strongly in favor of the tuple argument, though he also wanted to know what should happen when the tuple passed in contains an empty string. Under the PEP as proposed, an empty string (which matches anything) encountered while processing the tuple simply causes the original string to be returned, which leads to surprising results:

cutsuffix("Hello World", ("", " World"))    # returns "Hello World"
cutsuffix("Hello World", (" World", ""))    # returns "Hello"

The example is not especially compelling, but affixes are not necessarily hard-coded, so empty strings can slip in at unexpected places. Stinner suggested raising ValueError if an empty string is encountered, similar to str.split(). But Sweeney decided to remove the tuple-argument feature entirely in order to "allow someone with a stronger opinion about it to propose and defend a set of semantics in a separate PEP". He posted the latest version of the PEP on March 28.
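The empty-string problem, and the stricter alternative of raising ValueError, can be sketched as follows. Both functions are hypothetical and written only for this illustration; the tuple feature was dropped from the PEP, so neither corresponds to a real string method:

```python
def cut_suffix_first_match(text: str, suffixes: tuple) -> str:
    """Hypothetical first-match rule for a tuple of suffixes."""
    for suffix in suffixes:
        if text.endswith(suffix):
            # An empty suffix matches every string and removes nothing.
            return text[:len(text) - len(suffix)]
    return text


def cut_suffix_strict(text: str, suffixes: tuple) -> str:
    """Hypothetical stricter variant: reject empty suffixes up front."""
    if any(suffix == "" for suffix in suffixes):
        raise ValueError("empty suffix")
    return cut_suffix_first_match(text, suffixes)


print(cut_suffix_first_match("Hello World", ("", " World")))  # Hello World
print(cut_suffix_first_match("Hello World", (" World", "")))  # Hello

try:
    cut_suffix_strict("Hello World", (" World", ""))
except ValueError as exc:
    print("rejected:", exc)  # rejected: empty suffix
```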

On April 9, Sweeney opened a steering committee issue requesting a review of his PEP. On April 20, Stinner accepted the proposal on behalf of the committee.

It is a small change, but it was worth taking the time to make sure the interface (and its semantics) will hold up over the long term. We will see removeprefix() and removesuffix() in Python 3.9.
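A short usage example of the accepted methods, assuming Python 3.9 or later (the file name used here is made up for illustration):

```python
from collections import UserString

name = "test_parser.py"

# Each method removes the affix at most once, or returns the string unchanged.
without_prefix = name.removeprefix("test_")
without_suffix = name.removesuffix(".py")
unchanged = name.removeprefix("spam_")

# The same methods exist on bytes, bytearray and collections.UserString.
bytes_result = b"abcdef".removeprefix(b"abc")
bytearray_result = bytearray(b"abcdef").removesuffix(b"ef")
user_result = UserString("abcdef").removeprefix("abc")

print(without_prefix)    # parser.py
print(without_suffix)    # test_parser
print(unchanged)         # test_parser.py
print(bytes_result)      # b'def'
print(bytearray_result)  # bytearray(b'abcd')
print(user_result)       # def
```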

2. New parser

Not surprisingly, the steering committee has accepted the new CPython parser that we covered in mid-April. PEP 617 ("New PEG parser for CPython") was proposed by Python's founder and former benevolent dictator for life (BDFL), Guido van Rossum, along with Pablo Galindo Salgado and Lysandros Nikolaou.

It already works well and performs within 10% of the existing parser in both speed and memory usage. Since the parser is based on a parsing expression grammar (PEG), the language specification will also be simplified. CPython's existing LL(1) parser has a number of shortcomings and relies on some hacks that the new parser will eliminate.

This change paves the way for Python to move beyond an LL(1) grammar, although the existing language is not strictly LL(1) anyway. That move will not happen soon, as the plan for Python 3.9 is to keep the existing parser available through a command-line switch (-X oldparser).

But Python 3.10 will remove the existing parser, which could open the door to language changes. If such changes were made, other Python implementations (such as PyPy and MicroPython) might need to replace their LL(1) parsers to keep up with the language specification. That may give the core developers pause about making such changes.

3. More content

We looked at PEP 615 ("Support for the IANA Time Zone Database in the Standard Library") in early March. It adds a zoneinfo module to the standard library that makes it easy to get time zone information from the IANA time zone database (also known as the "Olson database") to populate time zone objects. At the time, things appeared to be going smoothly for the PEP.

At the end of March, Paul Ganssle asked for a decision on the PEP. He thought it would be fun to have it accepted within a whimsically chosen time window:

> ... I would like (for whimsical reasons) to have it accepted on Sunday, April 5th between 02:00-04:00 or 13:00-17:30 UTC, as these times represent ambiguous times in some places (mostly in Australia). There is one other opportunity, between 01:00-03:00 UTC on Sunday, April 19th, when the time is ambiguous in Western Sahara.

He recognized that this might be hard to arrange, and it certainly was not a priority. The steering committee missed the second window, but not by much; Barry Warsaw announced acceptance of the PEP on April 20.

Python will now have a mechanism for accessing the system's time zone database to create and manipulate time zones. In addition, there is a tzdata package on the Python Package Index (PyPI) that provides the IANA data for systems that lack it; it will be maintained by the Python core developers.
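As a minimal sketch of how the module might be used, the example below looks up a zone and inspects a local time that occurs twice when clocks are set back. The zone name and date are chosen for illustration (daylight saving time ended in Sydney on April 5, 2020), `ambiguous_offsets` is a hypothetical helper, and the code returns `None` on systems where no IANA data is installed:

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def ambiguous_offsets(zone_name: str, naive: datetime):
    """Return the UTC offsets for fold=0 and fold=1, or None without tz data."""
    try:
        zone = ZoneInfo(zone_name)
    except ZoneInfoNotFoundError:
        # Neither the system database nor the tzdata package is available.
        return None
    first = naive.replace(tzinfo=zone, fold=0).utcoffset()
    second = naive.replace(tzinfo=zone, fold=1).utcoffset()
    return first, second


# 02:30 local time happened twice in Sydney on this date.
offsets = ambiguous_offsets("Australia/Sydney", datetime(2020, 4, 5, 2, 30))

if offsets is None:
    print("no time zone data available")
else:
    # With IANA data installed: UTC+11 (daylight time), then UTC+10 (standard).
    print(offsets[0], offsets[1])
```

The `fold` attribute on datetime objects is what distinguishes the first and second occurrence of such a wall-clock time.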

PEP 593 ("Flexible function and variable annotations") adds a way to associate context-specific metadata with functions and variables. In practice, type hints have crowded out the other use cases envisioned in PEP 3107 ("Function Annotations"), which was implemented in Python 3.0 many years ago. PEP 593 creates a new mechanism for those use cases using the Annotated type hint.
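A brief sketch of the mechanism, assuming Python 3.9 or later. The function and the metadata string are made up for illustration; any object can be attached as metadata, and tools that do not recognize it can simply ignore it:

```python
from typing import Annotated, get_type_hints


def set_speed(value: Annotated[float, "metres per second"]) -> None:
    """Hypothetical function; the string metadata is illustrative only."""


# By default the extra metadata is stripped, leaving the plain type hint.
plain = get_type_hints(set_speed)

# With include_extras=True the Annotated wrapper and its metadata are kept.
full = get_type_hints(set_speed, include_extras=True)

print(plain["value"])              # <class 'float'>
print(full["value"].__metadata__)  # ('metres per second',)
```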

PEP 585 ("Type Hinting Generics In Standard Collections") provides another cleanup. It will allow the eventual removal of the parallel set of type aliases maintained in the typing module in favor of generic support in the standard collections themselves. For example, the typing.List type will no longer be needed to support annotations such as "dict[str, list[int]]" (i.e., a dictionary with string keys and values that are lists of integers).
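For instance, the annotation above can be written directly with the built-in types, assuming Python 3.9 or later. The function below is a made-up example that only serves to carry the annotations:

```python
def lengths_by_initial(words: list[str]) -> dict[str, list[int]]:
    """Group the lengths of non-empty words by their first letter."""
    result: dict[str, list[int]] = {}
    for word in words:
        result.setdefault(word[0], []).append(len(word))
    return result


grouped = lengths_by_initial(["pea", "python", "cat"])
print(grouped)  # {'p': [3, 6], 'c': [3]}

# Subscripting a built-in collection yields a generic alias object.
alias = dict[str, list[int]]
print(alias.__origin__ is dict)  # True
```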

Union operators for dictionary "addition" will also be part of Python 3.9. The idea has been debated off and on, but in mid-February Van Rossum recommended PEP 584 ("Add Union Operators To dict") for acceptance. The steering committee quickly agreed, and the feature was merged on February 24.
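A short example of the two operators, assuming Python 3.9 or later (the dictionary contents are invented for illustration). When a key appears in both operands, the value from the right-hand operand wins:

```python
defaults = {"host": "localhost", "port": 8000}
overrides = {"port": 9000, "debug": True}

# `|` builds a new dict and leaves both operands untouched.
merged = defaults | overrides
print(merged)    # {'host': 'localhost', 'port': 9000, 'debug': True}
print(defaults)  # {'host': 'localhost', 'port': 8000}

# `|=` updates the left-hand dict in place.
settings = dict(defaults)
settings |= {"timeout": 30}
print(settings)  # {'host': 'localhost', 'port': 8000, 'timeout': 30}
```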

The last PEP is PEP 602 ("Annual Release Cycle for Python"). As the proposal states, it changes the release cadence from every 18 months to once a year. The development and release cycles overlap, however, so a full 12 months is still available for feature development. Feature development for Python 3.10 began when the first Python 3.9 beta was released (which is now). Stay tuned for the next round of PEPs in the coming year.
