9 Tips to Improve Python Performance

Original address: 9 tips to improve Python performance

Python's performance is poor compared to languages like Java. Use these tips to identify and fix problems in your Python code to tune its performance.

Optimized apps and websites start with well-built code. However, the reality is that you don't need to worry about performance for 90% of your code, and for many scripts it may be 100%. If an ETL script runs only once, or runs nightly, it doesn't matter whether it takes a second or a minute.

However, this does matter if the user is forced to wait for a slow application to complete a task or a web page to display results. Even then, it's possible that only a small portion of the codebase is to blame.

The biggest performance gains often come from planning for performance before coding even starts, rather than after sluggish performance appears. That said, there are many ways application developers can address code performance issues.

The following nine tips are specific to Python performance, although some of them apply to other languages as well:

  1. Choose the correct data type.

  2. Learn about standard functions, methods, and libraries.

  3. Find performance-focused libraries.

  4. Understand the different comprehensions.

  5. Use generator functions, patterns, and expressions.

  6. Consider how to deal with big data.

  7. Run a profiler to identify problematic code.

  8. Consider CPython alternatives.

  9. Focus on meaningful improvements.

Choose the right data type

This tip is about the best data types to use for collections. Whenever you have a collection, it's easy to reach for a list: lists can be used in place of sets or tuples almost anywhere, and they can do more.

However, some operations are faster using sets or tuples, and both types generally use less memory than lists. To choose the best type for a collection, you must first understand the data you are working with and the operations you want to perform.

Using timeit we can see that testing membership with a set is much faster than testing with a list:

> python testtimeit_data_type.py

Execution time (with list): 7.966896300087683 seconds

Execution time (with set): 4.913181399926543 seconds
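
The script behind those numbers isn't shown here, but a minimal sketch of such a test might look like this (the data size, lookup value, and iteration count are illustrative, not the original script):

import timeit

data_list = list(range(1_000_000))
data_set = set(data_list)

# Worst case for the list: the value being looked up is at the end
list_time = timeit.timeit(lambda: 999_999 in data_list, number=1_000)
set_time = timeit.timeit(lambda: 999_999 in data_set, number=1_000)

print(f"Execution time (with list): {list_time} seconds")
print(f"Execution time (with set): {set_time} seconds")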

Sometimes it's faster to create a temporary set or tuple from a list. For example, to find the common values in two lists, it might be faster to create two sets and use set intersection(). Results depend on the data length and the operations, so it's best to test with the data and operations you expect.
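
For example, assuming two small lists, the set-based intersection looks like this (the values are made up):

list_a = [1, 2, 3, 4, 5]
list_b = [4, 5, 6, 7]

# Build temporary sets just long enough to compute the intersection
common_values = set(list_a).intersection(list_b)
print(common_values)  # {4, 5}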

Learn about standard functions, methods, and libraries

Get to know Python's standard functions, methods, and libraries. The standard modules are optimized and will almost always be faster than the code you write yourself. Many Python functions are rarely needed, and it's easy to forget they exist. You may remember set intersection(), but if you need them, will you remember difference() or isdisjoint()?

Browse the Python documentation (https://docs.python.org/3/library) occasionally, or when you run into a performance question, to find out what features are available. When you start a new project, carefully read the sections that are relevant to your own work.

If you only have time to learn one module, make it itertools, and if you think you could use its functionality, consider also installing more-itertools (https://github.com/more-itertools/more-itertools), which is not in the standard library. Some features of itertools may not seem useful at first, but don't be surprised if, while working on something, you remember something you saw in itertools that can help. Even if you don't see a use right away, it's good to know what's available.
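
As a small illustration of the kind of thing itertools enables, the sketch below lazily chains two iterables and takes only the first few items without ever building an intermediate list:

import itertools

# chain() and islice() are lazy, so no intermediate list is created
first_five = list(itertools.islice(itertools.chain(range(3), range(100)), 5))
print(first_five)  # [0, 1, 2, 0, 1]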

Find performance-focused libraries

If you're doing something big, chances are someone has already created a well-performing library to help you. You may need multiple libraries to provide functionality for different areas of your project, which may include some or all of the following:

  • Scientific Computing.

  • Vision.

  • Machine learning.

  • Reporting.

  • Integration.

Pay attention to release dates, documentation, support, and community. Older libraries may no longer perform as well as they once did, and you may need help from someone familiar with a given library to achieve the performance you want.

Multiple libraries often provide the same feature set. To determine which one to choose, create a quick test for each with data that suits your needs. You may find that one is easier to use, or that the other provides better performance out of the box.

Pandas

Pandas is a commonly used data analysis library and the one most beginners start with. It's worth learning for two reasons: other libraries use it or provide compatible interfaces, and it's the library you'll find the most help and examples for.

Polars

Alternatives to pandas, such as Polars, provide better performance for many operations. Polars has a different syntax that may involve a learning curve, but it's worth a look when starting a new big data project or troubleshooting performance issues in an existing one.
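
As a rough sketch of what Polars code looks like (the file and column names here are made up, and method names have shifted a bit between Polars releases):

import polars as pl

# Eager, pandas-like usage
df = pl.read_csv("./sales.csv")
high_value = df.filter(pl.col("amount") > 100)

# Lazy API: Polars can optimize the whole query before executing it
high_value_lazy = pl.scan_csv("./sales.csv").filter(pl.col("amount") > 100).collect()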

Dask

If you frequently work with very large data, you should also consider parallel computing. This may require changes in how the data is organized and the calculations are performed. It also takes a lot of effort to program correctly for multiple cores or multiple machines, an area where Python lags behind other languages.

Dask (https://docs.dask.org/) and other libraries, such as Polars, can handle that complexity for you and make full use of your cores or machines. If you're familiar with NumPy or pandas, Dask offers a minimal learning curve for parallelizing across cores. Even if you don't use Dask, it's helpful to understand what features it offers and how to use them, to prepare you for working with big data.
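
To illustrate the minimal-learning-curve point, a pandas-style Dask workflow might look roughly like this (the file pattern and column names are hypothetical):

import dask.dataframe as dd

# Reads lazily; the glob pattern can cover many files that together exceed RAM
df = dd.read_csv("./sales-*.csv")

# Nothing runs until .compute() is called; the work is then spread across cores
total_by_zip = df.groupby("zip_code")["amount"].sum().compute()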

Understand the different comprehensions

This is a common Python performance tip: list comprehensions are faster than for loops.

My testing produced these times, which are impressive.

> python timeit_comprehension.py

Execution time (using for): 5.180072800023481 seconds

Execution time (using list comprehension): 2.5427665999159217 seconds

But all it does is create a new list with values computed from the original values, since the tip usually shows something like the following:

new_list = [val*2 for val in orig_list]
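
For comparison, the for loop version being timed above would be something like this (orig_list is whatever list is being transformed):

new_list = []
for val in orig_list:
    new_list.append(val * 2)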

The relevant question is: what will I do with this list? Any real performance gains from comprehensions are likely to come from using the optional predicate (filter), nested comprehensions, or generator expressions instead of list comprehensions.

This example uses nested comprehensions to flatten a matrix, which generally performs better than nested loops:

flattened = [x for row in matrix for x in row]

This one uses nested comprehensions and filters:

names = [
    employee.name
    for manager in managers
    for employee in employees
    if employee.manager_id == manager.id
]

List comprehensions get mentioned most often, but set comprehensions, dictionary comprehensions, and generator expressions work the same way. Choose the right one for the type of data you want to create.

The following example is similar to the matrix example above, but returns a set instead of a list, which is a simple way to get the unique values from a list:

unique = {x for row in matrix for x in row}

The only difference between list comprehension and set comprehension is that set comprehension uses curly brackets {} instead of square brackets [].

Use generator functions, patterns, and expressions

Generators are a great way to reduce memory use when iterating over large collections. If you work with large collections, you should know how to write generator functions and use the generator pattern with iterables.
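
A minimal sketch of a generator function, assuming a hypothetical comma-separated data file, might look like this:

def read_records(path):
    # Yield one parsed row at a time instead of loading the whole file into memory
    with open(path) as infile:
        for line in infile:
            yield line.rstrip("\n").split(",")

# Only one record is held in memory at any moment
for record in read_records("./data.csv"):
    print(record)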

Also, learn how to use generator expressions, which are similar to list comprehensions. The following code sums the squares of all values in a matrix without creating a list of the squares:

total = sum(x**2 for row in matrix for x in row)

Consider what to do with big data

The performance of big data and large files deserves a complete and separate discussion. That said, there are some things to consider before starting a big data project. Be prepared to make the following decisions:

  • Select a library.

  • Process data in chunks.

  • Determine whether certain data can be ignored.

  • Specify the data type.

  • Use different file types.

Select a library

For very large data, you will almost certainly use specialized libraries such as NumPy for scientific computing, pandas for data analysis, or Dask for parallelization and distributed computing. Search the internet and you'll likely find libraries for any other big data need, as well as alternatives for those needs.

Chunk data

If your data is too large to fit into RAM, you may need to chunk it. This capability is built into Pandas as follows:

import pandas
from pprint import pprint

chunk_readers = pandas.read_csv("./data.csv", chunksize=2000)
for chunk in chunk_readers:
    for index, record in chunk.iterrows():
        pprint(record)

Other modules, such as Dask and Polars, have their own methods of chunking or partitioning data.

Ignore data

There is usually much more data in the data file than you need. Pandas' read_csv has a parameter usecols which allows you to specify the required columns and ignore the rest:

# Keep only the named columns
data_frame = pandas.read_csv("./sales.csv", usecols=["product_id", "zip_code"])

# Keep only the columns at indexes 1, 8, and 20
data_frame = pandas.read_csv("./population.csv", usecols=[1, 8, 20])

This can significantly reduce the memory required to process the data. The .csv file may be too large for RAM, but if you only load the columns you need, you may avoid chunking.

Comprehensions are another way to remove columns and rows so that you only work with the data you need. To do this, the entire file must be read or chunked, and both the original container and the comprehension's container must exist at the same time. If you need to ignore a large number of rows, it's better to drop them while iterating over the chunks.
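
Building on the chunking example above, this is a sketch of keeping only the rows of interest while iterating over chunks (the column name and value are hypothetical):

import pandas

rows_of_interest = [
    record
    for chunk in pandas.read_csv("./sales.csv", chunksize=2000)
    for _, record in chunk.iterrows()
    if record["zip_code"] == "90210"
]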

Specify data type

Another way to save memory is to specify the data type once and to use the smallest type required. This also improves speed, because it keeps the data in the format that is fastest for calculations and eliminates the need to convert the data each time it is used. For example, a person's age fits in eight bits (0-255), so we can tell pandas to use int8 instead of int64 for that column:

data_frame = pandas.read_csv("./people.csv", dtype={"age": "int8"})

It's usually best to set the data type during load, but sometimes that isn't possible - for example, pandas can't convert a non-integer float (e.g. 18.5) to int8. In that case, you can convert the entire column after loading the data frame. Pandas has several ways to replace or modify columns and handle errors in the data:

data_frame['age'] = data_frame['age'].astype('int8', errors='ignore')

DataFrame.astype and pandas.to_numeric can perform different kinds of conversions. If these don't work for your data, you may need to implement your own conversion in a loop or comprehension.
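
For example, pandas.to_numeric can coerce bad values and downcast a column in one step; this is just a sketch of one option:

# errors="coerce" turns unparseable values into NaN instead of raising an error;
# downcast="integer" picks the smallest integer type that fits (if no NaN remains)
data_frame["age"] = pandas.to_numeric(data_frame["age"], errors="coerce", downcast="integer")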

Use different file types

Most situations require you to use a common file type, such as .csv. If you need to save intermediate files during processing, using another format can be helpful.

Apache Arrow provides Python bindings through the PyArrow module. It integrates with NumPy, pandas, and Python objects, and provides methods for reading and writing datasets in additional file formats. These formats are smaller, and PyArrow can read and write them faster than Python's .csv functions. PyArrow also has extra data analysis features that you may not need now, but at least you'll know they are available if you ever do.

As mentioned earlier, pandas is a popular data analysis library. Polars supports multiple file formats, multiple cores, and data files larger than memory. Dask partitioning handles the chunking of larger-than-memory data files described above, and Dask best practices recommend more efficient file formats such as Parquet.
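
As a minimal sketch of switching intermediate files to Parquet with pandas (the file names are hypothetical; pandas uses PyArrow under the hood when it is installed):

import pandas

data_frame = pandas.read_csv("./sales.csv")

# Parquet files are smaller and faster to re-read than .csv for intermediate results
data_frame.to_parquet("./sales.parquet")
data_frame = pandas.read_parquet("./sales.parquet")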

Run a profiler to identify problematic code

When you encounter performance problems, it's better to profile your code than to guess where to focus your optimization efforts. A general approach to improving performance includes the following steps:

  1. Create a minimal, reproducible use case that is slower than expected.

  2. Run the profiler, as sketched below.

  3. Improve the function with the highest percall value.

  4. Repeat step 2 until the desired performance level is achieved.

  5. Run a real use case (not minimal) without profiling.

  6. Repeat step 1 until the desired performance level in step 5 is achieved.
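
For step 2, a minimal way to drive the built-in cProfile and pstats modules over a use case might look like this (main() and the stats file name are hypothetical):

import cProfile
import pstats

# main() is whatever drives your minimal, reproducible use case
cProfile.run("main()", "profile_stats")

# Show the 10 most expensive calls; the output includes the percall columns
stats = pstats.Stats("profile_stats")
stats.sort_stats("cumulative").print_stats(10)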

With experience, you'll come to know, or at least get a feel for, where best to focus. It may not be the code with the highest percall value. Sometimes the fix is to change a single calculation inside a loop. Other times you may need to eliminate a loop, or move a computation or comprehension outside of it. Maybe the answer is to reorganize the data, or to use Dask for parallelization.

Consider CPython alternatives

The most commonly used Python implementation is CPython. All common libraries can be used with CPython, and it fully implements the language specification.

Other implementations exist, but may cause the following problems:

  • Python code may behave differently in edge cases.

  • C API modules may not work or work significantly slower.

  • If you must run CPython in the future, the additional functionality will not be available.

Despite these caveats, there are situations where alternative implementations may be best. Performance is almost never better if the computation relies heavily on C API modules like NumPy. Consider alternatives to CPython such as:

  • Jython runs in the JVM and allows easy integration with Java classes and objects.

  • IronPython is designed for .NET and easily integrates with the .NET ecosystem and C# libraries.

  • PyPy uses a just-in-time (JIT) compiler and runs significantly faster than CPython unless the C API is involved.

If your needs for general-purpose modules (especially C API modules) are very limited, these and other CPython alternatives are worth considering. If your application has a heavy dependency on existing Java or .NET functionality, Jython and IronPython are good choices.

Focus on meaningful improvements

There are other commonly suggested tips and tricks for improving Python performance, such as:

  • Avoid dot notation, such as math.sqrt() or myObj.foo().

  • Use string operations such as the join() method (see the sketch after this list).

  • Use multiple assignments, such as a,b = 1,2.

  • Use the @functools.lru_cache decorator.
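
For example, a quick timeit comparison of the join() tip above against naive string concatenation might look like this (the list size and repeat count are arbitrary):

import timeit

words = ["python"] * 1000

def concat_in_loop():
    out = ""
    for word in words:
        out += word
    return out

def use_join():
    return "".join(words)

print(timeit.timeit(concat_in_loop, number=1_000))
print(timeit.timeit(use_join, number=1_000))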

If you create timeit tests, you will see huge improvements when comparing the ineffective practices to the ideal ones. (By the way, you should learn how to use timeit.) However, these tricks will only have a noticeable impact on your program if both of the following conditions are true:

  1. The code was written the "wrong" way to begin with, for example inside a large loop or comprehension.

  2. Those operations account for a significant share of each iteration's time.

If each iteration of the loop takes a full second, you won't notice the tiny savings from combining two assignments into one statement.

Consider where your Python optimization efforts matter most. You'll probably get more value from focusing on readability and consistency than on performance. If your project standards call for combining assignments or eliminating dots, go for it. Otherwise, stick with the standard approach until a performance test (timeit or a profiler) reveals a problem.
