How to speed up pandas DataFrame operations

 

The pandas DataFrame in Python is really easy to use, but by default it only computes on a single core.

With pandas, when you run the following line:

# Standard apply

df.apply(func)

you get CPU usage like this: even if the computer has multiple CPUs, only one of them is fully dedicated to the calculation.

Recently, on a recommendation from friends in a group chat, I went looking for an accelerator, and it is really awesome!!! You can truly experience the thrill of a single machine running Python with all eight cores fully loaded!!

The idea behind [Pandaral·lel] is to distribute pandas computations across all available CPUs on a computer, which can significantly improve speed.
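To make the idea concrete, here is a minimal sketch of "split the DataFrame, apply on each piece in a separate worker, then stitch the results back together". It is not Pandaral·lel's actual implementation, only an illustration built on the standard library's concurrent.futures. The names chunked_apply and _apply_chunk and the use_threads flag are hypothetical and made up for this example.

```python
import os
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

import pandas as pd


def _apply_chunk(args):
    # Runs inside a worker: a row-wise apply on one slice of the frame.
    chunk, func = args
    return chunk.apply(func, axis=1)


def chunked_apply(df, func, n_workers=None, use_threads=False):
    """Row-wise apply, spread over several workers (illustrative only).

    With the default process pool, func must be picklable, i.e. a
    function defined at module level rather than a lambda.
    use_threads=True swaps in a thread pool, which is handy for
    debugging but does not bypass the GIL for pure-Python functions.
    """
    if df.empty:
        return df.apply(func, axis=1)

    n_workers = n_workers or os.cpu_count() or 1
    n_chunks = min(n_workers, len(df))
    size = -(-len(df) // n_chunks)  # ceiling division
    chunks = [df.iloc[i:i + size] for i in range(0, len(df), size)]

    pool_cls = ThreadPoolExecutor if use_threads else ProcessPoolExecutor
    with pool_cls(max_workers=n_workers) as pool:
        # pool.map keeps the input order, so concat restores row order.
        parts = list(pool.map(_apply_chunk, [(c, func) for c in chunks]))
    return pd.concat(parts)
```

When using the process pool from a script, call chunked_apply under an `if __name__ == "__main__":` guard so that worker processes can import the module safely.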

Installation:

$ pip install pandarallel [--user]

Import and initialization:

Import:

from pandarallel import pandarallel

Initialization:

pandarallel.initialize()

Usage is very simple:

For a pandas DataFrame df and a function func to apply, simply replace the classic apply with parallel_apply.

# Standard pandas apply

df.apply(func)

# Parallel apply

df.parallel_apply(func)
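Putting the steps together, a small end-to-end sketch might look like the following. The column names a and b, the function row_norm, and the helper parallel_row_apply are hypothetical names chosen for illustration. The fallback to the plain apply when pandarallel is not installed is a convenience added here, not part of the library, and nb_workers=2 is an arbitrary choice.

```python
import math

import pandas as pd


def row_norm(row):
    # Example per-row function: Euclidean norm of columns "a" and "b".
    return math.sqrt(row["a"] ** 2 + row["b"] ** 2)


def parallel_row_apply(df, func):
    """Apply func to each row, in parallel when pandarallel is available."""
    try:
        from pandarallel import pandarallel
    except ImportError:
        # Assumption for this sketch: degrade to the single-core apply.
        return df.apply(func, axis=1)
    # initialize() attaches parallel_apply to DataFrame objects.
    pandarallel.initialize(nb_workers=2, progress_bar=False)
    return df.parallel_apply(func, axis=1)


df = pd.DataFrame({"a": [3.0, 6.0, 5.0, 8.0], "b": [4.0, 8.0, 12.0, 15.0]})

# Standard pandas apply (single core)
serial = df.apply(row_norm, axis=1)
```

Calling parallel_row_apply(df, row_norm) should then return the same values as serial. On a frame this small the cost of starting workers will likely outweigh any gain, so the benefit only shows up with many rows or an expensive func.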

 

 

 

 

 

 

 


Origin www.cnblogs.com/wqbin/p/12589635.html