Pandas | 15 window functions

For working with numerical data, Pandas provides several kinds of window statistics: rolling windows, expanding windows, and exponentially weighted moving windows. The available statistics include sum, mean, median, variance, covariance and correlation. This chapter discusses how to apply these methods to DataFrame objects.

.rolling() function

This function can be applied to a series of data. Specify the window=n parameter, then apply the appropriate statistical function to the result.

import pandas as pd
import numpy as np

# 10 rows of random values, indexed by consecutive dates
df = pd.DataFrame(np.random.randn(10, 4),
      index = pd.date_range('1/1/2020', periods=10),
      columns = ['A', 'B', 'C', 'D'])

print(df)
print('\n')

# Mean over a sliding window of 3 rows
print(df.rolling(window=3).mean())

Output:

                   A         B         C         D
2020-01-01  0.517788  0.524324  0.723912 -0.316153
2020-01-02  0.553257 -0.489424 -0.942906 -0.002625
2020-01-03 -0.113628  0.291778  1.192297 -1.216583
2020-01-04 -1.006843  0.549378  2.526383 -0.209177
2020-01-05  0.646680  0.249695  0.502700 -0.420748
2020-01-06 -0.323045 -0.962962  0.035932 -1.342486
2020-01-07 -1.209534  0.138791  0.756402  0.229242
2020-01-08 -0.473912 -1.734865  0.269594 -0.293566
2020-01-09  2.144167  0.508603  0.076023 -0.246540
2020-01-10 -0.199808  0.887562  0.196244  0.831584


                   A         B         C         D
2020-01-01       NaN       NaN       NaN       NaN
2020-01-02       NaN       NaN       NaN       NaN
2020-01-03  0.319139  0.108892  0.291173 -0.478526
2020-01-04  0.329669  0.776245  0.326831 -1.055444
2020-01-05  0.360810  1.022618  0.495273 -0.881391
2020-01-06 -0.463887  0.291004  0.604372 -0.349655
2020-01-07 -0.295300 -0.191492  0.123862 -0.203515
2020-01-08 -0.668830 -0.853012  0.353976 -0.468937
2020-01-09 -0.362490  0.153574  0.367340 -0.103622
2020-01-10 -0.112900  0.490149  0.180621  0.097159

Note - Because the window size is 3 (window=3), the first two rows are NaN, and from the third row onward each value is the average of elements n, n-1 and n-2. Any of the statistical functions mentioned above can be applied in the same way.
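As a rough sketch of how other statistics can be applied to the same kind of rolling window, the example below uses a small hand-made frame (the values are illustrative assumptions, not taken from the output above) so that the results can be checked by hand.

```python
import pandas as pd

# Small deterministic frame; the values are made up for illustration.
df = pd.DataFrame(
    {"A": [1.0, 2.0, 3.0, 4.0, 5.0], "B": [2.0, 4.0, 6.0, 8.0, 11.0]},
    index=pd.date_range("2020-01-01", periods=5),
)

roll = df.rolling(window=3)

rolling_mean = roll.mean()      # first two rows are NaN
rolling_sum = roll.sum()
rolling_median = roll.median()
rolling_var = roll.var()        # sample variance (ddof=1 by default)

# Pairwise statistics take the other series as an argument.
rolling_corr = df["A"].rolling(window=3).corr(df["B"])

# The third rolling mean is the plain mean of the first three rows.
manual_third = df["A"].iloc[0:3].mean()

print(rolling_mean)
print(rolling_sum)
print(rolling_corr)
```

The rolling object is created once and reused, so each statistic is computed over exactly the same window definition.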

.expanding() function

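This function can be applied to a series of data. Specify the min_periods=n parameter, then apply the appropriate statistical function to the result. Unlike a rolling window, an expanding window always starts at the first row and grows to include every row up to the current one.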

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(10, 4),
      index = pd.date_range('1/1/2018', periods=10),
      columns = ['A', 'B', 'C', 'D'])

print(df)
print('\n')

print(df.expanding(min_periods=3).mean())

Output:

                   A         B         C         D
2018-01-01 -0.440860  0.246692  0.511610 -0.241488
2018-01-02 -0.287958  1.554392 -0.870998 -0.141933
2018-01-03 -0.219975 -0.217251  3.032686 -0.800669
2018-01-04 -0.297885  0.336629 -0.313112 -0.633826
2018-01-05 -0.226151 -0.266663  0.988562 -0.424164
2018-01-06 -0.641176 -2.556270  1.907479  0.779536
2018-01-07 -0.333231  0.022907  1.784900  1.075321
2018-01-08 -1.045178  0.295636  0.127447 -1.417171
2018-01-09  1.048741  0.841395  0.104583  1.015302
2018-01-10 -0.209738  0.333223 -1.279857 -0.380164


                   A         B         C         D
2018-01-01       NaN       NaN       NaN       NaN
2018-01-02       NaN       NaN       NaN       NaN
2018-01-03 -0.087081  0.616250  0.573609 -0.394697
2018-01-04 -0.139782  0.546345  0.351929 -0.454479
2018-01-05 -0.157056  0.383744  0.479256 -0.448416
2018-01-06 -0.237742 -0.106259  0.717293 -0.243757
2018-01-07 -0.200507 -0.138683  0.869808 -0.055318
2018-01-08 -0.306091 -0.084393  0.777013 -0.225549
2018-01-09 -0.155554  0.018472  0.702299 -0.087677
2018-01-10 -0.160972  0.049947  0.504083 -0.116926
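To make the effect of min_periods easier to see, here is a minimal sketch on a short hand-made series (the values are illustrative assumptions). It compares the expanding mean with a running mean computed by hand as the cumulative sum divided by the number of rows seen so far.

```python
import pandas as pd

# Short deterministic series; the values are made up for illustration.
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# Expanding mean: each row uses all rows from the start up to itself.
# Rows with fewer than min_periods observations are reported as NaN.
exp_mean = s.expanding(min_periods=3).mean()

# Hand computation of the same running mean (without the NaN masking).
counts = pd.Series(range(1, len(s) + 1), index=s.index)
manual = s.cumsum() / counts

print(exp_mean)
print(manual)
```

From the third row onward the two series agree; the only difference is that min_periods=3 hides the first two rows.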

.ewm() function

ewm() can be applied to a series of data. Specify one of the com, span or halflife parameters, then apply the appropriate statistical function to the result. It assigns weights that decay exponentially, so recent observations count more than older ones.

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(10, 4),
   index = pd.date_range('1/1/2019', periods=10),
   columns = ['A', 'B', 'C', 'D'])

print(df)
print('\n')

print(df.ewm(com=0.5).mean())

Output:

                   A         B         C         D
2019-01-01  1.204552 -0.936226  0.629811 -0.424075
2019-01-02  0.593300 -0.356715  0.313949  0.547324
2019-01-03  0.545719 -1.061298  0.578605 -0.290907
2019-01-04 -1.146018  1.585733  0.520032 -0.705019
2019-01-05 -0.773724  0.907562  0.948446 -0.427746
2019-01-06 -0.033501 -1.787833 -1.978037  0.304845
2019-01-07  0.689540 -0.457179  1.584107  1.932602
2019-01-08  1.052232  0.135262  0.246501  0.698567
2019-01-09  0.124396 -1.289378  0.279960 -0.896865
2019-01-10 -1.083088  0.399733  0.903997 -0.738203


                   A         B         C         D
2019-01-01  1.204552 -0.936226  0.629811 -0.424075
2019-01-02  0.746113 -0.501593  0.392915  0.304474
2019-01-03  0.607378 -0.889081  0.521470 -0.107713
2019-01-04 -0.576164  0.781418  0.520499 -0.510895
2019-01-05 -0.708415  0.865861  0.806976 -0.455233
2019-01-06 -0.257855 -0.905698 -1.052250  0.052182
2019-01-07  0.374031 -0.606548  0.706125  1.306369
2019-01-08  0.826234 -0.111933  0.399662  0.901106
2019-01-09  0.358318 -0.896936  0.319857 -0.297602
2019-01-10 -0.602636 -0.032475  0.709290 -0.591341
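The following sketch shows where the second row of the result comes from, using the first two values of column A from the output above. It assumes the default adjust=True setting, under which com is converted to a smoothing factor alpha = 1 / (1 + com) and each result is a weighted average with weights that shrink by a factor of (1 - alpha) per step back in time.

```python
import pandas as pd

# First two values of column A from the output above.
s = pd.Series([1.204552, 0.593300])

com = 0.5
alpha = 1 / (1 + com)          # com=0.5 corresponds to alpha = 2/3

ewm_mean = s.ewm(com=com).mean()

# With adjust=True (the default), the newest value has weight 1 and the
# previous value has weight (1 - alpha); the sum is divided by the total weight.
w_old = 1 - alpha
manual = (w_old * s.iloc[0] + s.iloc[1]) / (w_old + 1)

print(ewm_mean.iloc[1], manual)

# span is another way to specify the decay: alpha = 2 / (span + 1),
# so span=2 describes the same weighting as com=0.5.
same_as_span = s.ewm(span=2).mean()
print(same_as_span.iloc[1])
```

The hand-computed value rounds to 0.746113, which matches the 2019-01-02 entry for column A in the result above.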

 

 

Window functions are mainly used to reveal the trend in data by smoothing the curve. If daily data varies a lot and many data points are available, one approach is to resample the data and plot it; another is to compute a window statistic and plot the result. Either method produces a smoother curve that makes the trend easier to see.
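The two approaches can be compared with a small sketch on synthetic data. Everything here is an assumption made for illustration: the linear trend, the noise level, the weekly resampling frequency and the 14-day window.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

# Synthetic daily data: a slow upward trend plus noise.
idx = pd.date_range("2020-01-01", periods=120, freq="D")
trend = np.linspace(0.0, 10.0, len(idx))
daily = pd.Series(trend + rng.normal(scale=2.0, size=len(idx)), index=idx)

# Method 1: resample to a coarser frequency (weekly means).
weekly = daily.resample("W").mean()

# Method 2: keep the daily index but average over a 14-day rolling window.
smooth = daily.rolling(window=14).mean()

# Day-to-day changes are much smaller after smoothing.
print(daily.diff().std(), smooth.diff().std())
print(len(daily), len(weekly))
```

Resampling reduces the number of points, while the rolling mean keeps one value per day. Calling .plot() on either result draws the smoothed curve if matplotlib is installed.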




Origin www.cnblogs.com/Summer-skr--blog/p/11704746.html