Pandas | 15 window functions

For working with numerical data, Pandas provides several kinds of window statistics: rolling windows, expanding windows, and exponentially weighted moving windows. The statistics that can be computed over these windows include the sum, mean, median, variance, covariance, and correlation. This chapter discusses how to apply these methods to DataFrame objects.

.rolling() function

This function can be applied to a series of data. Specify the window=n parameter, then apply the appropriate statistical function to the result.

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(10, 4),
      index = pd.date_range('1/1/2020', periods=10),
      columns = ['A', 'B', 'C', 'D'])

print(df)
print('\n')

# Mean over a sliding window of 3 rows
print(df.rolling(window=3).mean())

Output:

                   A         B         C         D
2020-01-01  0.517788  0.524324  0.723912 -0.316153
2020-01-02  0.553257 -0.489424 -0.942906 -0.002625
2020-01-03 -0.113628  0.291778  1.192297 -1.216583
2020-01-04 -1.006843  0.549378  2.526383 -0.209177
2020-01-05  0.646680  0.249695  0.502700 -0.420748
2020-01-06 -0.323045 -0.962962  0.035932 -1.342486
2020-01-07 -1.209534  0.138791  0.756402  0.229242
2020-01-08 -0.473912 -1.734865  0.269594 -0.293566
2020-01-09  2.144167  0.508603  0.076023 -0.246540
2020-01-10 -0.199808  0.887562  0.196244  0.831584


                   A         B         C         D
2020-01-01       NaN       NaN       NaN       NaN
2020-01-02       NaN       NaN       NaN       NaN
2020-01-03  0.319139  0.108892  0.291173 -0.478526
2020-01-04  0.329669  0.776245  0.326831 -1.055444
2020-01-05  0.360810  1.022618  0.495273 -0.881391
2020-01-06 -0.463887  0.291004  0.604372 -0.349655
2020-01-07 -0.295300 -0.191492  0.123862 -0.203515
2020-01-08 -0.668830 -0.853012  0.353976 -0.468937
2020-01-09 -0.362490  0.153574  0.367340 -0.103622
2020-01-10 -0.112900  0.490149  0.180621  0.097159

Note - Because the window size is 3 (window=3), the first two rows are NaN, and from the third row onward each value is the average of elements n, n-1, and n-2. Any of the statistical functions mentioned above can be applied in the same way.
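To make the window arithmetic in the note easy to check by hand, here is a minimal sketch on a small fixed series instead of random data. The series values and variable names are illustrative assumptions, and the min_periods argument to rolling() is not used in the example above; it is shown only as one way to avoid the leading NaN values.

```python
import pandas as pd

# Small fixed series (illustrative values) so results can be verified by hand.
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# Three different statistics over the same 3-element sliding window.
rolling_mean = s.rolling(window=3).mean()
rolling_sum = s.rolling(window=3).sum()
rolling_median = s.rolling(window=3).median()

# The first two positions have fewer than 3 observations, so they are NaN.
print(rolling_mean)

# The value at position 2 is the average of positions 0, 1 and 2.
manual = (s.iloc[0] + s.iloc[1] + s.iloc[2]) / 3
print(manual, rolling_mean.iloc[2])

# Assumption for illustration: min_periods=1 lets partial windows produce
# a value instead of NaN.
partial_mean = s.rolling(window=3, min_periods=1).mean()
print(partial_mean)
```

With min_periods=1, the first two results are computed from one and two observations respectively, so they are less smoothed than the later values.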

.expanding() function

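This function can be applied to a series of data. Specify the min_periods=n parameter, then apply the appropriate statistical function. Unlike a rolling window, an expanding window starts at the first row and grows to include every row up to the current one.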

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(10, 4),
      index = pd.date_range('1/1/2018', periods=10),
      columns = ['A', 'B', 'C', 'D'])

print(df)
print('\n')

# Mean of all rows seen so far; NaN until at least 3 rows are available
print(df.expanding(min_periods=3).mean())

Output:

                   A         B         C         D
2018-01-01 -0.440860  0.246692  0.511610 -0.241488
2018-01-02 -0.287958  1.554392 -0.870998 -0.141933
2018-01-03 -0.219975 -0.217251  3.032686 -0.800669
2018-01-04 -0.297885  0.336629 -0.313112 -0.633826
2018-01-05 -0.226151 -0.266663  0.988562 -0.424164
2018-01-06 -0.641176 -2.556270  1.907479  0.779536
2018-01-07 -0.333231  0.022907  1.784900  1.075321
2018-01-08 -1.045178  0.295636  0.127447 -1.417171
2018-01-09  1.048741  0.841395  0.104583  1.015302
2018-01-10 -0.209738  0.333223 -1.279857 -0.380164


                   A         B         C         D
2018-01-01       NaN       NaN       NaN       NaN
2018-01-02       NaN       NaN       NaN       NaN
2018-01-03 -0.087081  0.616250  0.573609 -0.394697
2018-01-04 -0.139782  0.546345  0.351929 -0.454479
2018-01-05 -0.157056  0.383744  0.479256 -0.448416
2018-01-06 -0.237742 -0.106259  0.717293 -0.243757
2018-01-07 -0.200507 -0.138683  0.869808 -0.055318
2018-01-08 -0.306091 -0.084393  0.777013 -0.225549
2018-01-09 -0.155554  0.018472  0.702299 -0.087677
2018-01-10 -0.160972  0.049947  0.504083 -0.116926
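As a rough check on what expanding() computes, the sketch below compares the expanding mean with a hand-built running average (cumulative sum divided by the number of rows seen so far). The series values and variable names are illustrative assumptions, not part of the example above.

```python
import pandas as pd

# Small fixed series (illustrative values).
s = pd.Series([2.0, 4.0, 6.0, 8.0])

# Expanding mean: NaN until at least 3 observations are available.
expanding_mean = s.expanding(min_periods=3).mean()

# Hand computation: cumulative sum divided by the running row count.
counts = pd.Series(range(1, len(s) + 1), index=s.index)
manual = s.cumsum() / counts

print(expanding_mean)
print(manual)
```

From the third row onward the two results agree; the only difference is that min_periods=3 masks the first two rows with NaN.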

.ewm() function

ewm() can be applied to a series of data. Specify one of the com, span, or halflife parameters, then apply the appropriate statistical function. It assigns weights that decay exponentially, so recent observations count more than older ones.

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(10, 4),
   index = pd.date_range('1/1/2019', periods=10),
   columns = ['A', 'B', 'C', 'D'])

print(df)
print('\n')

# Exponentially weighted mean; com is the center of mass of the weights
print(df.ewm(com=0.5).mean())

Output:

                   A         B         C         D
2019-01-01  1.204552 -0.936226  0.629811 -0.424075
2019-01-02  0.593300 -0.356715  0.313949  0.547324
2019-01-03  0.545719 -1.061298  0.578605 -0.290907
2019-01-04 -1.146018  1.585733  0.520032 -0.705019
2019-01-05 -0.773724  0.907562  0.948446 -0.427746
2019-01-06 -0.033501 -1.787833 -1.978037  0.304845
2019-01-07  0.689540 -0.457179  1.584107  1.932602
2019-01-08  1.052232  0.135262  0.246501  0.698567
2019-01-09  0.124396 -1.289378  0.279960 -0.896865
2019-01-10 -1.083088  0.399733  0.903997 -0.738203


                   A         B         C         D
2019-01-01  1.204552 -0.936226  0.629811 -0.424075
2019-01-02  0.746113 -0.501593  0.392915  0.304474
2019-01-03  0.607378 -0.889081  0.521470 -0.107713
2019-01-04 -0.576164  0.781418  0.520499 -0.510895
2019-01-05 -0.708415  0.865861  0.806976 -0.455233
2019-01-06 -0.257855 -0.905698 -1.052250  0.052182
2019-01-07  0.374031 -0.606548  0.706125  1.306369
2019-01-08  0.826234 -0.111933  0.399662  0.901106
2019-01-09  0.358318 -0.896936  0.319857 -0.297602
2019-01-10 -0.602636 -0.032475  0.709290 -0.591341
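The weighting can be reproduced by hand. With com specified, pandas uses the smoothing factor alpha = 1 / (1 + com), and with the default adjust=True each result is a weighted average in which the observation i steps back gets weight (1 - alpha)^i. For com=0.5 this gives alpha = 2/3, so the second value of column A above is (0.593300 + 1.204552 / 3) / (1 + 1/3), which is about 0.746113. The sketch below repeats the calculation on a small fixed series; the values and variable names are illustrative assumptions.

```python
import pandas as pd

# Small fixed series (illustrative values).
s = pd.Series([1.0, 2.0, 3.0])

com = 0.5
alpha = 1 / (1 + com)  # smoothing factor implied by com

# Exponentially weighted mean with the default adjust=True.
ewm_mean = s.ewm(com=com).mean()

# Hand computation for the last element: the newest value gets weight 1,
# the one before it (1 - alpha), the one before that (1 - alpha) ** 2.
weights = [(1 - alpha) ** i for i in range(len(s))]
values_newest_first = s.iloc[::-1].tolist()
manual_last = sum(w * v for w, v in zip(weights, values_newest_first)) / sum(weights)

print(ewm_mean.iloc[-1], manual_last)
```

A smaller com gives a larger alpha, so the weights fall off faster and the result tracks the most recent values more closely.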

Window functions are mainly used to reveal the trend in data as a smooth curve. When daily data fluctuates a lot and many data points are available, one approach is to sample the data and plot the samples; another is to apply a window calculation and plot the result. Either approach yields a smooth curve or trend.
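The smoothing effect can be illustrated with a sketch like the following, which builds a noisy upward trend and compares the day-to-day variation before and after a rolling mean. The synthetic data, the seed, and the 14-day window are all assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic daily data (assumption): a linear trend plus random noise.
rng = np.random.default_rng(0)
dates = pd.date_range('2020-01-01', periods=200)
trend = np.linspace(0.0, 10.0, 200)
noisy = pd.Series(trend + rng.normal(scale=2.0, size=200), index=dates)

# Smooth with a 14-day rolling mean (window size chosen for illustration).
smoothed = noisy.rolling(window=14).mean()

# Standard deviation of day-to-day changes: a rough measure of jaggedness.
raw_jitter = noisy.diff().std()
smooth_jitter = smoothed.diff().std()

print(raw_jitter, smooth_jitter)
```

The smoothed series changes far less from one day to the next than the raw series while still following the trend. If matplotlib is installed, calling noisy.plot() and smoothed.plot() would draw the two curves for a visual comparison.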