Using pandas to implement sliding windows

Sliding windows have many uses, such as finding the maximum value among every 3 consecutive orders.

Introduction

A window function is a function from relational databases that computes a value over a defined range of rows. Window functions are equally useful in data analysis, where they make it easy to apply a sliding window to data and compute moving averages, moving sums, and similar statistics.

In this article, we will use the pandas library to implement window functions. pandas is a popular library for data processing and analysis, and its tools include support for window calculations.

Sample data

To demonstrate the use of window functions, we will use an example dataset containing monthly sales data.

import pandas as pd

data = {'Month': ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'],
        'Sales': [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120]}

df = pd.DataFrame(data)

df
   Month  Sales
0    Jan     10
1    Feb     20
2    Mar     30
3    Apr     40
4    May     50
5    Jun     60
6    Jul     70
7    Aug     80
8    Sep     90
9    Oct    100
10   Nov    110
11   Dec    120

Moving average

A moving average is a common window calculation: the average of the values in a window that slides along the data. In pandas, we can compute it with the rolling method.

df['MA'] = df['Sales'].rolling(window=3).mean()

The code above calls rolling with a window size of 3, so each row's value is the average of that month and the two months before it. The first two rows have fewer than three values available, so their result is NaN. The result is stored in a new column named "MA".
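The leading NaN values come from the default behaviour of rolling, which requires a full window before it produces a result. The following is a minimal sketch of two options that change this, min_periods and center. It uses a shortened copy of the sales figures, and the variable names are chosen here for illustration only.

```python
import pandas as pd

# Shortened copy of the sales figures, for illustration only.
sales = pd.Series([10, 20, 30, 40, 50, 60])

# Default: a result appears only once the window holds 3 values.
strict = sales.rolling(window=3).mean()

# min_periods=1 averages whatever values are available so far.
partial = sales.rolling(window=3, min_periods=1).mean()

# center=True labels each window by its middle row instead of its last row.
centered = sales.rolling(window=3, center=True).mean()

print(pd.DataFrame({'strict': strict, 'partial': partial, 'centered': centered}))
```

Whether partial windows are acceptable depends on the analysis: an average over one or two months is not directly comparable to a full three-month average.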

Moving sum

A moving sum is another common window calculation: the total of the values in a sliding window. In pandas, we can compute it with the rolling method as well.

df['MS'] = df['Sales'].rolling(window=3).sum()

The code above calls rolling with a window size of 3, so each row's value is the sum of that month and the two months before it. The result is stored in a new column named "MS".
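To see what the window is doing, the same sum can be built by hand from shifted copies of the column. This is only a sketch for checking the idea on a shortened copy of the sales figures, not a recommended replacement for rolling.

```python
import pandas as pd

# Shortened copy of the sales figures, for illustration only.
sales = pd.Series([10, 20, 30, 40, 50, 60])

# Rolling sum over a window of 3 rows.
rolled = sales.rolling(window=3).sum()

# Same calculation by hand: the current row plus the two preceding rows.
# shift(n) moves values down n rows, leaving NaN at the top.
manual = sales + sales.shift(1) + sales.shift(2)

# Raises an AssertionError if the two results differ.
pd.testing.assert_series_equal(rolled, manual)
print(rolled)
```

The shifted version also explains the NaN rows: the first two rows have no earlier values to add.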

Displaying the moving-average column next to the original data:

df[['Month', 'Sales', 'MA']]
   Month  Sales     MA
0    Jan     10    NaN
1    Feb     20    NaN
2    Mar     30   20.0
3    Apr     40   30.0
4    May     50   40.0
5    Jun     60   50.0
6    Jul     70   60.0
7    Aug     80   70.0
8    Sep     90   80.0
9    Oct    100   90.0
10   Nov    110  100.0
11   Dec    120  110.0

Maximum and minimum

In addition to moving averages and moving sums, we can also use the rolling method to calculate the maximum and minimum values within each window.

df['Max'] = df['Sales'].rolling(window=3).max()
df['Min'] = df['Sales'].rolling(window=3).min()

The code above calls rolling with a window size of 3 and then computes the maximum and the minimum of each window. The results are stored in new columns named "Max" and "Min".

df
   Month  Sales     MA     MS    Max    Min
0    Jan     10    NaN    NaN    NaN    NaN
1    Feb     20    NaN    NaN    NaN    NaN
2    Mar     30   20.0   60.0   30.0   10.0
3    Apr     40   30.0   90.0   40.0   20.0
4    May     50   40.0  120.0   50.0   30.0
5    Jun     60   50.0  150.0   60.0   40.0
6    Jul     70   60.0  180.0   70.0   50.0
7    Aug     80   70.0  210.0   80.0   60.0
8    Sep     90   80.0  240.0   90.0   70.0
9    Oct    100   90.0  270.0  100.0   80.0
10   Nov    110  100.0  300.0  110.0   90.0
11   Dec    120  110.0  330.0  120.0  100.0
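When several statistics share the same window, they can also be computed in one call by passing a list of function names to agg on the rolling object. The sketch below assumes a shortened copy of the sample data. The resulting columns take the function names (mean, sum, max, min) rather than the column names used above.

```python
import pandas as pd

# Shortened copy of the sample data, for illustration only.
df = pd.DataFrame({'Month': ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun'],
                   'Sales': [10, 20, 30, 40, 50, 60]})

# One pass over the same 3-row window, one output column per statistic.
stats = df['Sales'].rolling(window=3).agg(['mean', 'sum', 'max', 'min'])

# Attach the statistics to the original rows by index.
result = df.join(stats)
print(result)
```

This keeps the window definition in one place, so changing the window size later affects all four statistics at once.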

Conclusion

In this article, we used the pandas library to implement window functions. We showed how to use the rolling method to calculate moving averages, moving sums, maximums, and minimums.

Origin blog.csdn.net/alike_u/article/details/129690480