Sliding windows have many uses, such as finding the maximum value across 3 consecutive orders
Introduction
A window function is a function used in relational databases, usually to perform a calculation over data within a certain range of rows. Window functions are also very useful in data analysis: they make it easy to apply sliding windows to data and calculate moving averages, moving sums, and more.
In this article, we will use the pandas library to implement window functions. pandas is a popular data processing library that provides many tools for data processing and analysis, including window functions.
Sample data
To demonstrate the use of window functions, we will use an example dataset containing monthly sales data.
import pandas as pd
data = {
    'Month': ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'],
    'Sales': [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120],
}
df = pd.DataFrame(data)
df
   Month  Sales
0    Jan     10
1    Feb     20
2    Mar     30
3    Apr     40
4    May     50
5    Jun     60
6    Jul     70
7    Aug     80
8    Sep     90
9    Oct    100
10   Nov    110
11   Dec    120
Moving average
A moving average is a common window calculation: the average of the most recent values in a fixed-size window that slides along the data. In pandas, we can use the rolling method to calculate a moving average.
df['MA'] = df['Sales'].rolling(window=3).mean()
In the above code, we call rolling with a window size of 3, so each value is the average of the current month and the two preceding months. The first two rows are NaN because they do not yet have three observations. The result is stored in a new column named "MA".
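To make the window arithmetic concrete, the sketch below recomputes the three-month average by hand and compares it with the result from rolling. The helper name manual_moving_average is hypothetical and introduced only for illustration; it is not part of pandas.

```python
import math

import pandas as pd

sales = pd.Series([10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120])


def manual_moving_average(values, window):
    """Hypothetical helper: average of each value and the (window - 1) values before it."""
    result = []
    for i in range(len(values)):
        if i + 1 < window:
            # Not enough observations yet, mirroring the NaN rows from rolling().
            result.append(math.nan)
        else:
            chunk = values[i + 1 - window : i + 1]
            result.append(sum(chunk) / window)
    return result


by_hand = manual_moving_average(sales.tolist(), window=3)
with_pandas = sales.rolling(window=3).mean().tolist()

# Compare from the third row onward; the first two entries are NaN in both lists.
matches = all(math.isclose(a, b) for a, b in zip(by_hand[2:], with_pandas[2:]))
print(matches)
```

The loop is far slower than rolling on large data; it is only meant to show which rows each window covers.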
Moving sum
A moving sum is another common window calculation: the total of the values in each window. In pandas, we can again use the rolling method to calculate it.
df['MS'] = df['Sales'].rolling(window=3).sum()
In the above code, we call rolling with a window size of 3, so each value is the sum of the current month and the two preceding months. The result is stored in a new column named "MS".
Displaying the "Month", "Sales", and "MA" columns shows the moving average next to the original data:
df[['Month', 'Sales', 'MA']]
   Month  Sales     MA
0    Jan     10    NaN
1    Feb     20    NaN
2    Mar     30   20.0
3    Apr     40   30.0
4    May     50   40.0
5    Jun     60   50.0
6    Jul     70   60.0
7    Aug     80   70.0
8    Sep     90   80.0
9    Oct    100   90.0
10   Nov    110  100.0
11   Dec    120  110.0
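The NaN values in the first two rows appear because a window of 3 needs three observations before it produces a result. The rolling method accepts a min_periods argument that lowers this requirement. The following is a minimal sketch, assuming partial windows at the start of the series are acceptable for the analysis; the variable names are illustrative.

```python
import pandas as pd

sales = pd.Series([10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120])

# Default: a window of 3 needs 3 observations, so the first two results are NaN.
strict_sum = sales.rolling(window=3).sum()

# min_periods=1: a result is produced as soon as one observation is available,
# so the first two rows sum over 1 and 2 months respectively.
partial_sum = sales.rolling(window=3, min_periods=1).sum()

print(pd.DataFrame({"strict": strict_sum, "partial": partial_sum}).head(4))
```

Whether partial windows are appropriate depends on the analysis: the first rows then summarize fewer months than the rest, so they are not directly comparable.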
Maximum and minimum
In addition to moving averages and moving sums, we can also use the rolling method to calculate the maximum and minimum values within each window.
df['Max'] = df['Sales'].rolling(window=3).max()
df['Min'] = df['Sales'].rolling(window=3).min()
In the above code, we call rolling with a window size of 3 and take the maximum and the minimum of each window. The results are stored in new columns named "Max" and "Min".
df
   Month  Sales     MA     MS    Max    Min
0    Jan     10    NaN    NaN    NaN    NaN
1    Feb     20    NaN    NaN    NaN    NaN
2    Mar     30   20.0   60.0   30.0   10.0
3    Apr     40   30.0   90.0   40.0   20.0
4    May     50   40.0  120.0   50.0   30.0
5    Jun     60   50.0  150.0   60.0   40.0
6    Jul     70   60.0  180.0   70.0   50.0
7    Aug     80   70.0  210.0   80.0   60.0
8    Sep     90   80.0  240.0   90.0   70.0
9    Oct    100   90.0  270.0  100.0   80.0
10   Nov    110  100.0  300.0  110.0   90.0
11   Dec    120  110.0  330.0  120.0  100.0
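When several statistics share the same window, they can also be computed together with the agg method of the rolling object. This is a minimal sketch rather than the approach used above. It uses a shortened six-month copy of the sample data. The lowercase column names max and min are what pandas produces, and the rename to "Max" and "Min" is only there to match the columns in this article.

```python
import pandas as pd

df = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar", "Apr", "May", "Jun"],
    "Sales": [10, 20, 30, 40, 50, 60],
})

# One rolling object, several aggregations; the result has one column per function.
stats = df["Sales"].rolling(window=3).agg(["max", "min"])

# Rename to match the column names used in this article, then attach to df.
stats = stats.rename(columns={"max": "Max", "min": "Min"})
df = df.join(stats)

print(df)
```

Defining the window once keeps the window size in a single place, which helps if it later needs to change.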
Conclusion
In this article, we used the pandas library to implement window functions. We demonstrated how to use the rolling method to calculate moving averages, moving sums, maximums, and minimums.