What is the exponential moving average (EMA)?
The exponential moving average (EMA) is a commonly used smoothing method. The principle is simple: it takes a weighted average of sequence data, giving more weight to recent data points and less weight to earlier ones. This smooths time series data, making it more continuous and stable.
What is the use of exponential moving average (EMA) in deep learning?
In deep learning, EMA is often used to smooth the updates of model parameters. Specifically, each time the parameters are updated, an EMA of the parameters is updated as well. The averaged parameters fluctuate less from step to step than the raw ones, which makes the model more stable. EMA can also be used to compute a moving average of the gradients for the optimizer update, as optimizers such as Adam do, which can further improve the performance and generalization ability of the model.
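As a rough illustration of the first use, the sketch below keeps an averaged "shadow" copy of a model's parameters. The names ParamEMA and decay are hypothetical and not from any library; decay plays the role of (1 - α) in the formula discussed in the next section, and the tiny linear model and random data are placeholders only. It is a minimal sketch under those assumptions, not a definitive implementation.

```python
import copy

import torch
import torch.nn as nn


class ParamEMA:
    """Hypothetical helper: keeps an EMA ("shadow") copy of a model's parameters."""

    def __init__(self, model, decay=0.999):
        self.decay = decay  # weight of the history, i.e. (1 - alpha)
        self.shadow = copy.deepcopy(model).eval()
        for p in self.shadow.parameters():
            p.requires_grad_(False)  # the shadow copy is never trained directly

    @torch.no_grad()
    def update(self, model):
        # shadow = decay * shadow + (1 - decay) * current parameters
        # (buffers such as BatchNorm statistics are not handled in this sketch)
        for s, p in zip(self.shadow.parameters(), model.parameters()):
            s.mul_(self.decay).add_(p, alpha=1 - self.decay)


# Placeholder model and training loop, for illustration only
model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
ema = ParamEMA(model, decay=0.9)

for _ in range(5):
    x = torch.randn(8, 4)
    loss = model(x).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    ema.update(model)  # update the averaged copy after every optimizer step
```

At evaluation time, ema.shadow would be used in place of model. Because the shadow copy starts from the model's own initial weights rather than from zeros, this variant needs no bias correction.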
Interpretation of the exponential moving average (EMA) calculation formula
EMA[t] = α * x[t] + (1 - α) * EMA[t-1]
Here, t is the time step; x[t] is the original data at time step t; α is the smoothing factor, which usually takes a value between 0 and 1 and is the weight of the current sample; (1 - α) is the weight of the historical data; and EMA[t-1] is the EMA value at the previous time step.
The formula multiplies the current data point x[t] by its weight α, multiplies the EMA value at the previous time step EMA[t-1] by the weight of the historical data (1 - α), and then adds the two to get the EMA value at the current time step EMA[t].
From this formula, we can see that EMA is essentially a weighted average of historical data, in which the weight of each data point decays exponentially the farther it is from the current time step: a point that is k steps old carries weight α * (1 - α)^k. The advantage is that recent data dominates while older data still contributes, which smooths the time series and makes it more continuous and stable.
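To make the weighting concrete, the following sketch compares the recursive formula with an explicit weighted sum. The function names ema_recursive and ema_weights and the sample values are made up for illustration, and the recursion is assumed to start from EMA[0] = 0.

```python
def ema_recursive(values, alpha):
    """Apply EMA[t] = alpha * x[t] + (1 - alpha) * EMA[t-1], starting from 0."""
    ema = 0.0
    for x in values:
        ema = alpha * x + (1 - alpha) * ema
    return ema


def ema_weights(n, alpha):
    """Weight on each of n points, oldest first: alpha * (1 - alpha) ** age."""
    return [alpha * (1 - alpha) ** (n - 1 - i) for i in range(n)]


values = [3.0, 1.0, 4.0, 1.0, 5.0]  # hypothetical data
alpha = 0.3

weights = ema_weights(len(values), alpha)
explicit = sum(w * x for w, x in zip(weights, values))
recursive = ema_recursive(values, alpha)

print(weights)              # weights grow toward the most recent point
print(explicit, recursive)  # the two computations agree up to rounding
```

When the recursion starts from zero, the weights add up to 1 - (1 - α)^t rather than 1, so early EMA values are pulled toward zero. Dividing by that sum removes the bias.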
PyTorch code implementation
Here is a simple PyTorch implementation of an exponential moving average (EMA):
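import torch


class EMA:
    def __init__(self, alpha):
        self.alpha = alpha    # smoothing factor: weight of the current sample
        self.average = None   # running average, created on the first update
        self.count = 0        # number of updates so far

    def update(self, x):
        if self.average is None:
            # First update: start from an all-zero tensor with the same shape as x
            self.average = torch.zeros_like(x)
        self.average = self.alpha * x + (1 - self.alpha) * self.average  # update the average
        self.count += 1  # update the counter

    def get(self):
        # Starting from zeros biases the average toward zero. The weights applied so far
        # sum to 1 - (1 - alpha) ** count, so divide by that sum to correct the bias.
        return self.average / (1 - (1 - self.alpha) ** self.count)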
In this class, we define three methods: __init__, update, and get. The __init__ method initializes the smoothing factor alpha, the running average average, and the counter count. The update method updates the EMA value, and the get method returns the final EMA value.
When using this class, we first instantiate an EMA object, then call the update method to update the EMA value at each time step, and finally call the get method to get the final EMA value. For example:
data = [1.0, 2.0, 3.0, 4.0]  # hypothetical example values; substitute your own sequence
ema = EMA(alpha=0.5)
for value in data:
    ema.update(torch.tensor(value))
smoothed_data = ema.get()
In this example, we initialize the EMA object with alpha=0.5, then iterate through each data point in data and call the update method to update the EMA value. Finally, we call the get method to get the smoothed value at the last time step.
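The example above yields a single value for the last time step. If a smoothed value at every time step is wanted, for instance to plot a smoothed loss curve, one possible sketch is shown below. The function name ema_series and the sample values are assumptions for illustration, and the bias correction assumes the average starts from zero.

```python
import torch


def ema_series(values, alpha):
    """Return the bias-corrected EMA after each step as a 1-D tensor."""
    average = torch.zeros(())  # scalar running average, starting from zero
    smoothed = []
    for t, x in enumerate(values, start=1):
        average = alpha * x + (1 - alpha) * average
        # Divide by the sum of the weights applied so far to remove the zero-start bias
        smoothed.append(average / (1 - (1 - alpha) ** t))
    return torch.stack(smoothed)


data = torch.tensor([1.0, 2.0, 3.0, 4.0])  # hypothetical example values
smoothed = ema_series(data, alpha=0.5)
print(smoothed)
```

The first smoothed value equals the first data point, because bias correction removes the influence of the all-zero starting value.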
Let’s use EMA together! Make training smoother and the model more stable!