Understanding how EMA works with a simulation of a toy model with toy weights
Published
August 14, 2023
Exponential Moving Average (EMA) in Weight Updates
EMA is a useful technique that appears in several settings:
Weight Updates: EMA maintains a running average of the model weights in which older values count for exponentially less with each step. Rather than storing every previous set of weights, it folds each new set into the average, so the averaged weights blend new information with past knowledge.
Self-Supervised Learning: EMA is commonly used in self-supervised learning setups. The weights obtained from self-supervised learning are often reused for downstream tasks such as classification and segmentation.
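The blending of new and past weights is usually written as ema = decay * ema + (1 - decay) * new. Below is a minimal sketch in plain Python. The function name `ema_update`, the decay of 0.9, and the toy weight values are assumptions chosen for illustration and do not come from any particular library.

```python
def ema_update(ema_weights, new_weights, decay=0.9):
    """Blend the running average with the newest weights, element by element."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, new_weights)]


# Toy example: the "model" keeps reporting the same weights for three steps.
ema = [0.0, 0.0]  # hypothetical starting value for the running average
for step_weights in ([1.0, 2.0], [1.0, 2.0], [1.0, 2.0]):
    ema = ema_update(ema, step_weights)

# With decay = 0.9 the average moves only 10% of the way toward the new
# weights at each step, so after three steps it is still far from [1.0, 2.0].
print(ema)
```

A decay closer to 1 makes the average change more slowly and remember more of the past; a smaller decay makes it track the latest weights more closely.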
Clarification on EMA’s Impact
It is easy to assume that the EMA process directly alters the weights being trained. That is not the case. EMA keeps a duplicate set of weights. After each training step the duplicate is updated as a weighted average of its previous value and the current trained weights, and the duplicate is then used for validation. The optimizer never reads from the duplicate, so in this setup the training procedure itself is unaffected by the EMA process.
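To make this concrete, here is a small sketch of a toy training loop with and without an EMA copy. Everything in it is an assumption made for illustration: the toy loss sum(w**2), the learning rate, the decay, and the helper names `train_step`, `ema_step`, and `run`. It is not the code of any specific framework.

```python
def train_step(weights, grads, lr=0.1):
    # Plain gradient-descent step; note that it never reads the EMA copy.
    return [w - lr * g for w, g in zip(weights, grads)]


def ema_step(shadow, weights, decay=0.9):
    # Move the duplicate weights a small step toward the current trained weights.
    return [decay * s + (1.0 - decay) * w for s, w in zip(shadow, weights)]


def run(use_ema, steps=5):
    weights = [1.0, -2.0]   # toy trained weights
    shadow = list(weights)  # duplicate set of weights, updated alongside training
    for _ in range(steps):
        grads = [2.0 * w for w in weights]  # gradient of the toy loss sum(w**2)
        weights = train_step(weights, grads)
        if use_ema:
            shadow = ema_step(shadow, weights)
    return weights, shadow


trained_with_ema, shadow_weights = run(use_ema=True)
trained_without_ema, _ = run(use_ema=False)

# The trained weights are the same whether or not the EMA copy is maintained.
print(trained_with_ema == trained_without_ema)
# The EMA copy lags behind the trained weights; it is what validation would use.
print(trained_with_ema, shadow_weights)
```

The two runs produce the same trained weights because the EMA update only reads from them and writes to the duplicate.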