ADOPT: A Modified Adam Optimizer with Guaranteed Convergence for Any Beta-2 Value
A new modification to Adam called ADOPT achieves the optimal convergence rate regardless of the choice of the β₂ parameter. The key change to Adam's update rule is small: the gradient is normalized by the second-moment estimate from the *previous* step (so the current gradient is excluded from its own normalizer), and the momentum update is applied after this normalization. This removes the correlation between the gradient and the second-moment estimate that can cause Adam to fail to converge when β₂ is set suboptimally.
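The reordered update described above can be sketched as follows. This is a minimal illustration, not the authors' reference implementation; the function name, hyperparameter defaults, and the toy objective are assumptions for the example.

```python
import numpy as np

def adopt_step(theta, g, m, v, lr=1e-3, beta1=0.9, beta2=0.9999, eps=1e-6):
    """One illustrative ADOPT-style update.

    Unlike Adam, the gradient is normalized by the PREVIOUS second-moment
    estimate v (which does not contain the current gradient), momentum is
    applied after normalization, and only then is v refreshed.
    """
    m = beta1 * m + (1 - beta1) * (g / np.maximum(np.sqrt(v), eps))
    theta = theta - lr * m
    v = beta2 * v + (1 - beta2) * g**2
    return theta, m, v

# Toy usage: minimize f(x) = x^2 starting from x = 5.
theta = np.array([5.0])
g0 = 2 * theta
m = np.zeros_like(theta)
v = g0**2            # second moment initialized from the first gradient
for _ in range(2000):
    g = 2 * theta    # gradient of x^2
    theta, m, v = adopt_step(theta, g, m, v, lr=0.05)
```

Note the order of operations: because `v` is updated last, the normalizer used at step t depends only on gradients up to step t-1, which is the decorrelation the convergence guarantee relies on.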