RAdam-Tensorflow
On the Variance of the Adaptive Learning Rate and Beyond
Paper | Official PyTorch code
Usage
from RAdam import RAdamOptimizer
train_op = RAdamOptimizer(learning_rate=0.001, beta1=0.9, beta2=0.999, weight_decay=0.0).minimize(loss)
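For intuition about what the optimizer does with beta2, here is a minimal pure-Python sketch of the variance rectification term described in the paper. It is not taken from this repository: the function name radam_rectification is hypothetical, and the threshold rho_t > 4 follows the paper's formulation, so an implementation may use a slightly different cutoff.

```python
import math


def radam_rectification(step, beta2=0.999):
    """Return the rectification term r_t for a 1-based step count.

    Returns None while the variance of the adaptive learning rate is
    considered intractable (rho_t <= 4); in that regime the paper falls
    back to an update without the adaptive second-moment scaling.
    NOTE: hypothetical helper for illustration, not this package's API.
    """
    # Maximum length of the approximated simple moving average.
    rho_inf = 2.0 / (1.0 - beta2) - 1.0
    beta2_t = beta2 ** step
    # Length of the approximated simple moving average at this step.
    rho_t = rho_inf - 2.0 * step * beta2_t / (1.0 - beta2_t)
    if rho_t <= 4.0:
        return None
    numerator = (rho_t - 4.0) * (rho_t - 2.0) * rho_inf
    denominator = (rho_inf - 4.0) * (rho_inf - 2.0) * rho_t
    return math.sqrt(numerator / denominator)
```

With beta2=0.999 the term is undefined for the first few steps, then starts small and approaches 1 as training proceeds, which is what gives RAdam its built-in warmup behaviour.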
RAdam implemented in TensorFlow 1.x
pip install tf-1.x-rectified-adam==0.0.2