whether this optim method owns the learning rate scheduler. A scheduler may be shared by multiple LARS optim methods
the trust coefficient that scales the layer-wise learning rate; should be between 0 and 1
learning rate
learning rate decay
weight decay
momentum
the learning rate scheduler
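The constructor parameters above (trust, learning rate, learning rate decay, weight decay, momentum) map onto the layer-wise update rule from the LARS paper. The following is a minimal illustrative sketch of that rule in plain Python, not BigDL's actual implementation; all names are illustrative:

```python
import math

def lars_step(w, g, v, lr, trust, weight_decay, momentum):
    """One LARS update for a single layer (weights, gradients, and
    momentum buffer given as lists of floats).

    The local learning rate scales the global lr by
    trust * ||w|| / (||g|| + weight_decay * ||w||), so each layer's
    step size is proportional to its own weight/gradient norms.
    """
    w_norm = math.sqrt(sum(x * x for x in w))
    g_norm = math.sqrt(sum(x * x for x in g))
    denom = g_norm + weight_decay * w_norm
    local_lr = trust * w_norm / denom if denom > 0 else 1.0
    new_v, new_w = [], []
    for wi, gi, vi in zip(w, g, v):
        step = momentum * vi + lr * local_lr * (gi + weight_decay * wi)
        new_v.append(step)      # updated momentum buffer
        new_w.append(wi - step) # updated weight
    return new_w, new_v
```

With trust = 0.5, lr = 0.1, no decay and a fresh momentum buffer, a unit weight with a unit gradient takes a step of 0.1 * 0.5 * 1.0 = 0.05.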
Clear the history information in the OptimMethod state
clone OptimMethod
dampening for momentum
return a string of the current hyper-parameters.
a table containing the hyper parameters.
return a string of the current hyper-parameters.
get learning rate
learning rate
learning rate decay
1D tensor of individual learning rates
load OptimMethod parameters from a Table
momentum
enables Nesterov momentum
a function that takes a single input (X), the point of evaluation, and returns f(X) and df/dX
the initial point
the new x vector and the list of function values {fx}, evaluated before the update
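The optimize() contract described above (feval maps x to f(x) and df/dx; the method returns the updated x and the function values evaluated before the update) can be sketched as plain gradient descent. This is an illustrative stand-in, not BigDL's LARS logic:

```python
def optimize(feval, x, lr=0.1):
    """Minimal optimize() in the spirit of the contract above:
    feval(x) -> (fx, dfdx); returns the new x and {fx}, where fx
    was evaluated at the point *before* the update.
    """
    fx, dfdx = feval(x)
    new_x = [xi - lr * gi for xi, gi in zip(x, dfdx)]
    return new_x, [fx]

# Example objective: f(x) = sum(x_i^2), df/dx = 2x
feval = lambda x: (sum(xi * xi for xi in x), [2.0 * xi for xi in x])
```

Starting from x = [1.0, 2.0] with lr = 0.1, the returned function list is [5.0] (the value at the starting point) and the new x is approximately [0.8, 1.6].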
save OptimMethod
Update hyper parameter.
Update hyper parameter. The hyper parameters are already updated in the method optimize(). But in DistriOptimizer, optimize() is only called on the executor side, so the driver's hyper parameters are left unchanged. This method is used to update the hyper parameters on the driver side.
config table.
state Table.
A string.
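The driver-side update described above can be illustrated with a toy config/state pair: the driver recomputes its effective hyper-parameters from the shared state rather than running optimize() itself. The decay schedule and the key names below are hypothetical, chosen only to make the idea concrete:

```python
def update_hyper_parameter(config, state):
    """Recompute the driver-side effective learning rate from the
    shared state. Hypothetical schedule: lr_t = lr / (1 + nevals * decay).
    """
    nevals = state.get("evalCounter", 0)
    decay = config.get("learningRateDecay", 0.0)
    # "clr" (current learning rate) is an illustrative key, not BigDL's.
    config["clr"] = config["learningRate"] / (1.0 + nevals * decay)
```

After 10 evaluations with a base rate of 0.1 and decay 0.1, the effective rate becomes 0.1 / 2.0 = 0.05 on the driver, matching what the executors computed.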
Update hyper parameter.
Update hyper parameter. The hyper parameters are already updated in the method optimize(). But in DistriOptimizer, optimize() is only called on the executor side, so the driver's hyper parameters are left unchanged. This method is used to update the hyper parameters on the driver side.
A string.
weight decay
1D tensor of individual weight decays
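When 1D tensors of individual learning rates and weight decays are supplied, each parameter gets its own step size instead of the global values. An elementwise sketch of this (illustrative, using plain lists rather than BigDL tensors):

```python
def sgd_step_individual(w, g, lrs, wds):
    """Elementwise SGD step where lrs[i] and wds[i] override the global
    learning rate and weight decay for parameter i.
    """
    return [wi - lri * (gi + wdi * wi)
            for wi, gi, lri, wdi in zip(w, g, lrs, wds)]
```

For two unit weights with unit gradients, per-parameter rates [0.1, 0.2] and decays [0.0, 1.0] yield different steps per parameter: roughly 0.9 for the first and 0.6 for the second.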
Clear the history information in the state
(Since version 0.2.0) Please use clearHistory() instead
Optimize the model parameter
a function that takes a single input (X), the point of evaluation, and returns f(X) and df/dX
the initial point
a table with configuration parameters for the optimizer
a table describing the state of the optimizer; after each call the state is modified
the new x vector and the list of function values, evaluated before the update
(Since version 0.2.0) Please initialize the OptimMethod with parameters when creating it instead of loading them from a table
An implementation of LARS (https://arxiv.org/abs/1708.03888). Lars.createOptimForModule is the recommended way to create LARS optim methods for multiple layers.