the size of each input sample
the size of the module output of each sample
Computing the gradient of the module with respect to its own parameters. Many modules do not perform this step as they do not have any parameters. The state variable name for the parameters is module dependent. The module is expected to accumulate the gradients with respect to the parameters in some variable.
Find a module with the given name. If there is no module with the given name, None is returned. If there are multiple modules with the given name, an exception is thrown.
Performs a back-propagation step through the module, with respect to the given input. In general, this method assumes that forward(input) has been called before with the same input; this is necessary for optimization reasons. If you do not respect this rule, backward() will compute incorrect gradients.
input data
gradient from the next layer
gradient corresponding to input data
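For orientation, a minimal sketch of the forward/backward contract (assuming the BigDL Scala API; the layer and shapes are illustrative):

  import com.intel.analytics.bigdl.nn.Linear
  import com.intel.analytics.bigdl.numeric.NumericFloat
  import com.intel.analytics.bigdl.tensor.Tensor

  val layer = Linear(10, 5)
  val input = Tensor(2, 10).rand()                   // 2 samples, 10 features each
  val output = layer.forward(input)                  // must run first, with the same input
  val gradOutput = Tensor(2, 5).rand()               // gradient flowing back from the next layer
  val gradInput = layer.backward(input, gradOutput)  // gradient with respect to the input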
Get the execution engine type.
Clear cached activities to save storage space or network bandwidth. Note that we use Tensor.set to keep some information, such as tensor sharing.
A subclass should override this method if it allocates extra resources, and call super.clearState() in the overriding method.
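A subclass with an extra buffer might override it roughly like this (illustrative fragment, meant to sit inside a custom module class; buffer is a hypothetical extra Tensor field):

  override def clearState(): this.type = {
    super.clearState()   // clears the cached output and gradInput
    buffer.set()         // release the extra buffer; Tensor.set keeps sharing information
    this
  }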
Copy the useful running status from src to this.
A subclass should override this method if it has parameters besides weight and bias, such as the runningMean and runningVar of BatchNormalization.
source Module
this
Use a ValidationMethod to evaluate the module.
dataset for testing
validation methods
total batch size across all partitions; optional parameter, defaults to 4 * the number of partitions of the dataset
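A hedged usage sketch (testSet is assumed to be an RDD[Sample[Float]] prepared elsewhere, and model an already built module; Top1Accuracy is one of the library's ValidationMethods):

  import com.intel.analytics.bigdl.optim.{Top1Accuracy, ValidationMethod}

  val methods: Array[ValidationMethod[Float]] = Array(new Top1Accuracy[Float]())
  val results = model.evaluate(testSet, methods)   // batch size left at its default
  results.foreach { case (result, method) => println(s"$method: $result") }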
Takes an input object and computes the corresponding output of the module. After a forward pass, the output state variable should have been updated to the new value.
input data
output data
Get the module name. The default name is className@namePostfix.
Float or Double
This method compacts all parameters and gradients of the model into two tensors, so it is easier to use an optim method.
This function returns a table containing the module name, the parameter names and the parameter values in this module. The result table has the structure Table(ModuleName -> Table(ParameterName -> ParameterValue)), and its type is Table[String, Table[String, Tensor[T]]].
For example, get the weight of a module named conv1: table[Table]("conv1")[Tensor[T]]("weight").
Custom modules should override this function if they have parameters.
Table
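Continuing the conv1 example above (sketch; assumes the layer was given the name "conv1" via setName):

  import com.intel.analytics.bigdl.tensor.Tensor
  import com.intel.analytics.bigdl.utils.Table

  val params = model.getParametersTable()
  val conv1Weight = params[Table]("conv1")[Tensor[Float]]("weight")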
Get the scale of gradientBias.
Get the scale of gradientWeight.
Get weight and bias for the module
The cached gradient of activities, so we don't compute it again when it is needed.
the size of each input sample
Some other modules point to the current module.
upstream module nodes
node containing current module
Copy weights from another model, mapping by layer name.
model to copy from
whether to match all layers' weights and bias
current module
Load pretrained weights and bias into the current module.
file storing the pretrained weights and bias
whether to match all layers' weights and bias; if not, only load the pretrained weights and bias that exist
current module
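Both weight-loading paths in one sketch (the path and the pretrained model are illustrative; the second argument is the match-all flag described above):

  // copy weights from an in-memory model, mapping by layer name
  model.loadModelWeights(pretrained, false)

  // load weights previously written by saveWeights
  model.loadWeights("/tmp/model.weights", false)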
The cached output, so we don't compute it again when it is needed.
the size of the module output of each sample
This function returns two arrays: one for the weights and the other for the gradients. Custom modules should override this function if they have parameters.
(Array of weights, Array of grad)
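For example, the returned arrays can drive a hand-rolled SGD-style update (sketch; the learning rate is arbitrary, and this bypasses the library's Optimizer):

  val (weights, grads) = model.parameters()
  val lr = 0.01f
  weights.zip(grads).foreach { case (w, g) =>
    w.add(-lr, g)   // w := w - lr * g
  }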
Module predict, returning the probability distribution.
dataset for prediction
Module predict, returning the predicted label.
dataset for prediction
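A quick sketch of both prediction entry points (dataset is assumed to be an RDD[Sample[Float]] prepared elsewhere):

  val distributions = model.predict(dataset)   // RDD of per-sample output activities
  val labels = model.predictClass(dataset)     // RDD of predicted labels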
Save this module to a path.
path to save the module; the local file system, HDFS and Amazon S3 are supported. An HDFS path should look like "hdfs://[host]:[port]/xxx"; an Amazon S3 path should look like "s3a://bucket/xxx"
whether to overwrite an existing file
self
Save weights and bias to a file.
file to save to
whether to overwrite or not
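Illustrative calls for both persistence methods (paths are placeholders; the second argument is the overwrite flag):

  model.save("hdfs://host:9000/models/mymodel", true)   // whole module, definition and parameters
  model.saveWeights("/tmp/mymodel.weights", true)       // weights and bias only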
The scale of gradient weight and gradient bias before gradParameters are accumulated.
Set the module name
Set the scale of gradientBias.
the value of the scale of gradientBias
this
Set the scale of gradientWeight.
the value of the scale of gradientWeight
this
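For example (sketch; the scale values are arbitrary, and both setters return this, so they chain):

  import com.intel.analytics.bigdl.nn.Linear
  import com.intel.analytics.bigdl.numeric.NumericFloat

  val layer = Linear(10, 5)
    .setScaleW(2.0)   // gradWeight is scaled by 2.0 before accumulation
    .setScaleB(0.0)   // gradBias is scaled by 0.0, effectively freezing the bias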
Set weight and bias for the module.
array of weights and bias
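One way to copy parameters between two identically shaped layers (sketch):

  import com.intel.analytics.bigdl.nn.Linear
  import com.intel.analytics.bigdl.numeric.NumericFloat

  val src = Linear(10, 5)
  val dst = Linear(10, 5)
  dst.setWeightsBias(src.getWeightsBias())   // for Linear this is Array(weight, bias)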
Module status. It is useful for modules like Dropout and BatchNormalization.
Computing the gradient of the module with respect to its own input. This is returned in gradInput. Also, the gradInput state variable is updated accordingly.
Computes the output using the current parameter set of the class and the input. This function returns the result, which is stored in the output field.
If the module has parameters, this will zero the accumulation of the gradients with respect to these parameters. Otherwise, it does nothing.
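Put together, one manual training step might look like this (sketch; input, target and criterion are assumed to exist, e.g. an MSECriterion, and this bypasses the library's Optimizer):

  model.zeroGradParameters()                          // clear previously accumulated gradients
  val output = model.forward(input)
  val loss = criterion.forward(output, target)
  val gradOutput = criterion.backward(output, target)
  model.backward(input, gradOutput)                   // accumulates fresh gradients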
The Linear module applies a linear transformation to the input data, i.e. y = Wx + b. The input given in forward(input) must be either a vector (1D tensor) or a matrix (2D tensor). If the input is a vector, it must have the size of inputSize. If it is a matrix, then each row is assumed to be an input sample of the given batch (the number of rows means the batch size, and the number of columns should be equal to inputSize).
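A minimal sketch of both accepted input shapes (assumes the BigDL Scala API; sizes are arbitrary):

  import com.intel.analytics.bigdl.nn.Linear
  import com.intel.analytics.bigdl.numeric.NumericFloat
  import com.intel.analytics.bigdl.tensor.Tensor

  val module = Linear(5, 3)                                 // y = Wx + b, W of size 3x5
  val single = module.forward(Tensor(5).rand()).clone()     // a vector: one sample of size inputSize
  val batch = module.forward(Tensor(4, 5).rand())           // a matrix: a batch of 4 samples

The clone() on the first call matters because forward returns the module's cached output tensor, which the second call overwrites.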