Computes the gradient of the module with respect to its own parameters. Many modules do not perform this step because they have no parameters. The state variable name for the parameters is module-dependent. The module is expected to accumulate the gradients with respect to its parameters in some variable.
Find a module with the given name. If there is no module with the given name, None is returned. If there are multiple modules with the given name, an exception is thrown.
Performs a back-propagation step through the module, with respect to the given input. In general, this method assumes that forward(input) has been called before with the same input; this is necessary for optimization reasons. If you do not respect this rule, backward() will compute incorrect gradients (see the sketch after the parameter list below).
input data
gradient of next layer
gradient corresponding to input data
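As a concrete sketch (a minimal example assuming BigDL's Scala API with the imports below; the layer and batch sizes are illustrative), forward must be called before backward on the same input:

```scala
import com.intel.analytics.bigdl.nn.Linear
import com.intel.analytics.bigdl.tensor.Tensor
import com.intel.analytics.bigdl.numeric.NumericFloat

val module = Linear(4, 2)                // a 4 -> 2 fully connected layer
val input = Tensor(3, 4).rand()          // a batch of 3 samples

// forward(input) must be called before backward(input, gradOutput)
val output = module.forward(input)

val gradOutput = Tensor(3, 2).rand()     // gradient coming from the next layer
val gradInput = module.backward(input, gradOutput)  // gradient w.r.t. the input
```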
Clear cached activities to save storage space or network bandwidth. Note that Tensor.set is used so that information such as tensor sharing is preserved.
A subclass should override this method if it allocates extra resources, and it should call super.clearState() in the overriding method.
Clone the module (deep or shallow copy)
Clone the model
Use ValidationMethod to evaluate the module on the given local dataset.
Use ValidationMethod to evaluate the module on the given RDD dataset.
dataset for test
validation methods
total batch size of all partitions; optional, defaults to 4 * the number of partitions of the dataset
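A sketch of a distributed evaluation call, assuming an existing RDD[Sample[Float]] named samples and BigDL's Top1Accuracy metric (the exact evaluate signature may differ slightly between versions):

```scala
import com.intel.analytics.bigdl.optim.Top1Accuracy
import com.intel.analytics.bigdl.numeric.NumericFloat

// Evaluate top-1 accuracy; the batch size is the total across all partitions.
val results = model.evaluate(
  samples,
  Array(new Top1Accuracy[Float]()),
  Some(4 * samples.getNumPartitions))
results.foreach { case (result, method) => println(s"$method: $result") }
```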
Set the module to evaluate mode
Use ValidationMethod to evaluate the module on the given ImageFrame.
ImageFrame for validation
validation methods
total batch size of all partitions
Takes an input object and computes the corresponding output of the module. After a forward pass, the output state variable should have been updated to the new value.
input data
output data
Freeze the module, i.e. its parameters (weight/bias, if they exist) are not updated during the training process. If names is not empty, only the layers whose names match the given names are frozen (see the sketch below).
an array of layer names
current graph model
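A brief sketch of freezing either the whole model or selected layers (the layer names "conv1" and "conv2" are illustrative):

```scala
// Freeze all parameters: nothing is updated during training
model.freeze()

// Freeze only the layers whose names match
model.freeze("conv1", "conv2")
```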
Return the classTag numerics for module serialization. If your module takes multiple class tags in its constructor, you should override this method.
Get the extra parameters of this module. Extra parameters are trainable parameters other than weight and bias, such as runningMean and runningVar in BatchNormalization.
A subclass should override this method if it has parameters besides weight and bias.
an array of tensors
Return the inputShape of the current Layer; the first dim is batch.
Get the module name, default name is className@namePostfix
Get numeric type of module parameters
Return the outputShape of the current Layer; the first dim is batch.
This function returns a table that contains the ModuleName, the parameter names and the parameter values in this module (see the sketch below).
The result table is a structure of Table(ModuleName -> Table(ParameterName -> ParameterValue)), and the type is Table[String, Table[String, Tensor[T]]].
For example, get the weight of a module named conv1: table[Table]("conv1")[Tensor[T]]("weight").
The names of the parameters follow this convention:
1. If there's one parameter, the parameter is named "weight" and the gradient is named "gradWeight"
2. If there are two parameters, the first parameter is named "weight" and its gradient "gradWeight"; the second parameter is named "bias" and its gradient "gradBias"
3. If there are more parameters, each weight is named "weight" with a sequence number as suffix, and each gradient is named "gradient" with a sequence number as suffix
Custom modules should override this function if the default convention doesn't meet their requirement.
Table
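A short sketch of reading a layer's weight out of the parameter table, assuming a model that contains a layer named "conv1":

```scala
import com.intel.analytics.bigdl.tensor.Tensor
import com.intel.analytics.bigdl.utils.Table

// Table(ModuleName -> Table(ParameterName -> ParameterValue))
val params: Table = model.getParametersTable()
val conv1Weight = params[Table]("conv1")[Tensor[Float]]("weight")
```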
Get the scale of gradientBias
Get the scale of gradientWeight
Get the forward/backward cost time for the module or its submodules
Get the forward/backward cost time for the module or its submodules and group by module type.
(module type name, forward time, backward time)
Get weight and bias for the module
The cached gradient of activities, so we don't compute it again when it's needed.
Whether the user has set a name for the module
Build graph: some other modules point to current module
used to distinguish this overload from the other inputs() methods when the input parameter list is empty
upstream module nodes and the output tensor index. The start index is 1.
node containing current module
Build graph: some other modules point to current module
upstream module nodes in an array
node containing current module
Build graph: some other modules point to current module
upstream module nodes
node containing current module
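A minimal sketch of wiring modules into a graph with inputs(), assuming BigDL's functional API (the layer sizes are illustrative):

```scala
import com.intel.analytics.bigdl.nn.{Graph, Input, Linear, ReLU}
import com.intel.analytics.bigdl.numeric.NumericFloat

val input = Input()                      // start node
val fc1 = Linear(10, 20).inputs(input)   // fc1's upstream node is `input`
val relu = ReLU().inputs(fc1)
val fc2 = Linear(20, 2).inputs(relu)

val model = Graph(input, fc2)            // graph module built from start/end nodes
```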
Check if the model is in training mode
copy weights from another model, mapping by layer name
model to copy from
whether to match all layers' weights and bias
current module
load pretrained weights and bias to current module
file to store weights and bias
whether to match all layers' weights and bias; if not, only the existing pretrained weights and bias are loaded
current module
The cached output, so we don't compute it again when it's needed.
This function returns two arrays: one for the weights and the other for the gradients. Custom modules should override this function if they have parameters.
(Array of weights, Array of grad)
module predict, return the probability distribution
dataset for prediction
total batchSize for all partitions. If -1, the default is 4 * the number of partitions of the dataset
whether to share same memory for each batch predict results
module predict, return the predict label
dataset for prediction
total batchSize for all partitions. If -1, the default is 4 * the number of partitions of the dataset
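A sketch of both prediction flavors on a distributed dataset, assuming an existing RDD[Sample[Float]] named samples:

```scala
// Per-sample output activations (e.g. probability distributions)
val distributions = model.predict(samples)

// Per-sample predicted labels
val labels = model.predictClass(samples)
```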
model predict for images, returning an ImageFrame with the predicted tensor. If you want to call predictImage multiple times, it is recommended to use Predictor for a DistributedImageFrame or LocalPredictor for a LocalImageFrame
imageFrame that contains images
if outputLayer is not null, the output of the layer that matches outputLayer will be used as the predicted output
whether to share same memory for each batch predict results
batch size per partition, default is 4
key to store predicted result
featurePaddingParam if the inputs have varying sizes
Quantize this module, which reduces the precision of the parameters. This gives a higher speed at a small accuracy cost.
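A one-line sketch, assuming a trained floating-point model and a BigDL version that supports quantization:

```scala
// Returns a lower-precision copy of the model for faster inference
val quantizedModel = model.quantize()
```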
If the model contains native resources such as aligned memory, we should release them manually; JVM GC can't release them reliably.
Reset the module parameters, i.e. re-initialize the parameters with the given initMethod
Reset the forward/backward record time for the module or its submodules
Save this module to path in caffe readable format
Save this module definition to path.
path to save the module; local file system, HDFS and Amazon S3 are supported. An HDFS path should be like "hdfs://[host]:[port]/xxx", an Amazon S3 path should be like "s3a://bucket/xxx"
if overwrite
self
Save this module to path with protobuf format
path to save the module; local file system, HDFS and Amazon S3 are supported. An HDFS path should be like "hdfs://[host]:[port]/xxx", an Amazon S3 path should be like "s3a://bucket/xxx"
where to store weight
if overwrite
self
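A sketch of saving and reloading a model in the protobuf format (the paths are illustrative):

```scala
import com.intel.analytics.bigdl.nn.Module
import com.intel.analytics.bigdl.numeric.NumericFloat

model.saveModule("/tmp/model.bigdl", "/tmp/model.bin", overWrite = true)
val restored = Module.loadModule[Float]("/tmp/model.bigdl", "/tmp/model.bin")
```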
Save this module to path in tensorflow readable format
Save this module to path in torch7 readable format
save weights and bias to file
file to save
whether to overwrite or not
The scale of the gradient weight and gradient bias before gradParameters are accumulated.
Set extra parameters to this module. Extra parameters are trainable parameters other than weight and bias, such as runningMean and runningVar in BatchNormalization.
this
Set the line separator used when printing the module
Set the module name
Set the scale of gradientBias
the value of the scale of gradientBias
this
Set the scale of gradientWeight
the value of the scale of gradientWeight
this
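A short sketch of adjusting the gradient scales of a single layer (the values are illustrative):

```scala
// Double the weight-gradient scale, keep the bias-gradient scale at 1
layer.setScaleW(2.0).setScaleB(1.0)
println(s"scaleW=${layer.getScaleW()}, scaleB=${layer.getScaleB()}")
```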
Set weight and bias for the module
array of weights and bias
Generate graph module with start nodes
Module status. It is useful for modules like dropout/batch normalization
Set the module to training mode
"unfreeze" module, i.
"unfreeze" module, i.e. make the module parameters(weight/bias, if exists) to be trained(updated) in training process if names is not empty, unfreeze layers that match given names
array of module names to unFreeze
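A brief sketch of making parameters trainable again (the layer name "conv1" is illustrative):

```scala
// Unfreeze everything
model.unFreeze()

// Or unfreeze only the layers that match the given names
model.unFreeze("conv1")
```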
Computes the gradient of the module with respect to its own input. This is returned in gradInput, and the gradInput state variable is updated accordingly.
Computes the output using the current parameter set of the class and the input. This function returns the result, which is stored in the output field.
If the module has parameters, this zeroes the accumulation of the gradients with respect to these parameters. Otherwise, it does nothing.
Save this module to path.
path to save the module; local file system, HDFS and Amazon S3 are supported. An HDFS path should be like "hdfs://[host]:[port]/xxx", an Amazon S3 path should be like "s3a://bucket/xxx"
if overwrite
self
(Since version 0.3.0) please use recommended saveModule(path, overWrite)
Computes the grayscale dilation of a 4-D input tensor and a 3-D filter tensor.

This layer takes a Table of two tensors as inputs, namely input and filter. The input tensor has shape [batch, in_height, in_width, depth] and the filter tensor has shape [filter_height, filter_width, depth], i.e., each input channel is processed independently of the others with its own structuring function. The output tensor has shape [batch, out_height, out_width, depth]. The spatial dimensions of the output tensor depend on the padding algorithm. We currently only support the "NHWC" DataFormat.

In detail, the grayscale morphological 2-D dilation is the max-sum correlation:

output[b, y, x, c] = max_{dy, dx} input[b, strides[1] * y + rates[1] * dy, strides[2] * x + rates[2] * dx, c] + filter[dy, dx, c]

Max-pooling is a special case when the filter has size equal to the pooling kernel size and contains all zeros.

Note on duality: the dilation of input by the filter is equal to the negation of the erosion of -input by the reflected filter.