Applies an element-wise abs operation to the input Tensor.
Measures the mean absolute value of the element-wise difference between input and target.
Adds a bias term to the input data.
Adds a constant to the input.
Generates a regular grid of multi-scale, multi-aspect anchor boxes.
This loss function measures the Binary Cross Entropy between the target and the output: loss(o, t) = -1/n * sum_i (t[i] * log(o[i]) + (1 - t[i]) * log(1 - o[i])), or, when the weights argument is specified: loss(o, t) = -1/n * sum_i weights[i] * (t[i] * log(o[i]) + (1 - t[i]) * log(1 - o[i]))
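As an illustrative sketch only (not the library API; the eps term is an added numerical-stability assumption), the unweighted and weighted forms can be written in NumPy as:

    import numpy as np

    def bce_loss(o, t, weights=None, eps=1e-12):
        # per-element binary cross entropy, averaged over all n elements
        term = t * np.log(o + eps) + (1 - t) * np.log(1 - o + eps)
        if weights is not None:
            # optional per-element weights, as in the formula above
            term = weights * term
        return -np.mean(term)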
This layer implements Batch Normalization as described in the paper "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift" by Sergey Ioffe and Christian Szegedy (https://arxiv.org/abs/1502.03167).
This layer implements a bidirectional recurrent neural network.
Creates a module that takes a Tensor as input and outputs two tables, splitting the Tensor along the specified dimension.
Applies a bilinear transformation with sparse inputs. The input tensor given in forward(input) is a table containing both inputs x_1 and x_2, which are tensors of size N x inputDimension1 and N x inputDimension2, respectively.
Threshold input Tensor.
This class is an implementation of Binary TreeLSTM (Constituency Tree LSTM).
Bottle allows varying dimensionality input to be forwarded through any module that accepts input of nInputDim dimensions, and generates output of nOutputDim dimensions.
This layer has a bias tensor of the given size; the bias is added element-wise to the input.
Merges the tensors in the input table by adding them element-wise.
Merges the tensors in the input table by taking their element-wise average.
Takes a table with two Tensors and returns their component-wise division.
Takes a table of Tensors and outputs the max of all of them.
Takes a table of Tensors and outputs the min of all of them.
This layer has a weight tensor of the given size; the weight is multiplied element-wise with the input.
Takes a table of Tensors and outputs their element-wise product.
Takes a table with two Tensors and returns their component-wise subtraction.
This is the same as the cross-entropy criterion, except that the target tensor is a one-hot tensor.
The Cell class is the superclass of all recurrent kernels, such as RnnCell, LSTM and GRU.
A kind of hard tanh activation function with integer min and max.
The negative log likelihood criterion.
ClassSimplexCriterion implements a criterion for classification.
Concat concatenates the output of one layer of "parallel" modules along the provided dimension: they take the same inputs, and their output is concatenated.
ConcatTable is a container module like Concat.
Initializer that fills tensors with a given constant value.
Container is an abstract AbstractModule class which declares methods defined in all containers.
Used to make both input and gradOutput contiguous.
Convolution Long Short Term Memory architecture with peephole.
Convolution Long Short Term Memory architecture with peephole.
Cosine calculates the cosine similarity of the input to k mean centers.
Outputs the cosine distance between inputs.
Creates a criterion that measures the loss given an input tensor and target tensor.
Creates a criterion that measures the loss given an input x = {x1, x2}, a table of two Tensors, and a Tensor label y with values 1 or -1.
The negative of the mean cosine proximity between predictions and targets.
Cropping layer for 2D input (e.g. picture).
Cropping layer for 3D data (e.g. spatial or spatio-temporal data).
This criterion combines LogSoftMax and ClassNLLCriterion in one single class.
A layer which takes a table of multiple tensors (n >= 2) as input and calculates the dot product for all combinations of pairs among the input tensors.
Convert DenseTensor to SparseTensor.
Post-processes the output of Faster-RCNN models.
Layer to post-process SSD output.
The Dice-Coefficient criterion. Input: Tensor, target: Tensor.
The Kullback–Leibler divergence criterion
This is a simple table layer which takes a table of two tensors as input and calculates the dot product between them as output.
Compute the dot product of input and target tensor.
Dropout masks (sets to zero) parts of the input using a Bernoulli distribution.
DynamicContainer allows the user to change its submodules after it is created.
Applies the exponential linear unit (ELU), as described in "Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)" by Djork-Arné Clevert, Thomas Unterthiner and Sepp Hochreiter (https://arxiv.org/abs/1511.07289).
This module is for debugging purposes; it can print the activation and gradient in your model topology.
Outputs the Euclidean distance of the input to outputSize centers.
Applies element-wise exp to input tensor.
This is a table layer which takes an arbitrarily deep table of Tensors (potentially nested) as input and produces a flat table of Tensors without any nested tables.
Manages frames in the scheduler.
Gated Recurrent Units architecture.
Computes the log-likelihood of a sample x given a Gaussian distribution p.
Apply multiplicative 1-centered Gaussian noise.
Apply additive zero-centered Gaussian noise.
Takes {mean, log_variance} as input and samples from the Gaussian distribution.
It is a simple module that preserves the input, but takes the gradient from the subsequent layer, multiplies it by -lambda and passes it to the preceding layer.
A graph container.
This is a transfer layer which applies the hard shrinkage function element-wise to the input Tensor.
Applies a segment-wise linear approximation of the sigmoid function.
Applies HardTanh to each element of the input. HardTanh is defined as: f(x) = maxValue if x > maxValue; minValue if x < minValue; x otherwise.
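A minimal NumPy sketch of this piecewise definition (illustrative only; the default minValue/maxValue of -1 and 1 are assumptions):

    import numpy as np

    def hard_tanh(x, min_value=-1.0, max_value=1.0):
        # clamp every element into [min_value, max_value]
        return np.clip(x, min_value, max_value)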
Creates a criterion that measures the loss given an input x which is a 1-dimensional vector and a label y (1 or -1).
Identity just returns the input as output.
Applies the Tensor index operation along the given dimension.
Reshape the input tensor with automatic size inference support.
Initialization method to initialize bias and weight.
The Input layer does nothing to the input tensors; it just passes them through.
It is a table module which takes a table of Tensors as input and outputs a Tensor by joining them together along the specified dimension.
Computes the KL-divergence of the input normal distribution to a standard normal distribution.
This method is the same as the kullback_leibler_divergence loss in Keras.
Computes the L1 norm of the input, and the sign of the input.
Creates a criterion that measures the loss given an input x = {x1, x2}, a table of two Tensors, and a label y (1 or -1):
Adds an L1 penalty to an input (for sparsity).
Long Short Term Memory architecture.
Long Short Term Memory architecture with peephole.
It is a transfer module that applies LeakyReLU; the parameter negval sets the slope of the negative part. LeakyReLU is defined as: f(x) = max(0, x) + negval * min(0, x).
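An illustrative NumPy sketch of the formula above (the default negval of 0.01 is an assumption, not the library default):

    import numpy as np

    def leaky_relu(x, negval=0.01):
        # f(x) = max(0, x) + negval * min(0, x)
        return np.maximum(0, x) + negval * np.minimum(0, x)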
The Linear module applies a linear transformation to the input data, i.e. y = Wx + b.
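A minimal NumPy sketch of this transformation for a mini-batch (illustrative only; the shapes are assumptions: x is n x inputSize, W is outputSize x inputSize, b is outputSize):

    import numpy as np

    def linear_forward(x, W, b):
        # y = x W^T + b, applied row-wise to the mini-batch
        return x @ W.T + b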
The LocallyConnected2D layer works similarly to the SpatialConvolution layer, except that weights are unshared, that is, a different set of filters is applied at each different patch of the input.
The Log module applies a log transformation to the input data.
This class is a transform layer corresponding to the log-sigmoid function: f(x) = log(1 / (1 + e^(-x))).
The LogSoftMax module applies a LogSoftMax transformation to the input data, defined as: f_i(x) = log(exp(x_i) / a), where a = sum_j exp(x_j).
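Equivalently, f_i(x) = x_i - log(sum_j exp(x_j)). An illustrative, numerically stable NumPy sketch for a single vector:

    import numpy as np

    def log_softmax(x):
        # shifting by the max does not change the result and avoids overflow
        shifted = x - np.max(x)
        return shifted - np.log(np.sum(np.exp(shifted)))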
This layer is a particular case of a convolution, where the width of the convolution would be 1.
LookupTable for multi-values.
Module to perform matrix multiplication on two mini-batch inputs, producing a mini-batch.
The mean squared error criterion, i.e. loss(x, y) = 1/n * sum_i (x[i] - y[i])^2.
It is a module to perform matrix vector multiplication on two mini-batch inputs, producing a mini-batch.
This class is a container for a single module which will be applied to all input elements.
Creates a criterion that optimizes a two-class classification (squared) hinge loss (margin-based loss) between input x (a Tensor of dimension 1) and output y.
Creates a criterion that measures the loss given an input x = {x1, x2}, a table of two Tensors of size 1 (they contain only scalars), and a label y (1 or -1).
Performs a torch.MaskedSelect on a Tensor, selecting the elements of the input where the given mask is 1.
Masking uses a mask value to skip timesteps for a sequence.
Applies a max operation over dimension dim.
Maxout: a linear maxout layer. The Maxout layer selects the element-wise maximum value of maxoutNumber Linear(inputSize, outputSize) layers.
It is a simple layer which applies a mean operation over the given dimension.
This method is the same as the mean_absolute_percentage_error loss in Keras.
This method is the same as the mean_squared_logarithmic_error loss in Keras.
Applies a min operation over dimension dim.
Creates a module that takes a table {gater, experts} as input and outputs the mixture of experts (a Tensor or table of Tensors) using a gater Tensor.
A filler based on the paper [He, Zhang, Ren and Sun 2015]; it specifically accounts for ReLU nonlinearities.
Multiplies the incoming data by a single scalar factor.
Multiplies input Tensor by a (non-learnable) scalar constant.
A weighted sum of other criterions, each applied to the same input and target.
Creates a criterion that optimizes a multi-class multi-classification hinge loss (margin-based loss) between input x and output y (which is a Tensor of target class indices).
A MultiLabel multi-class criterion based on sigmoid.
Creates a criterion that optimizes a multi-class classification hinge loss (margin-based loss) between input x and output y (which is a target class index).
Enables the user to stack multiple simple cells.
Narrow is the application of the narrow operation in a module.
Creates a module that takes a table as input and outputs the subtable starting at index offset having length elements (defaults to 1 element).
Computes the negative value of each element of the input tensor.
Penalize the input multinomial distribution if it has low entropy.
Non-Maximum Suppression (NMS) for object detection. NMS addresses the problem that several detections cluster near the real object location; the goal is to obtain, ideally, only one detection per object.
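A minimal greedy NMS sketch in NumPy (illustrative only; the [x1, y1, x2, y2] box format and the IoU threshold parameter are assumptions, not this layer's exact interface):

    import numpy as np

    def nms(boxes, scores, iou_threshold=0.5):
        # keep the highest-scoring box, drop remaining boxes that overlap it too much, repeat
        order = np.argsort(scores)[::-1]
        keep = []
        while order.size > 0:
            i = order[0]
            keep.append(i)
            rest = order[1:]
            # intersection of the kept box with the remaining boxes
            x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
            y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
            x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
            y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
            inter = np.maximum(0, x2 - x1) * np.maximum(0, y2 - y1)
            area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
            area_rest = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
            iou = inter / (area_i + area_rest - inter)
            order = rest[iou <= iou_threshold]
        return keep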
Normalizes the input Tensor to have unit L_p norm.
NormalizeScale is composed of normalize and scale; it is equivalent to the Caffe Normalize layer.
The Criterion to compute the negative policy gradient given a multinomial distribution and the sampled action and reward.
Applies parametric ReLU, which parameter varies the slope of the negative part.
Stacks a list of n-dimensional tensors into one (n+1)-dimensional tensor.
This module adds pad units of padding to dimension dim of the input.
It is a module that takes a table of two vectors as input and outputs the distance between them using the p-norm.
ParallelCriterion is a weighted sum of other criterions each applied to a different input and target.
It is a container module that applies the i-th member module to the i-th input, and outputs the results in the form of a Table.
This class is the same as the Poisson loss in Keras.
Apply an element-wise power operation with scale and shift.
Generates the prior boxes of designated sizes and aspect ratios across all dimensions (H * W). Intended for use with the MultiBox detection method to generate prior boxes.
Outputs object detection proposals by applying estimated bounding-box transformations to a set of regular boxes (called "anchors").
Applies the randomized leaky rectified linear unit (RReLU) element-wise to the input Tensor, thus outputting a Tensor of the same dimension.
Initializer that generates tensors with a normal distribution.
Initializer that generates tensors with a uniform distribution.
Applies the rectified linear unit (ReLU) function element-wise to the input Tensor; thus the output is a Tensor of the same dimension. The ReLU function is defined as: f(x) = max(0, x).
Same as ReLU, except that the rectifying function f(x) saturates at x = 6. ReLU6 is defined as: f(x) = min(max(0, x), 6).
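Illustrative NumPy sketches of the two definitions above:

    import numpy as np

    def relu(x):
        # f(x) = max(0, x)
        return np.maximum(0, x)

    def relu6(x):
        # f(x) = min(max(0, x), 6)
        return np.minimum(np.maximum(0, x), 6)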
The Recurrent module is a container of RNN cells. Different types of RNN cells can be added using the add() function.
The RecurrentDecoder module is a container of RNN cells used to make a prediction for the next timestep based on the prediction made at the previous timestep.
Replicate repeats the input nFeatures times along its dim dimension.
The forward(input) reshapes the input tensor into a size(0) * size(1) * ... tensor, taking the elements row-wise.
Resize the input image with bilinear interpolation.
Reverses the input w.r.t. the given dimension.
Implementation of a vanilla recurrent neural network cell. i2h: weight matrix from input to hidden units; h2h: weight matrix from hidden units to themselves through time. The update is defined as: h_t = f(i2h * x_t + h2h * h_{t-1}).
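A minimal NumPy sketch of one update step (illustrative only; using tanh for f and omitting bias terms are assumptions):

    import numpy as np

    def rnn_cell_step(x_t, h_prev, i2h, h2h):
        # h_t = f(i2h * x_t + h2h * h_{t-1})
        return np.tanh(i2h @ x_t + h2h @ h_prev)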
Region of interest pooling. RoIPooling uses max pooling to convert the features inside any valid region of interest into a small feature map with a fixed spatial extent of pooledH × pooledW (e.g. 7 × 7).
S-shaped Rectified Linear Unit.
Scale is the combination of CMul and CAdd. It computes the element-wise product of input and weight, with the shape of the weight "expanded" to match the shape of the input.
A simple layer that selects an index of the input tensor along the given dimension.
Creates a module that takes a table as input and outputs the element at index index (positive or negative).
Sequential provides a means to plug layers together in a feed-forward fully connected manner.
Applies the Sigmoid function element-wise to the input Tensor, thus outputting a Tensor of the same dimension.
Creates a criterion that can be thought of as a smooth version of the AbsCriterion.
A smooth version of the AbsCriterion: it uses a squared term if the absolute element-wise error falls below 1.
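An illustrative NumPy sketch using the common Smooth L1 constants (0.5 * d^2 below 1, d - 0.5 otherwise); treat the exact scaling as an assumption:

    import numpy as np

    def smooth_l1(x, y):
        d = np.abs(x - y)
        # squared term for small errors, linear term for large errors
        return np.mean(np.where(d < 1, 0.5 * d ** 2, d - 0.5))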
Creates a criterion that optimizes a two-class classification logistic loss between input x (a Tensor of dimension 1) and output y (which is a tensor containing either 1s or -1s).
Applies the SoftMax function to an n-dimensional input Tensor, rescaling them so that the elements of the n-dimensional output Tensor lie in the range (0, 1) and sum to 1.
Applies the SoftMin function to an n-dimensional input Tensor, rescaling them so that the elements of the n-dimensional output Tensor lie in the range (0,1) and sum to 1.
Apply the SoftPlus function to an n-dimensional input tensor.
Apply the soft shrinkage function element-wise to the input Tensor.
Apply SoftSign function to an n-dimensional input Tensor.
Computes the multinomial logistic loss for a one-of-many classification task, passing real-valued predictions through a softmax to get a probability distribution over classes.
:: Experimental ::
SparseLinear is the sparse version of module Linear.
Applies 2D average-pooling operation in kWxkH regions by step size dWxdH steps.
This layer implements Batch Normalization as described in the paper "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift" by Sergey Ioffe and Christian Szegedy. This implementation is useful for inputs coming from convolution layers.
Subtractive + divisive contrast normalization.
Applies a 2D convolution over an input image composed of several input planes.
This class is a generalization of SpatialConvolution.
Applies Spatial Local Response Normalization between different feature maps.
Apply a 2D dilated convolution over an input image.
Applies a spatial division operation on a series of 2D inputs using kernel for computing the weighted average in a neighborhood.
This version performs the same function as Dropout, however it drops entire 1D feature maps instead of individual elements.
This version performs the same function as Dropout, however it drops entire 2D feature maps instead of individual elements.
This version performs the same function as Dropout, however it drops entire 3D feature maps instead of individual elements.
Apply a 2D full convolution over an input image.
Applies 2D max-pooling operation in kWxkH regions by step size dWxdH steps.
Separable convolutions consist of first performing a depthwise spatial convolution (which acts on each input channel separately), followed by a pointwise convolution which mixes together the resulting output channels.
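A simplified NumPy sketch of the two stages (illustrative only; assumes stride 1, no padding and a depth multiplier of 1):

    import numpy as np

    def separable_conv2d(x, depthwise_k, pointwise_k):
        # x: (C_in, H, W); depthwise_k: (C_in, kh, kw); pointwise_k: (C_out, C_in)
        C_in, H, W = x.shape
        _, kh, kw = depthwise_k.shape
        Ho, Wo = H - kh + 1, W - kw + 1
        # depthwise step: each input channel is convolved with its own kernel
        dw = np.zeros((C_in, Ho, Wo))
        for c in range(C_in):
            for i in range(Ho):
                for j in range(Wo):
                    dw[c, i, j] = np.sum(x[c, i:i + kh, j:j + kw] * depthwise_k[c])
        # pointwise step: a 1x1 convolution that mixes the channels
        return np.tensordot(pointwise_k, dw, axes=([1], [0]))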
Applies a spatial subtraction operation on a series of 2D inputs using kernel for computing the weighted average in a neighborhood.
The local response normalization layer performs a kind of “lateral inhibition” by normalizing over local input regions.
Each feature map of a given input is padded with the specified number of zeros.
Creates a module that takes a Tensor as input and outputs several tables, splitting the Tensor along the specified dimension.
Apply an element-wise sqrt operation.
Apply an element-wise square operation.
Delete all singleton dimensions or a specific singleton dimension.
A graph container.
It is a simple layer which applies a sum operation over the given dimension.
Applies the Tanh function element-wise to the input Tensor, thus outputting a Tensor of the same dimension.
A simple layer that, for each element of the input tensor, performs the following operation during the forward pass: f(x) = x - tanh(x).
Applies a 1D convolution over an input sequence composed of nInputFrame frames.
Applies 1D max-pooling operation in kW regions by step size dW steps.
TensorTree class is used to decode a tensor to a tree structure.
Threshold input Tensor.
Tile repeats the input nFeatures times along its dim dimension.
This layer is intended to apply the contained layer to each temporal time slice of the input tensor.
This class is intended to support inputs with 3 or more dimensions.
This class is intended to support inputs with 3 or more dimensions.
A criterion that takes two modules to transform the input and target, and one criterion to compute the loss on the transformed input and target.
Transposes the input along the specified dimensions.
Inserts a singleton dimension (i.e. a dimension of size 1) at the specified position.
Upsampling layer for 1D inputs.
Upsampling layer for 2D inputs.
Upsampling layer for 3D inputs.
VariableFormat describes the meaning of each dimension of the variable (the trainable parameters of a model, like weight and bias) and can be used to return the fan-in and fan-out size of the variable when provided with the variable shape.
This module creates a new view of the input tensor using the sizes passed to the constructor.
Applies 3D average-pooling operation in kTxkWxkH regions by step size dTxdWxdH.
Applies a 3D convolution over an input image composed of several input planes.
Apply a 3D full convolution over an 3D input image, a sequence of images, or a video etc.
Applies 3D max-pooling operation in kTxkWxkH regions by step size dTxdWxdH.
Initialize the weight with coefficients for bilinear interpolation.
Initializer that generates tensors with zeros.
Initializer that generates tensors with a uniform distribution.
In short, it helps signals reach deep into the network.
Initializer that generates tensors with zeros.