bigdl.nn package

Submodules

bigdl.nn.criterion module

class bigdl.nn.criterion.AbsCriterion(size_average=True, bigdl_type='float')[source]

Bases: bigdl.nn.criterion.Criterion

Measures the mean absolute value of the element-wise difference between the input and the target

>>> absCriterion = AbsCriterion(True)
creating: createAbsCriterion
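The loss above can be reproduced with a short numpy sketch (a plain re-implementation of the formula, not a BigDL call):

```python
import numpy as np

def abs_criterion(x, y, size_average=True):
    # Mean (or, with size_average=False, sum) of element-wise absolute differences.
    err = np.abs(x - y).sum()
    return err / x.size if size_average else err

loss = abs_criterion(np.array([1.0, 2.0, 3.0]), np.array([2.0, 2.0, 5.0]))
```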
class bigdl.nn.criterion.BCECriterion(weights=None, size_average=True, bigdl_type='float')[source]

Bases: bigdl.nn.criterion.Criterion

Creates a criterion that measures the Binary Cross Entropy between the target and the output

Parameters:
  • weights – weights for each class
  • sizeAverage – whether to average the loss or not
>>> np.random.seed(123)
>>> weights = np.random.uniform(0, 1, (2,)).astype("float32")
>>> bCECriterion = BCECriterion(weights)
creating: createBCECriterion
>>> bCECriterion = BCECriterion()
creating: createBCECriterion
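The binary cross entropy computed by this criterion can be sketched in numpy as follows (a re-implementation of the formula for illustration, not the BigDL kernel):

```python
import numpy as np

def bce(p, y, weights=None, size_average=True):
    # Binary cross entropy: -w * (y*log(p) + (1-y)*log(1-p)),
    # averaged over elements when size_average is True.
    w = np.ones_like(p) if weights is None else weights
    terms = -w * (y * np.log(p) + (1 - y) * np.log(1 - p))
    return terms.mean() if size_average else terms.sum()

loss = bce(np.array([0.9, 0.1]), np.array([1.0, 0.0]))
```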
class bigdl.nn.criterion.ClassNLLCriterion(weights=None, size_average=True, bigdl_type='float')[source]

Bases: bigdl.nn.criterion.Criterion

The negative log likelihood criterion. It is useful to train a classification problem with n classes. If provided, the optional argument weights should be a 1D Tensor assigning weight to each of the classes. This is particularly useful when you have an unbalanced training set.

The input given through a forward() is expected to contain log-probabilities of each class: input has to be a 1D Tensor of size n. Obtaining log-probabilities in a neural network is easily achieved by adding a LogSoftMax layer as the last layer of your neural network. You may use CrossEntropyCriterion instead, if you prefer not to add an extra layer to your network. This criterion expects a class index (1 to the number of classes) as target when calling forward(input, target) and backward(input, target).

The loss can be described as: loss(x, class) = -x[class] or in the case of the weights argument it is specified as follows: loss(x, class) = -weights[class] * x[class] Due to the behaviour of the backend code, it is necessary to set sizeAverage to false when calculating losses in non-batch mode.

Note that if the target is -1, the training process will skip this sample. In other words, the forward pass will return zero output and the backward pass will also return zero gradInput.

By default, the losses are averaged over observations for each minibatch. However, if the field sizeAverage is set to false, the losses are instead summed for each minibatch.

Parameters:
  • weights – weights of each class
  • size_average – whether to average or not
>>> np.random.seed(123)
>>> weights = np.random.uniform(0, 1, (2,)).astype("float32")
>>> classNLLCriterion = ClassNLLCriterion(weights,True)
creating: createClassNLLCriterion
>>> classNLLCriterion = ClassNLLCriterion()
creating: createClassNLLCriterion
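The loss formula above, loss(x, class) = -weights[class] * x[class] with 1-based class indices, can be sketched in numpy (a plain re-implementation for illustration, not a BigDL call):

```python
import numpy as np

def class_nll(log_probs, target, weights=None, size_average=True):
    # loss(x, class) = -weights[class] * x[class]; targets are 1-based class indices.
    w = np.ones(log_probs.shape[-1]) if weights is None else weights
    losses = np.array([-w[t - 1] * log_probs[i, t - 1]
                       for i, t in enumerate(target)])
    return losses.mean() if size_average else losses.sum()

loss = class_nll(np.log(np.array([[0.7, 0.3], [0.2, 0.8]])), [1, 2])
```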
class bigdl.nn.criterion.ClassSimplexCriterion(n_classes, bigdl_type='float')[source]

Bases: bigdl.nn.criterion.Criterion

ClassSimplexCriterion implements a criterion for classification. It learns an embedding per class, where each class’ embedding is a point on an (N-1)-dimensional simplex, where N is the number of classes.

Parameters:nClasses – the number of classes.
>>> classSimplexCriterion = ClassSimplexCriterion(2)
creating: createClassSimplexCriterion
class bigdl.nn.criterion.CosineDistanceCriterion(size_average=True, bigdl_type='float')[source]

Bases: bigdl.nn.criterion.Criterion

Creates a criterion that measures the loss given an input and target, Loss = 1 - cos(x, y)

>>> cosineDistanceCriterion = CosineDistanceCriterion(True)
creating: createCosineDistanceCriterion
>>> cosineDistanceCriterion.forward(np.array([1.0, 2.0, 3.0, 4.0, 5.0]),
...                                   np.array([5.0, 4.0, 3.0, 2.0, 1.0]))
0.07272728
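The doctest value above can be reproduced with a numpy sketch of Loss = 1 - cos(x, y); note that with size_average the result appears to be further divided by the number of elements, which is inferred from the doctest output rather than stated explicitly:

```python
import numpy as np

def cosine_distance_loss(x, y, size_average=True):
    # Loss = 1 - cos(x, y); dividing by the element count when size_average
    # reproduces the doctest value above.
    cos = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    loss = 1.0 - cos
    return loss / x.size if size_average else loss

loss = cosine_distance_loss(np.array([1.0, 2.0, 3.0, 4.0, 5.0]),
                            np.array([5.0, 4.0, 3.0, 2.0, 1.0]))
```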
class bigdl.nn.criterion.CosineEmbeddingCriterion(margin=0.0, size_average=True, bigdl_type='float')[source]

Bases: bigdl.nn.criterion.Criterion

Creates a criterion that measures the loss given an input x = {x1, x2}, a table of two Tensors, and a Tensor label y with values 1 or -1.

Parameters:margin – a number from -1 to 1, 0 to 0.5 is suggested
>>> cosineEmbeddingCriterion = CosineEmbeddingCriterion(1e-5, True)
creating: createCosineEmbeddingCriterion
>>> cosineEmbeddingCriterion.forward([np.array([1.0, 2.0, 3.0, 4.0, 5.0]),
...                                   np.array([5.0, 4.0, 3.0, 2.0, 1.0])],
...                                 [np.ones(5)])
0.0
class bigdl.nn.criterion.Criterion(jvalue, bigdl_type, *args)[source]

Bases: bigdl.util.common.JavaValue

Criterion is helpful to train a neural network. Given an input and a target, a criterion computes a gradient according to a given loss function.

backward(input, target)[source]

NB: It’s for debug only, please use optimizer.optimize() in production. Performs a back-propagation step through the criterion, with respect to the given input.

Parameters:
  • input – ndarray or list of ndarray
  • target – ndarray or list of ndarray
Returns:

ndarray

forward(input, target)[source]

NB: It’s for debug only, please use optimizer.optimize() in production. Takes an input object, and computes the corresponding loss of the criterion, compared with target

Parameters:
  • input – ndarray or list of ndarray
  • target – ndarray or list of ndarray
Returns:

value of loss

classmethod of(jcriterion, bigdl_type='float')[source]

Create a Python Criterion from a Java criterion object

Parameters:jcriterion – A Java criterion object created by Py4j
Returns:a criterion.
class bigdl.nn.criterion.CrossEntropyCriterion(weights=None, size_average=True, bigdl_type='float')[source]

Bases: bigdl.nn.criterion.Criterion

This criterion combines LogSoftMax and ClassNLLCriterion in one single class.

Parameters:weights – A tensor assigning weight to each of the classes
>>> np.random.seed(123)
>>> weights = np.random.uniform(0, 1, (2,)).astype("float32")
>>> cec = CrossEntropyCriterion(weights)
creating: createCrossEntropyCriterion
>>> cec = CrossEntropyCriterion()
creating: createCrossEntropyCriterion
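The combination of LogSoftMax and ClassNLLCriterion described above can be sketched in numpy (a re-implementation for illustration, not the BigDL kernel):

```python
import numpy as np

def cross_entropy(logits, target, size_average=True):
    # LogSoftMax followed by ClassNLL; targets are 1-based class indices.
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    losses = np.array([-log_probs[i, t - 1] for i, t in enumerate(target)])
    return losses.mean() if size_average else losses.sum()

loss = cross_entropy(np.array([[2.0, 1.0], [0.5, 1.5]]), [1, 2])
```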
class bigdl.nn.criterion.DiceCoefficientCriterion(size_average=True, epsilon=1.0, bigdl_type='float')[source]

Bases: bigdl.nn.criterion.Criterion

The Dice-Coefficient criterion. Input: Tensor, target: Tensor.

return:      2 * (input intersection target)
        1 - ----------------------------------
                input union target
>>> diceCoefficientCriterion = DiceCoefficientCriterion(size_average = True, epsilon = 1.0)
creating: createDiceCoefficientCriterion
>>> diceCoefficientCriterion = DiceCoefficientCriterion()
creating: createDiceCoefficientCriterion
class bigdl.nn.criterion.DistKLDivCriterion(size_average=True, bigdl_type='float')[source]

Bases: bigdl.nn.criterion.Criterion

The Kullback-Leibler divergence criterion

Parameters:size_average – whether to average the loss
>>> distKLDivCriterion = DistKLDivCriterion(True)
creating: createDistKLDivCriterion
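A numpy sketch of the KL-divergence loss, assuming (as in the standard Torch formulation this criterion follows) that the input holds log-probabilities and the target entries are strictly positive; this is an illustrative re-implementation, not the BigDL kernel:

```python
import numpy as np

def dist_kl_div(log_input, target, size_average=True):
    # KL(target || input) = sum_i target_i * (log(target_i) - log_input_i),
    # with log_input holding log-probabilities and target_i > 0 (an assumption
    # of this sketch).
    s = (target * (np.log(target) - log_input)).sum()
    return s / target.size if size_average else s

loss = dist_kl_div(np.log(np.array([0.25, 0.75])), np.array([0.5, 0.5]))
```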
class bigdl.nn.criterion.GaussianCriterion(bigdl_type='float')[source]

Bases: bigdl.nn.criterion.Criterion

Computes the log-likelihood of a sample x given a Gaussian distribution p.

>>> GaussianCriterion = GaussianCriterion()
creating: createGaussianCriterion

class bigdl.nn.criterion.HingeEmbeddingCriterion(margin=1.0, size_average=True, bigdl_type='float')[source]

Bases: bigdl.nn.criterion.Criterion

Creates a criterion that measures the loss given an input x which is a 1-dimensional vector and a label y (1 or -1). This is usually used for measuring whether two inputs are similar or dissimilar, e.g. using the L1 pairwise distance, and is typically used for learning nonlinear embeddings or semi-supervised learning.

If x and y are n-dimensional Tensors, the sum operation still operates over all the elements, and divides by n (this can be avoided if one sets the internal variable sizeAverage to false). The margin has a default value of 1, or can be set in the constructor.

>>> hingeEmbeddingCriterion = HingeEmbeddingCriterion(1e-5, True)
creating: createHingeEmbeddingCriterion
class bigdl.nn.criterion.KLDCriterion(bigdl_type='float')[source]

Bases: bigdl.nn.criterion.Criterion

Computes the KL-divergence of the Gaussian distribution.

>>> KLDCriterion = KLDCriterion()
creating: createKLDCriterion

class bigdl.nn.criterion.L1Cost(bigdl_type='float')[source]

Bases: bigdl.nn.criterion.Criterion

Computes the L1 norm of the input, and the sign of the input

>>> l1Cost = L1Cost()
creating: createL1Cost
class bigdl.nn.criterion.L1HingeEmbeddingCriterion(margin=1.0, bigdl_type='float')[source]

Bases: bigdl.nn.criterion.Criterion

Creates a criterion that measures the loss given an input x = {x1, x2}, a table of two Tensors, and a label y (1 or -1):

Parameters:margin
>>> l1HingeEmbeddingCriterion = L1HingeEmbeddingCriterion(1e-5)
creating: createL1HingeEmbeddingCriterion
>>> l1HingeEmbeddingCriterion = L1HingeEmbeddingCriterion()
creating: createL1HingeEmbeddingCriterion
>>> input1 = np.array([2.1, -2.2])
>>> input2 = np.array([-0.55, 0.298])
>>> input = [input1, input2]
>>> target = np.array([1.0])
>>> result = l1HingeEmbeddingCriterion.forward(input, target)
>>> (result == 5.148)
True
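The doctest result above (5.148 for y = 1) follows from the L1 pairwise distance; a numpy sketch of the loss, with the y = -1 branch written per the usual hinge-embedding form (an illustrative re-implementation, not a BigDL call):

```python
import numpy as np

def l1_hinge_embedding(x1, x2, y, margin=1.0):
    # loss = ||x1 - x2||_1                   if y ==  1
    #        max(0, margin - ||x1 - x2||_1)  if y == -1
    d = np.abs(x1 - x2).sum()
    return d if y == 1 else max(0.0, margin - d)

loss = l1_hinge_embedding(np.array([2.1, -2.2]), np.array([-0.55, 0.298]), 1)
```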
class bigdl.nn.criterion.MSECriterion(bigdl_type='float')[source]

Bases: bigdl.nn.criterion.Criterion

Creates a criterion that measures the mean squared error between n elements in the input x and output y:

loss(x, y) = 1/n \sum |x_i - y_i|^2

If x and y are d-dimensional Tensors with a total of n elements, the sum operation still operates over all the elements, and divides by n. The two Tensors must have the same number of elements (but their sizes might be different). The division by n can be avoided if one sets the internal variable sizeAverage to false. By default, the losses are averaged over observations for each minibatch. However, if the field sizeAverage is set to false, the losses are instead summed.

>>> mSECriterion = MSECriterion()
creating: createMSECriterion
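The formula loss(x, y) = 1/n * sum |x_i - y_i|^2 can be reproduced with a short numpy sketch (an illustration of the math, not the BigDL kernel):

```python
import numpy as np

def mse(x, y, size_average=True):
    # loss(x, y) = 1/n * sum((x_i - y_i)^2); summed instead when size_average is False.
    err = ((x - y) ** 2).sum()
    return err / x.size if size_average else err

loss = mse(np.array([1.0, 2.0, 3.0]), np.array([1.0, 1.0, 5.0]))
```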
class bigdl.nn.criterion.MarginCriterion(margin=1.0, size_average=True, bigdl_type='float')[source]

Bases: bigdl.nn.criterion.Criterion

Creates a criterion that optimizes a two-class classification hinge loss (margin-based loss) between input x (a Tensor of dimension 1) and output y.

Parameters:
  • margin – if unspecified, is by default 1.
  • size_average – size average in a mini-batch
>>> marginCriterion = MarginCriterion(1e-5, True)
creating: createMarginCriterion
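The two-class hinge loss optimized here can be sketched in numpy, using the standard margin-based form max(0, margin - y_i * x_i) (an illustrative re-implementation, not a BigDL call):

```python
import numpy as np

def margin_hinge(x, y, margin=1.0, size_average=True):
    # Two-class hinge: sum_i max(0, margin - y_i * x_i), averaged when size_average.
    terms = np.maximum(0.0, margin - y * x)
    return terms.mean() if size_average else terms.sum()

loss = margin_hinge(np.array([0.5, -2.0]), np.array([1.0, -1.0]))
```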
class bigdl.nn.criterion.MarginRankingCriterion(margin=1.0, size_average=True, bigdl_type='float')[source]

Bases: bigdl.nn.criterion.Criterion

Creates a criterion that measures the loss given an input x = {x1, x2}, a table of two Tensors of size 1 (they contain only scalars), and a label y (1 or -1). In batch mode, x is a table of two Tensors of size batchsize, and y is a Tensor of size batchsize containing 1 or -1 for each corresponding pair of elements in the input Tensor. If y == 1 then it is assumed the first input should be ranked higher (have a larger value) than the second input, and vice-versa for y == -1.

Parameters:margin
>>> marginRankingCriterion = MarginRankingCriterion(1e-5, True)
creating: createMarginRankingCriterion
class bigdl.nn.criterion.MultiCriterion(bigdl_type='float')[source]

Bases: bigdl.nn.criterion.Criterion

A weighted sum of other criterions, each applied to the same input and target

>>> multiCriterion = MultiCriterion()
creating: createMultiCriterion
>>> mSECriterion = MSECriterion()
creating: createMSECriterion
>>> multiCriterion = multiCriterion.add(mSECriterion)
>>> multiCriterion = multiCriterion.add(mSECriterion)
add(criterion, weight=1.0)[source]
class bigdl.nn.criterion.MultiLabelMarginCriterion(size_average=True, bigdl_type='float')[source]

Bases: bigdl.nn.criterion.Criterion

Creates a criterion that optimizes a multi-class multi-classification hinge loss ( margin-based loss) between input x and output y (which is a Tensor of target class indices)

Parameters:size_average – size average in a mini-batch
>>> multiLabelMarginCriterion = MultiLabelMarginCriterion(True)
creating: createMultiLabelMarginCriterion
class bigdl.nn.criterion.MultiLabelSoftMarginCriterion(weights=None, size_average=True, bigdl_type='float')[source]

Bases: bigdl.nn.criterion.Criterion

A MultiLabel multiclass criterion based on sigmoid: the loss is:

l(x,y) = - sum_i y[i] * log(p[i]) + (1 - y[i]) * log (1 - p[i])

where p[i] = exp(x[i]) / (1 + exp(x[i])) and with weights:

l(x,y) = - sum_i weights[i] (y[i] * log(p[i]) + (1 - y[i]) * log (1 - p[i]))
>>> np.random.seed(123)
>>> weights = np.random.uniform(0, 1, (2,)).astype("float32")
>>> multiLabelSoftMarginCriterion = MultiLabelSoftMarginCriterion(weights)
creating: createMultiLabelSoftMarginCriterion
>>> multiLabelSoftMarginCriterion = MultiLabelSoftMarginCriterion()
creating: createMultiLabelSoftMarginCriterion
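The weighted sigmoid loss given above can be sketched in numpy; this follows the summed form of the formula as written, while the actual BigDL normalization may differ (an illustrative re-implementation, not a BigDL call):

```python
import numpy as np

def multilabel_soft_margin(x, y, weights=None):
    # p_i = sigmoid(x_i);
    # l(x, y) = -sum_i w_i * (y_i*log(p_i) + (1-y_i)*log(1-p_i))
    p = 1.0 / (1.0 + np.exp(-x))
    w = np.ones_like(x) if weights is None else weights
    return -(w * (y * np.log(p) + (1 - y) * np.log(1 - p))).sum()

loss = multilabel_soft_margin(np.array([0.0, 0.0]), np.array([1.0, 0.0]))
```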
class bigdl.nn.criterion.MultiMarginCriterion(p=1, weights=None, margin=1.0, size_average=True, bigdl_type='float')[source]

Bases: bigdl.nn.criterion.Criterion

Creates a criterion that optimizes a multi-class classification hinge loss (margin-based loss) between input x and output y (which is a target class index).

Parameters:
  • p
  • weights
  • margin
  • size_average
>>> np.random.seed(123)
>>> weights = np.random.uniform(0, 1, (2,)).astype("float32")
>>> multiMarginCriterion = MultiMarginCriterion(1,weights)
creating: createMultiMarginCriterion
>>> multiMarginCriterion = MultiMarginCriterion()
creating: createMultiMarginCriterion
class bigdl.nn.criterion.ParallelCriterion(repeat_target=False, bigdl_type='float')[source]

Bases: bigdl.nn.criterion.Criterion

ParallelCriterion is a weighted sum of other criterions each applied to a different input and target. Set repeatTarget = true to share the target for criterions.

Use the add(criterion[, weight]) method to add a criterion, where weight is a scalar (default 1).

Parameters:repeat_target – Whether to share the target for all criterions.
>>> parallelCriterion = ParallelCriterion(True)
creating: createParallelCriterion
>>> mSECriterion = MSECriterion()
creating: createMSECriterion
>>> parallelCriterion = parallelCriterion.add(mSECriterion)
>>> parallelCriterion = parallelCriterion.add(mSECriterion)
add(criterion, weight=1.0)[source]
class bigdl.nn.criterion.SmoothL1Criterion(size_average=True, bigdl_type='float')[source]

Bases: bigdl.nn.criterion.Criterion

Creates a criterion that can be thought of as a smooth version of the AbsCriterion. It uses a squared term if the absolute element-wise error falls below 1. It is less sensitive to outliers than the MSECriterion and in some cases prevents exploding gradients (e.g. see “Fast R-CNN” paper by Ross Girshick).

                      | 0.5 * (x_i - y_i)^2^, if |x_i - y_i| < 1
loss(x, y) = 1/n \sum |
                      | |x_i - y_i| - 0.5,   otherwise

If x and y are d-dimensional Tensors with a total of n elements, the sum operation still operates over all the elements, and divides by n. The division by n can be avoided if one sets the internal variable sizeAverage to false.

Parameters:size_average – whether to average the loss
>>> smoothL1Criterion = SmoothL1Criterion(True)
creating: createSmoothL1Criterion
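The piecewise formula above can be reproduced with a numpy sketch (an illustration of the math, not the BigDL kernel):

```python
import numpy as np

def smooth_l1(x, y, size_average=True):
    # 0.5*(x_i - y_i)^2 where |x_i - y_i| < 1, and |x_i - y_i| - 0.5 otherwise.
    d = np.abs(x - y)
    terms = np.where(d < 1.0, 0.5 * d ** 2, d - 0.5)
    return terms.mean() if size_average else terms.sum()

loss = smooth_l1(np.array([0.0, 3.0]), np.array([0.5, 1.0]))
```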
class bigdl.nn.criterion.SmoothL1CriterionWithWeights(sigma, num=0, bigdl_type='float')[source]

Bases: bigdl.nn.criterion.Criterion

A smooth version of the AbsCriterion. It uses a squared term if the absolute element-wise error falls below 1. It is less sensitive to outliers than the MSECriterion and in some cases prevents exploding gradients (e.g. see “Fast R-CNN” paper by Ross Girshick).

d = (x - y) * w_in
loss(x, y, w_in, w_out)
           | 0.5 * (sigma * d_i)^2 * w_out          if |d_i| < 1 / sigma / sigma
= 1/n \sum |
           | (|d_i| - 0.5 / sigma / sigma) * w_out   otherwise
>>> smoothL1CriterionWithWeights = SmoothL1CriterionWithWeights(1e-5, 1)
creating: createSmoothL1CriterionWithWeights
class bigdl.nn.criterion.SoftMarginCriterion(size_average=True, bigdl_type='float')[source]

Bases: bigdl.nn.criterion.Criterion

Creates a criterion that optimizes a two-class classification logistic loss between input x (a Tensor of dimension 1) and output y (which is a tensor containing either 1s or -1s).

loss(x, y) = sum_i (log(1 + exp(-y[i]*x[i]))) / x:nElement()
Parameters:size_average – The normalization by the number of elements in the input can be disabled by setting size_average to False.
>>> softMarginCriterion = SoftMarginCriterion(False)
creating: createSoftMarginCriterion
>>> softMarginCriterion = SoftMarginCriterion()
creating: createSoftMarginCriterion
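The logistic loss formula above maps directly to numpy (an illustration of the math, not the BigDL kernel):

```python
import numpy as np

def soft_margin(x, y):
    # loss(x, y) = sum_i log(1 + exp(-y_i * x_i)) / n, per the formula above.
    return np.log1p(np.exp(-y * x)).sum() / x.size

loss = soft_margin(np.array([0.0, 0.0]), np.array([1.0, -1.0]))
```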
class bigdl.nn.criterion.SoftmaxWithCriterion(ignore_label=None, normalize_mode='VALID', bigdl_type='float')[source]

Bases: bigdl.nn.criterion.Criterion

Computes the multinomial logistic loss for a one-of-many classification task, passing real-valued predictions through a softmax to get a probability distribution over classes. It should be preferred over separate SoftmaxLayer + MultinomialLogisticLossLayer as its gradient computation is more numerically stable.

Parameters:
  • ignore_label – (optional) Specify a label value that should be ignored when computing the loss.
  • normalizeMode – How to normalize the output loss.
>>> softmaxWithCriterion = SoftmaxWithCriterion()
creating: createSoftmaxWithCriterion
>>> softmaxWithCriterion = SoftmaxWithCriterion(1, "FULL")
creating: createSoftmaxWithCriterion
class bigdl.nn.criterion.TimeDistributedCriterion(criterion, size_average=False, bigdl_type='float')[source]

Bases: bigdl.nn.criterion.Criterion

This class is intended to support inputs with 3 or more dimensions. It applies the provided criterion to every temporal slice of the input.

Parameters:
  • criterion – embedded criterion
  • size_average – whether to divide the sequence length
>>> td = TimeDistributedCriterion(ClassNLLCriterion())
creating: createClassNLLCriterion
creating: createTimeDistributedCriterion

bigdl.nn.initialization_method module

class bigdl.nn.initialization_method.BilinearFiller(bigdl_type='float')[source]

Bases: bigdl.nn.initialization_method.InitializationMethod

Initialize the weight with coefficients for bilinear interpolation.

A common use case is with the DeconvolutionLayer acting as upsampling. The variable tensor passed in the init function should have 5 dimensions of format [nGroup, nInput, nOutput, kH, kW], and kH should be equal to kW

class bigdl.nn.initialization_method.ConstInitMethod(value, bigdl_type='float')[source]

Bases: bigdl.nn.initialization_method.InitializationMethod

Initializer that generates tensors filled with a given constant value.

class bigdl.nn.initialization_method.InitializationMethod(jvalue, bigdl_type, *args)[source]

Bases: bigdl.util.common.JavaValue

Initialization method to initialize bias and weight. The init method will be called in Module.reset()

class bigdl.nn.initialization_method.Ones(bigdl_type='float')[source]

Bases: bigdl.nn.initialization_method.InitializationMethod

Initializer that generates tensors with ones.

class bigdl.nn.initialization_method.RandomNormal(mean, stdv, bigdl_type='float')[source]

Bases: bigdl.nn.initialization_method.InitializationMethod

Initializer that generates tensors with a normal distribution.

class bigdl.nn.initialization_method.RandomUniform(upper=None, lower=None, bigdl_type='float')[source]

Bases: bigdl.nn.initialization_method.InitializationMethod

Initializer that generates tensors with a uniform distribution. It draws samples from a uniform distribution within [lower, upper]. If lower and upper are not specified, it draws samples from a uniform distribution within [-limit, limit], where limit is 1/sqrt(fan_in).

class bigdl.nn.initialization_method.Xavier(bigdl_type='float')[source]

Bases: bigdl.nn.initialization_method.InitializationMethod

Xavier Initializer. See http://jmlr.org/proceedings/papers/v9/glorot10a/glorot10a.pdf

class bigdl.nn.initialization_method.Zeros(bigdl_type='float')[source]

Bases: bigdl.nn.initialization_method.InitializationMethod

Initializer that generates tensors with zeros.

bigdl.nn.layer module

class bigdl.nn.layer.Abs(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

an element-wise abs operation

>>> abs = Abs()
creating: createAbs
class bigdl.nn.layer.Add(input_size, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Adds a bias term to the input data.

Parameters:input_size – size of input data
>>> add = Add(1)
creating: createAdd
set_init_method(weight_init_method=None, bias_init_method=None)[source]
class bigdl.nn.layer.AddConstant(constant_scalar, inplace=False, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Adds a constant to the input.

Parameters:
  • constant_scalar – constant value
  • inplace – Can optionally do its operation in-place without using extra state memory
>>> addConstant = AddConstant(1e-5, True)
creating: createAddConstant
class bigdl.nn.layer.BatchNormalization(n_output, eps=1e-05, momentum=0.1, affine=True, init_weight=None, init_bias=None, init_grad_weight=None, init_grad_bias=None, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

This layer implements Batch Normalization as described in the paper: “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift” by Sergey Ioffe, Christian Szegedy https://arxiv.org/abs/1502.03167

This implementation is useful for inputs NOT coming from convolution layers. For convolution layers, use nn.SpatialBatchNormalization.

The operation implemented is:

       ( x - mean(x) )
y = -------------------- * gamma + beta
    standard-deviation(x)

where gamma and beta are learnable parameters. The learning of gamma and beta is optional.

Parameters:
  • n_output – output feature map number
  • eps – a small value added to the variance to avoid dividing by zero
  • momentum – momentum for weight update
  • affine – affine operation on output or not
>>> batchNormalization = BatchNormalization(1, 1e-5, 1e-5, True)
creating: createBatchNormalization
>>> import numpy as np
>>> init_weight = np.random.randn(2)
>>> init_grad_weight = np.zeros([2])
>>> init_bias = np.zeros([2])
>>> init_grad_bias = np.zeros([2])
>>> batchNormalization = BatchNormalization(2, 1e-5, 1e-5, True, init_weight, init_bias, init_grad_weight, init_grad_bias)
creating: createBatchNormalization
set_init_method(weight_init_method=None, bias_init_method=None)[source]
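The normalization formula above can be sketched per-feature in numpy; this illustrates the math only (batch statistics plus learnable scale and shift), not the BigDL kernel or its running-statistics behavior:

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    # Normalize each feature over the batch dimension, then apply the
    # learnable scale (gamma) and shift (beta), per the formula above.
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

out = batch_norm(np.array([[1.0, 2.0], [3.0, 4.0]]))
```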
class bigdl.nn.layer.BiRecurrent(merge=None, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Container

Create a Bidirectional recurrent layer

Parameters:merge – merge layer
>>> biRecurrent = BiRecurrent(CAddTable())
creating: createCAddTable
creating: createBiRecurrent
>>> biRecurrent = BiRecurrent()
creating: createBiRecurrent
class bigdl.nn.layer.BifurcateSplitTable(dimension, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Creates a module that takes a Tensor as input and outputs two tables, splitting the Tensor along the specified dimension.

The input to this layer is expected to be a tensor, or a batch of tensors;

Parameters:
  • dimension – the dimension along which to split the input
  • T – numeric type; only float/double are supported now

>>> bifurcateSplitTable = BifurcateSplitTable(1)
creating: createBifurcateSplitTable
class bigdl.nn.layer.Bilinear(input_size1, input_size2, output_size, bias_res=True, wRegularizer=None, bRegularizer=None, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

A bilinear transformation with sparse inputs. The input tensor given in forward(input) is a table containing both inputs x_1 and x_2, which are tensors of size N x inputDimension1 and N x inputDimension2, respectively.

Parameters:
  • input_size1 – input dimension of x_1
  • input_size2 – input dimension of x_2
  • output_size – output dimension
  • bias_res – whether to use a bias
  • wRegularizer – instance of [[Regularizer]] (e.g. L1 or L2 regularization), applied to the input weights matrices
  • bRegularizer – instance of [[Regularizer]] applied to the bias

>>> bilinear = Bilinear(1, 1, 1, True, L1Regularizer(0.5))
creating: createL1Regularizer
creating: createBilinear
set_init_method(weight_init_method=None, bias_init_method=None)[source]
class bigdl.nn.layer.BinaryTreeLSTM(input_size, hidden_size, gate_output=True, with_graph=True, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

This class is an implementation of Binary TreeLSTM (Constituency Tree LSTM).

Parameters:
  • input_size – input units size
  • hidden_size – hidden units size
  • gate_output – whether to gate the output
  • with_graph – whether to create the LSTMs with [[Graph]]; the default value is True
>>> treeLSTM = BinaryTreeLSTM(100, 200)
creating: createBinaryTreeLSTM

class bigdl.nn.layer.Bottle(module, n_input_dim=2, n_output_dim1=2147483647, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Container

Bottle allows varying dimensionality input to be forwarded through any module that accepts input of nInputDim dimensions, and generates output of nOutputDim dimensions.

Parameters:
  • module – transform module
  • n_input_dim – nInputDim dimensions of module
  • n_output_dim1 – output of nOutputDim dimensions
>>> bottle = Bottle(Linear(100,10), 1, 1)
creating: createLinear
creating: createBottle
class bigdl.nn.layer.CAdd(size, bRegularizer=None, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

This layer has a bias tensor with the given size. The bias will be added element-wise to the input tensor. If the number of elements in the bias tensor matches the input tensor, a simple element-wise addition will be done. Otherwise the bias will be expanded to the same size as the input. The expand means repeating along unmatched singleton dimensions (if some unmatched dimension isn’t a singleton dimension, an error will be reported). If the input is a batch, a singleton dimension will be added to the first dimension before the expand.

Parameters:
  • size – the size of the bias
  • bRegularizer – instance of [[Regularizer]]applied to the bias.
>>> cAdd = CAdd([1,2])
creating: createCAdd
set_init_method(weight_init_method=None, bias_init_method=None)[source]
class bigdl.nn.layer.CAddTable(inplace=False, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Merge the input tensors in the input table by element-wise adding them together. The input table is actually an array of tensors with the same size.

Parameters:inplace – reuse the input memory
>>> cAddTable = CAddTable(True)
creating: createCAddTable
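The merge performed by CAddTable can be sketched in numpy as an element-wise sum over a list of same-sized arrays (an illustration of the operation, not a BigDL call):

```python
import numpy as np

def cadd_table(tensors):
    # Element-wise sum of a list of same-sized tensors, as CAddTable merges
    # its input table.
    out = np.zeros_like(tensors[0])
    for t in tensors:
        out = out + t
    return out

result = cadd_table([np.array([1.0, 2.0]), np.array([3.0, 4.0])])
```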
class bigdl.nn.layer.CDivTable(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Takes a table with two Tensors and returns the component-wise division between them.

>>> cDivTable = CDivTable()
creating: createCDivTable
class bigdl.nn.layer.CMaxTable(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Takes a table of Tensors and outputs the max of all of them.

>>> cMaxTable = CMaxTable()
creating: createCMaxTable
class bigdl.nn.layer.CMinTable(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Takes a table of Tensors and outputs the min of all of them.

>>> cMinTable = CMinTable()
creating: createCMinTable
class bigdl.nn.layer.CMul(size, wRegularizer=None, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Applies a component-wise multiplication to the incoming data

Parameters:size – size of the data
>>> cMul = CMul([1,2])
creating: createCMul
set_init_method(weight_init_method=None, bias_init_method=None)[source]
class bigdl.nn.layer.CMulTable(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Takes a table of Tensors and outputs the multiplication of all of them.

>>> cMulTable = CMulTable()
creating: createCMulTable
class bigdl.nn.layer.CSubTable(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Takes a table with two Tensors and returns the component-wise subtraction between them.

>>> cSubTable = CSubTable()
creating: createCSubTable
class bigdl.nn.layer.Clamp(min, max, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Clamps all elements into the range [min_value, max_value]. Output is identical to input in the range, otherwise elements less than min_value (or greater than max_value) are saturated to min_value (or max_value).

Parameters:
  • min
  • max
>>> clamp = Clamp(1, 3)
creating: createClamp
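The saturation behavior described above is the same as numpy's clip, shown here for illustration (not a BigDL call):

```python
import numpy as np

# Clamp saturates values below min to min and above max to max;
# numpy's clip performs the same element-wise operation.
out = np.clip(np.array([0.0, 2.0, 5.0]), 1.0, 3.0)
```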
class bigdl.nn.layer.Concat(dimension, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Container

Concat concatenates the output of one layer of “parallel” modules along the provided {@code dimension}: they take the same inputs, and their output is concatenated.

                +-----------+
           +---->  module1  -----+
           |    |           |    |
input -----+---->  module2  -----+----> output
           |    |           |    |
           +---->  module3  -----+
                +-----------+
Parameters:dimension – dimension
>>> concat = Concat(2)
creating: createConcat
class bigdl.nn.layer.ConcatTable(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Container

ConcatTable is a container module like Concat. It applies the input to each member module; the input can be a tensor or a table.

ConcatTable usually works with CAddTable and CMulTable to implement element-wise add/multiply on the outputs of two modules.

>>> concatTable = ConcatTable()
creating: createConcatTable
class bigdl.nn.layer.Container(jvalue, bigdl_type, *args)[source]

Bases: bigdl.nn.layer.Layer

[[Container]] is a sub-class of Model that declares methods defined in all containers. A container usually contains some other modules which can be added through the “add” method

add(model)[source]
class bigdl.nn.layer.Contiguous(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Used to make both the input and grad_output contiguous

>>> contiguous = Contiguous()
creating: createContiguous
class bigdl.nn.layer.ConvLSTMPeephole(input_size, output_size, kernel_i, kernel_c, stride, wRegularizer=None, uRegularizer=None, bRegularizer=None, cRegularizer=None, with_peephole=True, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Convolution Long Short Term Memory architecture with peephole.
Ref. A.: https://arxiv.org/abs/1506.04214 (blueprint for this module)
Parameters:
  • input_size – number of input planes in the image given into forward()
  • output_size – number of output planes the convolution layer will produce

  • kernel_i – convolutional filter size to convolve the input
  • kernel_c – convolutional filter size to convolve the cell
  • stride – the step of the convolution
  • wRegularizer – instance of [[Regularizer]] (e.g. L1 or L2 regularization), applied to the input weights matrices
  • uRegularizer – instance of [[Regularizer]] (e.g. L1 or L2 regularization), applied to the recurrent weights matrices
  • bRegularizer – instance of [[Regularizer]] applied to the bias
  • cRegularizer – instance of [[Regularizer]] applied to the peephole
  • with_peephole – whether to use the last cell status to control the gates

>>> convlstm = ConvLSTMPeephole(4, 3, 3, 3, 1, L1Regularizer(0.5), L1Regularizer(0.5), L1Regularizer(0.5), L1Regularizer(0.5))
creating: createL1Regularizer
creating: createL1Regularizer
creating: createL1Regularizer
creating: createL1Regularizer
creating: createConvLSTMPeephole
class bigdl.nn.layer.ConvLSTMPeephole3D(input_size, output_size, kernel_i, kernel_c, stride, wRegularizer=None, uRegularizer=None, bRegularizer=None, cRegularizer=None, with_peephole=True, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Parameters:
  • input_size – number of input planes in the image given into forward()
  • output_size – number of output planes the convolution layer will produce

  • kernel_i – convolutional filter size to convolve the input
  • kernel_c – convolutional filter size to convolve the cell
  • stride – the step of the convolution
  • wRegularizer – instance of [[Regularizer]] (e.g. L1 or L2 regularization), applied to the input weights matrices
  • uRegularizer – instance of [[Regularizer]] (e.g. L1 or L2 regularization), applied to the recurrent weights matrices
  • bRegularizer – instance of [[Regularizer]] applied to the bias
  • cRegularizer – instance of [[Regularizer]] applied to the peephole
  • with_peephole – whether to use the last cell status to control the gates

>>> convlstm = ConvLSTMPeephole3D(4, 3, 3, 3, 1, L1Regularizer(0.5), L1Regularizer(0.5), L1Regularizer(0.5), L1Regularizer(0.5))
creating: createL1Regularizer
creating: createL1Regularizer
creating: createL1Regularizer
creating: createL1Regularizer
creating: createConvLSTMPeephole3D
class bigdl.nn.layer.Cosine(input_size, output_size, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Cosine calculates the cosine similarity of the input to k mean centers. The input given in forward(input) must be either a vector (1D tensor) or matrix (2D tensor). If the input is a vector, it must have the size of inputSize. If it is a matrix, then each row is assumed to be an input sample of given batch (the number of rows means the batch size and the number of columns should be equal to the inputSize).

Parameters:
  • input_size – the size of each input sample
  • output_size – the size of the module output of each sample
>>> cosine = Cosine(2,3)
creating: createCosine
set_init_method(weight_init_method=None, bias_init_method=None)[source]
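As a hedged illustration of what Cosine computes, here is a small numpy sketch (the function name `cosine_layer` is illustrative, not part of the BigDL API): each input row is compared against each of the output_size weight rows ("centers") by cosine similarity.

```python
import numpy as np

def cosine_layer(x, w, eps=1e-12):
    """Cosine similarity of each input row to each weight row (a 'center')."""
    x = np.atleast_2d(np.asarray(x, dtype=float))   # (batch, input_size)
    xn = x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)
    wn = w / (np.linalg.norm(w, axis=1, keepdims=True) + eps)
    return xn @ wn.T                                # (batch, output_size)
```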
class bigdl.nn.layer.CosineDistance(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Outputs the cosine distance between inputs

>>> cosineDistance = CosineDistance()
creating: createCosineDistance
class bigdl.nn.layer.DenseToSparse(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Convert DenseTensor to SparseTensor.

>>> DenseToSparse = DenseToSparse()
creating: createDenseToSparse
class bigdl.nn.layer.DotProduct(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

This is a simple table layer which takes a table of two tensors as input and calculates the dot product between them as output

>>> dotProduct = DotProduct()
creating: createDotProduct
class bigdl.nn.layer.Dropout(init_p=0.5, inplace=False, scale=True, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Dropout masks (sets to zero) parts of the input using a Bernoulli distribution. Each input element has probability initP of being dropped. If scale is set, the outputs are scaled by a factor of 1/(1-initP) during training. During evaluation, the output is the same as the input.

Parameters:
  • initP – probability of an element being dropped
  • inplace – whether to do the operation in-place
  • scale – whether to scale the output by a factor of 1/(1-initP)
>>> dropout = Dropout(0.4)
creating: createDropout
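The training/evaluation behaviour described above can be sketched in plain numpy (illustrative only; the function name and rng handling are assumptions, not the BigDL implementation):

```python
import numpy as np

def dropout_forward(x, init_p=0.5, scale=True, training=True, rng=None):
    """Zero each element with probability init_p; optionally rescale survivors."""
    if not training:
        return x                               # evaluation: identity
    if rng is None:
        rng = np.random.default_rng(0)
    mask = rng.random(x.shape) >= init_p       # keep with probability 1 - init_p
    out = x * mask
    if scale:
        out = out / (1.0 - init_p)             # keep the expected value unchanged
    return out
```

Scaling by 1/(1-initP) during training means no rescaling is needed at evaluation time.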
class bigdl.nn.layer.ELU(alpha=1.0, inplace=False, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

D-A Clevert, Thomas Unterthiner, Sepp Hochreiter Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs) [http://arxiv.org/pdf/1511.07289.pdf]

>>> eLU = ELU(1e-5, True)
creating: createELU
class bigdl.nn.layer.Echo(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

This module is for debugging purposes; it can print the activation and gradient in your model topology

>>> echo = Echo()
creating: createEcho
class bigdl.nn.layer.Euclidean(input_size, output_size, fast_backward=True, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Outputs the Euclidean distance of the input to outputSize centers

Parameters:
  • inputSize – inputSize
  • outputSize – outputSize
  • T – Numeric type. Only support float/double now
>>> euclidean = Euclidean(1, 1, True)
creating: createEuclidean
set_init_method(weight_init_method=None, bias_init_method=None)[source]
class bigdl.nn.layer.Exp(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Applies element-wise exp to input tensor.

>>> exp = Exp()
creating: createExp
class bigdl.nn.layer.FlattenTable(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

This is a table layer which takes an arbitrarily deep table of Tensors (potentially nested) as input and produces a table of Tensors without any nesting

>>> flattenTable = FlattenTable()
creating: createFlattenTable
class bigdl.nn.layer.GRU(input_size, hidden_size, p=0.0, wRegularizer=None, uRegularizer=None, bRegularizer=None, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Gated Recurrent Units architecture. The first input in the sequence uses a zero value for the cell and hidden state

Parameters:
  • input_size – the size of each input vector
  • hidden_size – Hidden unit size in GRU
  • p – dropout probability. For more details about RNN dropout, please refer to [RnnDrop: A Novel Dropout for RNNs in ASR](http://www.stat.berkeley.edu/~tsmoon/files/Conference/asru2015.pdf) and [A Theoretically Grounded Application of Dropout in Recurrent Neural Networks](https://arxiv.org/pdf/1512.05287.pdf)
  • wRegularizer – instance of [[Regularizer]] (eg. L1 or L2 regularization), applied to the input weights matrices.
  • uRegularizer – instance of [[Regularizer]] (eg. L1 or L2 regularization), applied to the recurrent weights matrices.
  • bRegularizer – instance of [[Regularizer]], applied to the bias.
>>> gru = GRU(4, 3, 0.5, L1Regularizer(0.5), L1Regularizer(0.5), L1Regularizer(0.5))
creating: createL1Regularizer
creating: createL1Regularizer
creating: createL1Regularizer
creating: createGRU
class bigdl.nn.layer.GaussianSampler(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Takes {mean, log_variance} as input and samples from the Gaussian distribution

>>> sampler = GaussianSampler()
creating: createGaussianSampler

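The sampling step is the standard reparameterization trick; a numpy sketch under the assumption that the two inputs are mean and log-variance arrays of the same shape (names are illustrative, not BigDL API):

```python
import numpy as np

def gaussian_sample(mean, log_variance, rng=None):
    """Reparameterized draw: mean + exp(log_var / 2) * eps, with eps ~ N(0, 1)."""
    if rng is None:
        rng = np.random.default_rng(0)
    eps = rng.standard_normal(np.shape(mean))
    return mean + np.exp(0.5 * log_variance) * eps
```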
class bigdl.nn.layer.GradientReversal(the_lambda=1.0, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

It is a simple module that preserves the input, but takes the gradient from the subsequent layer, multiplies it by -lambda and passes it to the preceding layer. This can be used to maximise an objective function while using gradient descent, as described in [“Domain-Adversarial Training of Neural Networks” (http://arxiv.org/abs/1505.07818)]

Parameters:lambda – hyper-parameter lambda can be set dynamically during training
>>> gradientReversal = GradientReversal(1e-5)
creating: createGradientReversal
>>> gradientReversal = GradientReversal()
creating: createGradientReversal
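A minimal numpy sketch of the two passes described above (function names are illustrative, not BigDL API): the forward pass is the identity, while the backward pass flips and scales the incoming gradient.

```python
import numpy as np

def grad_reversal_forward(x):
    """Forward pass: preserve the input unchanged."""
    return x

def grad_reversal_backward(grad_output, the_lambda=1.0):
    """Backward pass: multiply the gradient by -lambda before passing it back."""
    return -the_lambda * grad_output
```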
class bigdl.nn.layer.HardShrink(the_lambda=0.5, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

This is a transfer layer which applies the hard shrinkage function element-wise to the input Tensor. The parameter lambda is set to 0.5 by default

        x, if x >  lambda
f(x) =  x, if x < -lambda
        0, otherwise
Parameters:the_lambda – a threshold value whose default value is 0.5
>>> hardShrink = HardShrink(1e-5)
creating: createHardShrink
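The piecewise definition above is a one-liner in numpy (illustrative sketch, not the BigDL implementation):

```python
import numpy as np

def hard_shrink(x, the_lambda=0.5):
    """Keep values whose magnitude exceeds the_lambda, zero out the rest."""
    return np.where(np.abs(x) > the_lambda, x, 0.0)
```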
class bigdl.nn.layer.HardTanh(min_value=-1.0, max_value=1.0, inplace=False, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Applies HardTanh to each element of input, HardTanh is defined:

       |  maxValue, if x > maxValue
f(x) = |  minValue, if x < minValue
       |  x, otherwise
Parameters:
  • min_value – minValue in f(x), default is -1.
  • max_value – maxValue in f(x), default is 1.
  • inplace – whether to enable the in-place mode.
>>> hardTanh = HardTanh(1e-5, 1e5, True)
creating: createHardTanh
>>> hardTanh = HardTanh()
creating: createHardTanh
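HardTanh is simply a clamp into [minValue, maxValue]; a numpy sketch (illustrative naming):

```python
import numpy as np

def hard_tanh(x, min_value=-1.0, max_value=1.0):
    """Clamp each element into [min_value, max_value]."""
    return np.clip(x, min_value, max_value)
```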
class bigdl.nn.layer.Identity(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Identity just returns the input as output. It’s useful in some parallel containers to obtain the original input.

>>> identity = Identity()
creating: createIdentity
class bigdl.nn.layer.Index(dimension, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Applies the Tensor index operation along the given dimension.

Parameters:dimension – the dimension to be indexed
>>> index = Index(1)
creating: createIndex
class bigdl.nn.layer.InferReshape(size, batch_mode=False, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Reshape the input tensor with automatic size inference support. Positive numbers in the size argument are used to reshape the input to the corresponding dimension size. There are also two special values allowed in size:
a. 0 means keep the corresponding dimension size of the input unchanged, i.e., if the 1st dimension size of the input is 2, the 1st dimension size of the output will be set to 2 as well.
b. -1 means infer this dimension size from the other dimensions, keeping the number of output elements consistent with the input. Only one -1 is allowed in size.

For example, Input tensor with size: (4, 5, 6, 7) -> InferReshape(Array(4, 0, 3, -1)) Output tensor with size: (4, 5, 3, 14) The 1st and 3rd dim are set to given sizes, keep the 2nd dim unchanged, and inferred the last dim as 14.

Parameters:
  • size – the target tensor size
  • batch_mode – whether in batch mode
>>> inferReshape = InferReshape([4, 0, 3, -1], False)
creating: createInferReshape
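The size-inference rules (0 copies the input dimension, a single -1 is inferred from the element count) can be sketched as follows; the helper name is illustrative, not BigDL API:

```python
import numpy as np

def infer_shape(input_shape, size):
    """Resolve 0 (copy input dim) and one -1 (inferred) entries in size."""
    out = [input_shape[i] if s == 0 else s for i, s in enumerate(size)]
    if -1 in out:
        known = np.prod([s for s in out if s != -1])  # product of fixed dims
        out[out.index(-1)] = int(np.prod(input_shape) // known)
    return tuple(out)
```

This reproduces the worked example above: a (4, 5, 6, 7) input with size [4, 0, 3, -1] yields (4, 5, 3, 14).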
class bigdl.nn.layer.Input(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Node

Input layer do nothing to the input tensors, just passing them through. It is used as input to the Graph container (add a link) when the first layer of the graph container accepts multiple tensors as inputs.

Each input node of the graph container should accept one tensor as input. If you want a module accepting multiple tensors as input, you should add some Input module before it and connect the outputs of the Input nodes to it.

Please note that the return is not a layer but a Node containing input layer.

>>> input = Input()
creating: createInput
class bigdl.nn.layer.JoinTable(dimension, n_input_dims, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

It is a table module which takes a table of Tensors as input and outputs a Tensor by joining them together along the dimension dimension.

The input to this layer is expected to be a tensor, or a batch of tensors; when using mini-batch, a batch of sample tensors will be passed to the layer and the user needs to specify the number of dimensions of each sample tensor in the batch using nInputDims.

Parameters:
  • dimension – to be join in this dimension
  • nInputDims – specify the number of dimensions that this module will receive. If it is greater than the dimension of the input tensors, the first dimension is considered as the batch size
>>> joinTable = JoinTable(1, 1)
creating: createJoinTable
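The joining semantics correspond to a concatenation along a 1-based dimension; a numpy sketch (illustrative, ignoring the nInputDims batch handling):

```python
import numpy as np

def join_table(tensors, dimension):
    """Concatenate a list of arrays along a 1-based dimension."""
    return np.concatenate(tensors, axis=dimension - 1)
```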
class bigdl.nn.layer.L1Penalty(l1weight, size_average=False, provide_output=True, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Adds an L1 penalty to an input (for sparsity). L1Penalty is an inline module that, in its forward propagation, copies the input Tensor directly to the output, computes an L1 loss of the latent state (input), and stores it in the module’s loss field. During backward propagation: gradInput = gradOutput + gradLoss.

Parameters:
  • l1weight
  • sizeAverage
  • provideOutput
>>> l1Penalty = L1Penalty(1, True, True)
creating: createL1Penalty
class bigdl.nn.layer.LSTM(input_size, hidden_size, p=0.0, wRegularizer=None, uRegularizer=None, bRegularizer=None, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Parameters:
  • inputSize – the size of each input vector
  • hiddenSize – Hidden unit size in the LSTM
  • p – dropout probability. For more details about RNN dropout, please refer to [RnnDrop: A Novel Dropout for RNNs in ASR](http://www.stat.berkeley.edu/~tsmoon/files/Conference/asru2015.pdf) and [A Theoretically Grounded Application of Dropout in Recurrent Neural Networks](https://arxiv.org/pdf/1512.05287.pdf)
  • wRegularizer – instance of [[Regularizer]] (eg. L1 or L2 regularization), applied to the input weights matrices.
  • uRegularizer – instance of [[Regularizer]] (eg. L1 or L2 regularization), applied to the recurrent weights matrices.
  • bRegularizer – instance of [[Regularizer]], applied to the bias.
>>> lstm = LSTM(4, 3, 0.5, L1Regularizer(0.5), L1Regularizer(0.5), L1Regularizer(0.5))
creating: createL1Regularizer
creating: createL1Regularizer
creating: createL1Regularizer
creating: createLSTM
class bigdl.nn.layer.LSTMPeephole(input_size=4, hidden_size=3, p=0.0, wRegularizer=None, uRegularizer=None, bRegularizer=None, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Parameters:
  • input_size – the size of each input vector
  • hidden_size – Hidden unit size in the LSTM
  • p – dropout probability. For more details about RNN dropout, please refer to [RnnDrop: A Novel Dropout for RNNs in ASR](http://www.stat.berkeley.edu/~tsmoon/files/Conference/asru2015.pdf) and [A Theoretically Grounded Application of Dropout in Recurrent Neural Networks](https://arxiv.org/pdf/1512.05287.pdf)
  • wRegularizer – instance of [[Regularizer]] (eg. L1 or L2 regularization), applied to the input weights matrices.
  • uRegularizer – instance of [[Regularizer]] (eg. L1 or L2 regularization), applied to the recurrent weights matrices.
  • bRegularizer – instance of [[Regularizer]], applied to the bias.
>>> lstm = LSTMPeephole(4, 3, 0.5, L1Regularizer(0.5), L1Regularizer(0.5), L1Regularizer(0.5))
creating: createL1Regularizer
creating: createL1Regularizer
creating: createL1Regularizer
creating: createLSTMPeephole
class bigdl.nn.layer.Layer(jvalue, bigdl_type, *args)[source]

Bases: bigdl.util.common.JavaValue

Layer is the basic component of a neural network and it’s also the base class of layers. Layer can connect to others to construct a complex neural network.

backward(input, grad_output)[source]

NB: It’s for debug only, please use optimizer.optimize() in production. Performs a back-propagation step through the module, with respect to the given input. In general this method makes the assumption forward(input) has been called before, with the same input. This is necessary for optimization reasons. If you do not respect this rule, backward() will compute incorrect gradients.

Parameters:
  • input – ndarray or list of ndarray or JTensor or list of JTensor.
  • grad_output – ndarray or list of ndarray or JTensor or list of JTensor.
Returns:

ndarray or list of ndarray

static check_input(input)[source]
Parameters:input – ndarray or list of ndarray or JTensor or list of JTensor.
Returns:(list of JTensor, isTable)
static convert_output(output)[source]
evaluate(*args)[source]

No argument passed in: Evaluate the model to set train = false, useful when doing test/forward :return: layer itself

Three arguments passed in: A method to benchmark the model quality.

Parameters:
  • val_rdd – the input data
  • batch_size – batch size
  • val_methods – a list of validation methods. i.e: Top1Accuracy,Top5Accuracy and Loss.
Returns:

forward(input)[source]

NB: It’s for debug only, please use optimizer.optimize() in production. Takes an input object, and computes the corresponding output of the module

Parameters:
  • input – ndarray or list of ndarray
  • input – ndarray or list of ndarray or JTensor or list of JTensor.
Returns:

ndarray or list of ndarray

freeze(names=None)[source]

Freeze the module. If names is not None, freeze the layers that match the given names. :param names: an array of layer names :return:

get_dtype()[source]
get_weights()[source]

Get weights for this layer

Returns:list of numpy arrays which represent weight and bias
is_training()[source]
Returns:Whether this layer is in the training mode
>>> layer = Dropout()
creating: createDropout
>>> layer = layer.evaluate()
>>> layer.is_training()
False
>>> layer = layer.training()
>>> layer.is_training()
True
name()[source]

Name of this layer

classmethod of(jvalue, bigdl_type='float')[source]

Create a Python Layer based on the given Java value :param jvalue: Java object created by Py4j :return: A Python Layer

parameters()[source]

Get the model parameters, containing: weight, bias, gradBias, gradWeight

Returns:dict(layername -> dict(parametername -> ndarray))
predict(data_rdd)[source]

Model inference based on the given data. You need to invoke collect() to trigger the action, as the returned result is an RDD.

Parameters:data_rdd – the data to be predicted.
Returns:An RDD representing the prediction result.
predict_class(data_rdd)[source]

Module predict, returning the predicted label.

Parameters:data_rdd – the data to be predicted.
Returns:An RDD representing the predicted labels.
quantize()[source]

Clone self and quantize it; return the new quantized model. :return: A new quantized model.

>>> fc = Linear(4, 2)
creating: createLinear
>>> fc.set_weights([np.ones((4, 2)), np.ones((2,))])
>>> input = np.ones((2, 4))
>>> fc.forward(input)
array([[ 5.,  5.],
[ 5.,  5.]], dtype=float32)
>>> quantized_fc = fc.quantize()
>>> quantized_fc.forward(input)
array([[ 5.,  5.],
[ 5.,  5.]], dtype=float32)
>>> assert("quantized.Linear" in quantized_fc.__str__())
>>> conv = SpatialConvolution(1, 2, 3, 3)
creating: createSpatialConvolution
>>> conv.set_weights([np.ones((2, 1, 3, 3)), np.zeros((2,))])
>>> input = np.ones((2, 1, 4, 4))
>>> conv.forward(input)
array([[[[ 9.,  9.],
[ 9.,  9.]],

[[ 9.,  9.],
[ 9.,  9.]]],


[[[ 9.,  9.],
[ 9.,  9.]],

[[ 9.,  9.],
[ 9.,  9.]]]], dtype=float32)
>>> quantized_conv = conv.quantize()
>>> quantized_conv.forward(input)
array([[[[ 9.,  9.],
[ 9.,  9.]],

[[ 9.,  9.],
[ 9.,  9.]]],


[[[ 9.,  9.],
[ 9.,  9.]],

[[ 9.,  9.],
[ 9.,  9.]]]], dtype=float32)
>>> assert("quantized.SpatialConvolution" in quantized_conv.__str__())
>>> seq = Sequential()
creating: createSequential
>>> seq = seq.add(conv)
>>> seq = seq.add(Reshape([8, 4], False))
creating: createReshape
>>> seq = seq.add(fc)
>>> input = np.ones([1, 1, 6, 6])
>>> seq.forward(input)
array([[ 37.,  37.],
[ 37.,  37.],
[ 37.,  37.],
[ 37.,  37.],
[ 37.,  37.],
[ 37.,  37.],
[ 37.,  37.],
[ 37.,  37.]], dtype=float32)
>>> quantized_seq = seq.quantize()
>>> quantized_seq.forward(input)
array([[ 37.,  37.],
[ 37.,  37.],
[ 37.,  37.],
[ 37.,  37.],
[ 37.,  37.],
[ 37.,  37.],
[ 37.,  37.],
[ 37.,  37.]], dtype=float32)
>>> assert("quantized.Linear" in quantized_seq.__str__())
>>> assert("quantized.SpatialConvolution" in quantized_seq.__str__())
reset()[source]

Initialize the model weights.

save(path, over_write=False)[source]
saveModel(path, over_write=False)[source]
save_caffe(prototxt_path, model_path, use_v2=True, overwrite=False)[source]
save_tensorflow(inputs, path, byte_order='little_endian', data_format='nhwc')[source]

Save a model to protobuf files so that it can be used in tensorflow inference.

When saving the model, placeholders will be added to the tf model as input nodes, so you need to pass in the names and shapes of the placeholders (the BigDL model doesn’t have such information). The order of the placeholder information should be the same as the order of the inputs of the graph model. :param inputs: placeholder information, should be an array of tuples (input_name, shape) where ‘input_name’ is a string and shape is an array of integers :param path: the path to be saved to :param byte_order: model byte order :param data_format: model data format, should be “nhwc” or “nchw”

setBRegularizer(bRegularizer)[source]

set bias regularizer :param bRegularizer: bias regularizer :return:

setWRegularizer(wRegularizer)[source]

set weight regularizer :param wRegularizer: weight regularizer :return:

set_name(name)[source]

Give this model a name. If the user doesn’t set one, a generated name consisting of the class name and a UUID will be used.

set_seed(seed=123)[source]

You can control the random seed used to initialize the weights of this model.

Parameters:seed – random seed
Returns:Model itself.
set_weights(weights)[source]

Set weights for this layer

Parameters:weights – a list of numpy arrays which represent weight and bias
Returns:
>>> linear = Linear(3,2)
creating: createLinear
>>> linear.set_weights([np.array([[1,2,3],[4,5,6]]), np.array([7,8])])
>>> weights = linear.get_weights()
>>> weights[0].shape == (2,3)
True
>>> weights[0][0]
array([ 1.,  2.,  3.], dtype=float32)
>>> weights[1]
array([ 7.,  8.], dtype=float32)
>>> relu = ReLU()
creating: createReLU
>>> from py4j.protocol import Py4JJavaError
>>> try:
...     relu.set_weights([np.array([[1,2,3],[4,5,6]]), np.array([7,8])])
... except Py4JJavaError as err:
...     print(err.java_exception)
...
java.lang.IllegalArgumentException: requirement failed: this layer does not have weight/bias
>>> relu.get_weights()
The layer does not have weight/bias
>>> add = Add(2)
creating: createAdd
>>> try:
...     add.set_weights([np.array([7,8]), np.array([1,2])])
... except Py4JJavaError as err:
...     print(err.java_exception)
...
java.lang.IllegalArgumentException: requirement failed: the number of input weight/bias is not consistant with number of weight/bias of this layer, number of input 1, number of output 2
>>> cAdd = CAdd([4, 1])
creating: createCAdd
>>> cAdd.set_weights(np.ones([4, 1]))
>>> (cAdd.get_weights()[0] == np.ones([4, 1])).all()
True
training()[source]

Set this layer in the training mode

unfreeze(names=None)[source]

Unfreeze the module. If names is not None, unfreeze the layers that match the given names. :param names: an array of layer names :return:

update_parameters(learning_rate)[source]

NB: It’s for debug only, please use optimizer.optimize() in production.

zero_grad_parameters()[source]

NB: It’s for debug only, please use optimizer.optimize() in production. If the module has parameters, this will zero the accumulation of the gradients with respect to these parameters. Otherwise, it does nothing.

class bigdl.nn.layer.LeakyReLU(negval=0.01, inplace=False, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

It is a transfer module that applies LeakyReLU, whose parameter negval sets the slope of the negative part. LeakyReLU is defined as: f(x) = max(0, x) + negval * min(0, x)

Parameters:
  • negval – sets the slope of the negative part
  • inplace – if true, do the operation in-place without using extra state memory
>>> leakyReLU = LeakyReLU(1e-5, True)
creating: createLeakyReLU
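The defining formula f(x) = max(0, x) + negval * min(0, x) translates directly to numpy (illustrative sketch):

```python
import numpy as np

def leaky_relu(x, negval=0.01):
    """f(x) = max(0, x) + negval * min(0, x): identity for x >= 0, scaled for x < 0."""
    return np.maximum(0, x) + negval * np.minimum(0, x)
```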
class bigdl.nn.layer.Linear(input_size, output_size, with_bias=True, wRegularizer=None, bRegularizer=None, init_weight=None, init_bias=None, init_grad_weight=None, init_grad_bias=None, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

The [[Linear]] module applies a linear transformation to the input data, i.e. y = Wx + b. The input given in forward(input) must be either a vector (1D tensor) or matrix (2D tensor). If the input is a vector, it must have the size of inputSize. If it is a matrix, then each row is assumed to be an input sample of given batch (the number of rows means the batch size and the number of columns should be equal to the inputSize).

Parameters:
  • input_size – the size of each input sample
  • output_size – the size of the module output of each sample
  • wRegularizer – instance of [[Regularizer]] (eg. L1 or L2 regularization), applied to the input weights matrices
  • bRegularizer – instance of [[Regularizer]], applied to the bias
  • init_weight – the optional initial value for the weight
  • init_bias – the optional initial value for the bias
  • init_grad_weight – the optional initial value for the grad_weight
  • init_grad_bias – the optional initial value for the grad_bias

>>> linear = Linear(100, 10, True, L1Regularizer(0.5), L1Regularizer(0.5))
creating: createL1Regularizer
creating: createL1Regularizer
creating: createLinear
>>> import numpy as np
>>> init_weight = np.random.randn(10, 100)
>>> init_bias = np.random.randn(10)
>>> init_grad_weight = np.zeros([10, 100])
>>> init_grad_bias = np.zeros([10])
>>> linear = Linear(100, 10, True, L1Regularizer(0.5), L1Regularizer(0.5), init_weight, init_bias, init_grad_weight, init_grad_bias)
creating: createL1Regularizer
creating: createL1Regularizer
creating: createLinear
set_init_method(weight_init_method=None, bias_init_method=None)[source]
class bigdl.nn.layer.Log(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Applies the log function element-wise to the input Tensor, thus outputting a Tensor of the same dimension.

>>> log = Log()
creating: createLog
class bigdl.nn.layer.LogSigmoid(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

This class is a transform layer corresponding to the log-sigmoid function: f(x) = log(1 / (1 + e^(-x)))

>>> logSigmoid = LogSigmoid()
creating: createLogSigmoid
class bigdl.nn.layer.LogSoftMax(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Applies the LogSoftMax function to an n-dimensional input Tensor. LogSoftmax is defined as: f_i(x) = log(exp(x_i) / a) where a = sum_j exp(x_j).

>>> logSoftMax = LogSoftMax()
creating: createLogSoftMax
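A numerically stable numpy sketch of the formula above (illustrative; subtracting the maximum before exponentiating avoids overflow without changing the result):

```python
import numpy as np

def log_softmax(x):
    """log(exp(x_i) / sum_j exp(x_j)), computed stably over a 1-D array."""
    shifted = x - np.max(x)          # shift by the max; the result is unchanged
    return shifted - np.log(np.sum(np.exp(shifted)))
```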
class bigdl.nn.layer.LookupTable(n_index, n_output, padding_value=0.0, max_norm=1.7976931348623157e+308, norm_type=2.0, should_scale_grad_by_freq=False, wRegularizer=None, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

A convolution of width 1, commonly used for word embeddings.

Parameters:wRegularizer – instance of [[Regularizer]](eg. L1 or L2 regularization), applied to the input weights matrices.
>>> lookupTable = LookupTable(1, 1, 1e-5, 1e-5, 1e-5, True, L1Regularizer(0.5))
creating: createL1Regularizer
creating: createLookupTable
set_init_method(weight_init_method=None, bias_init_method=None)[source]
class bigdl.nn.layer.MM(trans_a=False, trans_b=False, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Module to perform matrix multiplication on two mini-batch inputs, producing a mini-batch.

Parameters:
  • trans_a – specifying whether or not transpose the first input matrix
  • trans_b – specifying whether or not transpose the second input matrix
>>> mM = MM(True, True)
creating: createMM
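A numpy sketch of the mini-batch multiply with the two transpose flags (illustrative; assumes the batch dimension comes first):

```python
import numpy as np

def mm(a, b, trans_a=False, trans_b=False):
    """Mini-batch matrix multiply with optional per-flag transposes."""
    if trans_a:
        a = np.swapaxes(a, -2, -1)   # transpose the last two dims of each sample
    if trans_b:
        b = np.swapaxes(b, -2, -1)
    return a @ b                     # batched matmul over the leading dimension
```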
class bigdl.nn.layer.MV(trans=False, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

It is a module to perform matrix vector multiplication on two mini-batch inputs, producing a mini-batch.

Parameters:trans – whether to transpose the matrix before multiplication
>>> mV = MV(True)
creating: createMV
class bigdl.nn.layer.MapTable(module=None, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Container

This class is a container for a single module which will be applied to all input elements. The member module is cloned as necessary to process all input elements.

>>> mapTable = MapTable(Linear(100,10))
creating: createLinear
creating: createMapTable
class bigdl.nn.layer.MaskedSelect(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Performs a torch.MaskedSelect on a Tensor. The mask is supplied as a tabular argument with the input on the forward and backward passes.

>>> maskedSelect = MaskedSelect()
creating: createMaskedSelect
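The selection semantics mirror boolean indexing in numpy; a sketch (illustrative naming, not BigDL API):

```python
import numpy as np

def masked_select(x, mask):
    """Select the elements of x where mask is non-zero; the result is 1-D."""
    return x[mask.astype(bool)]
```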
class bigdl.nn.layer.Max(dim, num_input_dims=-2147483648, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Applies a max operation over dimension dim

Parameters:
  • dim – max along this dimension
  • num_input_dims – Optional. If in a batch model, set to the inputDims.
>>> max = Max(1)
creating: createMax
class bigdl.nn.layer.Mean(dimension=1, n_input_dims=-1, squeeze=True, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

It is a simple layer which applies a mean operation over the given dimension. When nInputDims is provided, the input will be considered as batches. Then the mean operation will be applied in (dimension + 1). The input to this layer is expected to be a tensor, or a batch of tensors; when using mini-batch, a batch of sample tensors will be passed to the layer and the user need to specify the number of dimensions of each sample tensor in the batch using nInputDims.

Parameters:
  • dimension – the dimension to be applied mean operation
  • n_input_dims – specify the number of dimensions that this module will receive. If it is greater than the dimension of the input tensors, the first dimension is considered as the batch size
  • squeeze – default is true, which will squeeze the sum dimension; set it to false to keep the sum dimension
>>> mean = Mean(1, 1, True)
creating: createMean
class bigdl.nn.layer.Min(dim=1, num_input_dims=-2147483648, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Applies a min operation over dimension dim.

Parameters:
  • dim – min along this dimension
  • num_input_dims – Optional. If in a batch model, set to the input_dim.
>>> min = Min(1)
creating: createMin
class bigdl.nn.layer.MixtureTable(dim=2147483647, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Creates a module that takes a table {gater, experts} as input and outputs the mixture of experts (a Tensor or table of Tensors) using a gater Tensor. When dim is provided, it specifies the dimension of the experts Tensor that will be interpolated (or mixed). Otherwise, the experts should take the form of a table of Tensors. This Module works for experts of dimension 1D or more, and for a 1D or 2D gater, i.e. for single examples or mini-batches.

>>> mixtureTable = MixtureTable()
creating: createMixtureTable
>>> mixtureTable = MixtureTable(10)
creating: createMixtureTable
class bigdl.nn.layer.Model(inputs, outputs, bigdl_type='float', byte_order='little_endian', model_type='bigdl')[source]

Bases: bigdl.nn.layer.Container

A graph container. Each node can have multiple inputs. The output of the node should be a tensor. The output tensor can be connected to multiple nodes. So the module in each node can have a tensor or table input, and should have a tensor output.

The graph container can have multiple inputs and multiple outputs. If there’s one input, the input data fed to the graph module should be a tensor. If there are multiple inputs, the input data fed to the graph module should be a table, which is actually a sequence of tensors. The order of the input tensors should be the same as the order of the input nodes. This also applies to the gradient from the module in the back propagation.

If there’s one output, the module output is a tensor. If there are multiple outputs, the module output is a table, which is actually a sequence of tensors. The order of the output tensors is the same as the order of the output modules. This also applies to the gradient passed to the module in the back propagation.

All inputs should be able to connect to outputs through some path in the graph. It is allowed that some successors of the input nodes are not connected to outputs; if so, these nodes will be excluded from the computation.

We also support initializing a Graph directly from a tensorflow module. In this case, you should pass your tensorflow nodes as inputs and outputs, and also specify the byte_order parameter (“little_endian” or “big_endian”) and the node_type parameter (“bigdl” or “tensorflow”).

static load(path, bigdl_type='float')[source]

Load a pre-trained Bigdl model.

Parameters:path – The path containing the pre-trained model.
Returns:A pre-trained model.
static loadModel(path, bigdl_type='float')[source]

Load a pre-trained Bigdl model.

Parameters:path – The path containing the pre-trained model.
Returns:A pre-trained model.
static load_caffe(model, defPath, modelPath, match_all=True, bigdl_type='float')[source]

Load a pre-trained Caffe model.

Parameters:
  • model – A bigdl model definition which equivalent to the pre-trained caffe model.
  • defPath – The path containing the caffe model definition.
  • modelPath – The path containing the pre-trained caffe model.
Returns:

A pre-trained model.

static load_caffe_model(defPath, modelPath, bigdl_type='float')[source]

Load a pre-trained Caffe model.

Parameters:
  • defPath – The path containing the caffe model definition.
  • modelPath – The path containing the pre-trained caffe model.
Returns:

A pre-trained model.

static load_tensorflow(path, inputs, outputs, byte_order='little_endian', bigdl_type='float')[source]

Load a pre-trained Tensorflow model. :param path: The path containing the pre-trained model. :return: A pre-trained model.

static load_torch(path, bigdl_type='float')[source]

Load a pre-trained Torch model.

Parameters:path – The path containing the pre-trained model.
Returns:A pre-trained model.
save_graph_topology(log_path, bigdl_type='float')[source]

Save the current model graph to a folder, which can be displayed in tensorboard by running tensorboard --logdir logPath :param log_path: path to save the model graph :param bigdl_type: :return:

stop_gradient(stop_layers, bigdl_type='float')[source]

Stop the input gradient of the layers that match the given names. Their input gradients are not computed, and they will not contribute to the input gradient computation of layers that depend on them. :param stop_layers: an array of layer names :param bigdl_type: :return:

static train(output, data, label, opt_method, criterion, batch_size, end_when, session=None, bigdl_type='float')[source]
class bigdl.nn.layer.Mul(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Multiplies the incoming data by a single learnable scalar factor.

>>> mul = Mul()
creating: createMul
set_init_method(weight_init_method=None, bias_init_method=None)[source]
class bigdl.nn.layer.MulConstant(scalar, inplace=False, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Multiplies input Tensor by a (non-learnable) scalar constant. This module is sometimes useful for debugging purposes.

Parameters:
  • scalar – scalar constant
  • inplace – Can optionally do its operation in-place without using extra state memory
>>> mulConstant = MulConstant(2.5)
creating: createMulConstant
class bigdl.nn.layer.Narrow(dimension, offset, length=1, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Narrow applies the narrow operation to its input as a module. It also supports a negative length in order to handle inputs with an unknown size.

>>> narrow = Narrow(1, 1, 1)
creating: createNarrow
class bigdl.nn.layer.NarrowTable(offset, length=1, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Creates a module that takes a table as input and outputs the subtable starting at index offset and having length elements (default is 1 element). The elements can be either a table or a Tensor. If length is negative, the selection runs from offset up to the element that is abs(length) positions from the end of the input.

Parameters:
  • offset – the start index of the table
  • length – the number of elements to select
>>> narrowTable = NarrowTable(1, 1)
creating: createNarrowTable
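The selection rule above can be sketched in plain Python (a hypothetical `narrow_table` helper, using 1-based indexing as BigDL does):

```python
def narrow_table(table, offset, length=1):
    """Select a subtable: `offset` is 1-based; a negative `length`
    selects from `offset` up to the element abs(length) positions
    from the end of the table."""
    if length < 0:
        # e.g. length = -1 selects through the last element
        end = len(table) + length + 1
    else:
        end = offset + length - 1
    return table[offset - 1:end]

narrow_table([10, 20, 30, 40], 2, 2)   # [20, 30]
narrow_table([10, 20, 30, 40], 2, -1)  # [20, 30, 40]
```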
class bigdl.nn.layer.Negative(inplace=False, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Creates a Negative layer, which computes the negative value of each element of the input tensor.

Parameters:inplace – whether the output tensor reuses the input tensor’s storage. Default value is false
>>> negative = Negative(False)
creating: createNegative
class bigdl.nn.layer.Node(jvalue, bigdl_type, *args)[source]

Bases: bigdl.util.common.JavaValue

Represents a node in a graph. The connections between nodes are directed.

element()[source]
classmethod of(jvalue, bigdl_type='float')[source]
class bigdl.nn.layer.Normalize(p, eps=1e-10, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Normalizes the input Tensor to have unit L_p norm. The smoothing parameter eps prevents division by zero when the input contains all zero elements (default = 1e-10). p can be as large as the maximum double value.

>>> normalize = Normalize(1e-5, 1e-5)
creating: createNormalize
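The effect of the layer can be sketched with NumPy (a simplified batch-of-vectors version; `eps` plays the same smoothing role as above):

```python
import numpy as np

def l_p_normalize(x, p, eps=1e-10):
    # Divide each row by its L_p norm; eps prevents division by zero
    # when a row is all zeros.
    norms = np.power(np.sum(np.abs(x) ** p, axis=1), 1.0 / p)
    return x / (norms[:, None] + eps)

y = l_p_normalize(np.array([[3.0, 4.0]]), p=2)  # row now has unit L2 norm
```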
class bigdl.nn.layer.PReLU(n_output_plane=0, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Applies the parametric ReLU, whose parameter varies the slope of the negative part.

PReLU: f(x) = max(0, x) + a * min(0, x)

The default value of nOutputPlane is 0, which uses the shared version of PReLU with only one parameter.

Note: please don’t use weight decay on this layer.

Parameters:n_output_plane – input map number. Default is 0.
>>> pReLU = PReLU(1)
creating: createPReLU
set_init_method(weight_init_method=None, bias_init_method=None)[source]
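The formula above can be sketched with NumPy (a hypothetical `prelu` helper; the slope `a` stands in for the learnable parameter):

```python
import numpy as np

def prelu(x, a=0.25):
    # f(x) = max(0, x) + a * min(0, x); `a` is the slope of the negative
    # part (a single shared value when n_output_plane is 0).
    return np.maximum(0.0, x) + a * np.minimum(0.0, x)

prelu(np.array([-2.0, 3.0]))  # array([-0.5,  3. ])
```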
class bigdl.nn.layer.Pack(dimension, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Stacks a list of n-dimensional tensors into one (n+1)-dimensional tensor.

>>> layer = Pack(1)
creating: createPack
class bigdl.nn.layer.Padding(dim, pad, n_input_dim, value=0.0, n_index=1, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

This module adds pad units of padding to dimension dim of the input. If pad is negative, padding is added to the left, otherwise, it is added to the right of the dimension.

The input to this layer is expected to be a tensor, or a batch of tensors; when using mini-batch, a batch of sample tensors will be passed to the layer and the user needs to specify the number of dimensions of each sample tensor in the batch using n_input_dim.

Parameters:
  • dim – the dimension on which to apply the padding operation
  • pad – number of pad units
  • n_input_dim – specify the number of dimensions that this module will receive. If it is greater than the dimension of the input tensors, the first dimension is considered as the batch size.
  • value – padding value
>>> padding = Padding(1, 1, 1, 1e-5, 1)
creating: createPadding
class bigdl.nn.layer.PairwiseDistance(norm=2, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

It is a module that takes a table of two vectors as input and outputs the distance between them using the p-norm. The input given in forward(input) is a [[Table]] that contains two tensors which must be either a vector (1D tensor) or matrix (2D tensor). If the input is a vector, it must have the size of inputSize. If it is a matrix, then each row is assumed to be an input sample of the given batch (the number of rows means the batch size and the number of columns should be equal to the inputSize).

Parameters:norm – the norm of distance
>>> pairwiseDistance = PairwiseDistance(2)
creating: createPairwiseDistance
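For the batch (2D) case, the p-norm distance described above can be sketched with NumPy:

```python
import numpy as np

def pairwise_distance(x1, x2, norm=2):
    # p-norm distance between corresponding rows of two 2D batches;
    # each row is one sample.
    return np.sum(np.abs(x1 - x2) ** norm, axis=1) ** (1.0 / norm)

pairwise_distance(np.array([[0.0, 0.0]]), np.array([[3.0, 4.0]]))  # array([5.])
```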
class bigdl.nn.layer.ParallelTable(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Container

It is a container module that applies the i-th member module to the i-th input, and outputs the results in the form of a Table

>>> parallelTable = ParallelTable()
creating: createParallelTable
class bigdl.nn.layer.Power(power, scale=1.0, shift=0.0, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Apply an element-wise power operation with scale and shift. f(x) = (shift + scale * x) ^ power

Parameters:
  • power – the exponent.
  • scale – Default is 1.
  • shift – Default is 0.
>>> power = Power(1e-5)
creating: createPower
class bigdl.nn.layer.RReLU(lower=0.125, upper=0.3333333333333333, inplace=False, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Applies the randomized leaky rectified linear unit (RReLU) element-wise to the input Tensor, thus outputting a Tensor of the same dimension. Informally the RReLU is also known as ‘insanity’ layer. RReLU is defined as:

f(x) = max(0,x) + a * min(0, x) where a ~ U(l, u).

In training mode negative inputs are multiplied by a factor drawn from a uniform random distribution U(l, u).

In evaluation mode a RReLU behaves like a LeakyReLU with a constant mean factor a = (l + u) / 2.

By default, l = 1/8 and u = 1/3. If l == u a RReLU effectively becomes a LeakyReLU.

Regardless of operating in in-place mode a RReLU will internally allocate an input-sized noise tensor to store random factors for negative inputs.

The backward() operation assumes that forward() has been called before.

For reference see [Empirical Evaluation of Rectified Activations in Convolutional Network]( http://arxiv.org/abs/1505.00853).

Parameters:
  • lower – lower boundary of uniform random distribution
  • upper – upper boundary of uniform random distribution
  • inplace – optionally do its operation in-place without using extra state memory
>>> rReLU = RReLU(1e-5, 1e5, True)
creating: createRReLU
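The evaluation-mode behaviour described above (a LeakyReLU with the constant mean slope) can be sketched with NumPy:

```python
import numpy as np

def rrelu_eval(x, lower=1.0 / 8, upper=1.0 / 3):
    # In evaluation mode RReLU behaves like a LeakyReLU with the
    # constant mean factor a = (lower + upper) / 2.
    a = (lower + upper) / 2
    return np.where(x >= 0, x, a * x)
```

In training mode, the slope for each negative input would instead be drawn from U(lower, upper).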
class bigdl.nn.layer.ReLU(ip=False, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Applies the rectified linear unit (ReLU) function element-wise to the input Tensor, thus outputting a Tensor of the same dimension.

ReLU is defined as: f(x) = max(0, x) Can optionally do its operation in-place without using extra state memory

>>> relu = ReLU()
creating: createReLU
class bigdl.nn.layer.ReLU6(inplace=False, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Same as ReLU except that the rectifying function f(x) saturates at x = 6

Parameters:inplace – either True = in-place or False = keeping separate state
>>> reLU6 = ReLU6(True)
creating: createReLU6
class bigdl.nn.layer.Recurrent(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Container

The Recurrent module is a container of RNN cells. Different types of RNN cells can be added using the add() function

>>> recurrent = Recurrent()
creating: createRecurrent
get_hidden_state()[source]

get hidden state and cell at last time step.

Returns:list of hidden state and cell
set_hidden_state(states)[source]

set hidden state and cell at first time step.

class bigdl.nn.layer.RecurrentDecoder(output_length, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Recurrent

RecurrentDecoder module is a container of rnn cells which is used to make a prediction of the next timestep based on the prediction made at the previous timestep. Input for RecurrentDecoder is dynamically composed during training: the input at t(i) is the output at t(i-1), the input at t(0) is the user input, and the user input has to be of shape batch x stepShape (the shape of the input at a single time step).

Different types of rnn cells can be added using add() function.

>>> recurrent_decoder = RecurrentDecoder(output_length = 5)
creating: createRecurrentDecoder
class bigdl.nn.layer.Replicate(n_features, dim=1, n_dim=2147483647, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Replicate repeats the input nFeatures times along its dim dimension. Note: no memory copy is performed; the stride along the dim-th dimension is set to zero.

Parameters:
  • n_features – replicate times.
  • dim – dimension to be replicated.
  • n_dim – specify the number of non-batch dimensions.
>>> replicate = Replicate(2)
creating: createReplicate
class bigdl.nn.layer.Reshape(size, batch_mode=None, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

forward(input) reshapes the input tensor into a size(0) * size(1) * … tensor, taking the elements row-wise.

Parameters:size – the reshape size
>>> reshape = Reshape([1, 28, 28])
creating: createReshape
>>> reshape = Reshape([1, 28, 28], False)
creating: createReshape
class bigdl.nn.layer.ResizeBilinear(output_height, output_width, align_corner, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Resize the input image with bilinear interpolation. The input image must be a float tensor with NHWC layout

Parameters:
  • output_height – output height
  • output_width – output width
  • align_corner – align corner or not
>>> resizeBilinear = ResizeBilinear(10, 20, False)
creating: createResizeBilinear
class bigdl.nn.layer.Reverse(dimension=1, is_inplace=False, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Reverse the input w.r.t given dimension. The input can be a Tensor or Table.

Parameters:dimension – the dimension to reverse
>>> reverse = Reverse()
creating: createReverse
>>> reverse = Reverse(1, False)
creating: createReverse
class bigdl.nn.layer.RnnCell(input_size, hidden_size, activation, isInputWithBias=True, isHiddenWithBias=True, wRegularizer=None, uRegularizer=None, bRegularizer=None, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

It is a simple RNN. User can pass an activation function to the RNN.

Parameters:
  • input_size – the size of each input vector
  • hidden_size – Hidden unit size in simple RNN
  • activation – activation function
  • isInputWithBias – boolean
  • isHiddenWithBias – boolean
  • wRegularizer – instance of [[Regularizer]](eg. L1 or L2 regularization), applied to the input weights matrices.
  • uRegularizer – instance [[Regularizer]](eg. L1 or L2 regularization), applied to the recurrent weights matrices.
  • bRegularizer – instance of [[Regularizer]](../regularizers.md),applied to the bias.
>>> reshape = RnnCell(4, 3, Tanh(), True, True, L1Regularizer(0.5), L1Regularizer(0.5), L1Regularizer(0.5))
creating: createTanh
creating: createL1Regularizer
creating: createL1Regularizer
creating: createL1Regularizer
creating: createRnnCell
class bigdl.nn.layer.RoiPooling(pooled_w, pooled_h, spatial_scale, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Region of interest pooling. RoiPooling uses max pooling to convert the features inside any valid region of interest into a small feature map with a fixed spatial extent of pooledH * pooledW (e.g., 7 * 7). An RoI is a rectangular window into a conv feature map, defined by a four-tuple (x1, y1, x2, y2) that specifies its top-left corner (x1, y1) and its bottom-right corner (x2, y2). RoI max pooling works by dividing the h * w RoI window into a pooledH * pooledW grid of sub-windows of approximate size h/pooledH * w/pooledW and then max-pooling the values in each sub-window into the corresponding output grid cell. Pooling is applied independently to each feature map channel.

Parameters:
  • pooled_w – spatial extent in width
  • pooled_h – spatial extent in height
  • spatial_scale – spatial scale
>>> import numpy as np
>>> input_data = np.random.rand(2,2,6,8)
>>> input_rois = np.array([0, 0, 0, 7, 5, 1, 6, 2, 7, 5, 1, 3, 1, 6, 4, 0, 3, 3, 3, 3],dtype='float64').reshape(4,5)
>>> m = RoiPooling(3,2,1.0)
creating: createRoiPooling
>>> out = m.forward([input_data,input_rois])
class bigdl.nn.layer.Scale(size, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Scale is the combination of CMul and CAdd. It computes the element-wise product of the input and the weight, with the shape of the weight “expanded” to match the shape of the input. Similarly, the bias is expanded and added element-wise.

Parameters:size – size of weight and bias
>>> scale = Scale([1,2])
creating: createScale
class bigdl.nn.layer.Select(dim, index, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

A Simple layer selecting an index of the input tensor in the given dimension

Parameters:
  • dim – the dimension to select
  • index – the index of the dimension to be selected
>>> select = Select(1, 1)
creating: createSelect
class bigdl.nn.layer.SelectTable(index, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Creates a module that takes a table as input and outputs the element at index index (positive or negative). This can be either a table or a Tensor. The gradients of the non-index elements are zeroed Tensors of the same size. This is true regardless of the depth of the encapsulated Tensor as the function used internally to do so is recursive.

Parameters:index – the index to be selected
>>> selectTable = SelectTable(1)
creating: createSelectTable
class bigdl.nn.layer.Sequential(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Container

Sequential provides a means to plug layers together in a feed-forward fully connected manner.

>>> echo = Echo()
creating: createEcho
>>> s = Sequential()
creating: createSequential
>>> s = s.add(echo)
>>> s = s.add(s)
>>> s = s.add(echo)
class bigdl.nn.layer.Sigmoid(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Applies the Sigmoid function element-wise to the input Tensor, thus outputting a Tensor of the same dimension.

>>> sigmoid = Sigmoid()
creating: createSigmoid
class bigdl.nn.layer.SoftMax(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Applies the SoftMax function to an n-dimensional input Tensor, rescaling them so that the elements of the n-dimensional output Tensor lie in the range (0, 1) and sum to 1. Softmax is defined as: f_i(x) = exp(x_i - shift) / sum_j exp(x_j - shift) where shift = max_i(x_i).

>>> softMax = SoftMax()
creating: createSoftMax
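The shifted definition above can be transcribed directly into NumPy (the 1D case; subtracting the max does not change the result but avoids overflow):

```python
import numpy as np

def softmax(x):
    # f_i(x) = exp(x_i - shift) / sum_j exp(x_j - shift), shift = max_i(x_i).
    e = np.exp(x - np.max(x))
    return e / np.sum(e)
```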
class bigdl.nn.layer.SoftMin(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Applies the SoftMin function to an n-dimensional input Tensor, rescaling them so that the elements of the n-dimensional output Tensor lie in the range (0,1) and sum to 1. Softmin is defined as: f_i(x) = exp(-x_i - shift) / sum_j exp(-x_j - shift) where shift = max_i(-x_i).

>>> softMin = SoftMin()
creating: createSoftMin
class bigdl.nn.layer.SoftPlus(beta=1.0, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Apply the SoftPlus function to an n-dimensional input tensor. SoftPlus function: f_i(x) = 1/beta * log(1 + exp(beta * x_i))

Parameters:beta – Controls sharpness of transfer function
>>> softPlus = SoftPlus(1e-5)
creating: createSoftPlus
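The SoftPlus function above can be sketched with NumPy; larger `beta` makes it a sharper approximation of ReLU:

```python
import numpy as np

def softplus(x, beta=1.0):
    # f(x) = 1/beta * log(1 + exp(beta * x)); log1p improves accuracy
    # for small arguments.
    return np.log1p(np.exp(beta * x)) / beta
```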
class bigdl.nn.layer.SoftShrink(the_lambda=0.5, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Apply the soft shrinkage function element-wise to the input Tensor

SoftShrinkage operator:

       | x - lambda, if x >  lambda
f(x) = | x + lambda, if x < -lambda
       | 0, otherwise
Parameters:the_lambda – lambda, default is 0.5
>>> softShrink = SoftShrink(1e-5)
creating: createSoftShrink
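The piecewise definition above can be sketched with NumPy:

```python
import numpy as np

def soft_shrink(x, the_lambda=0.5):
    # Shrinks values toward zero; anything in [-lambda, lambda] maps to 0.
    return np.where(x > the_lambda, x - the_lambda,
                    np.where(x < -the_lambda, x + the_lambda, 0.0))

soft_shrink(np.array([1.0, -1.0, 0.2]))  # array([ 0.5, -0.5,  0. ])
```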
class bigdl.nn.layer.SoftSign(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Apply SoftSign function to an n-dimensional input Tensor.

SoftSign function: f_i(x) = x_i / (1+|x_i|)

>>> softSign = SoftSign()
creating: createSoftSign
class bigdl.nn.layer.SparseJoinTable(dimension, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

:: Experimental

Sparse version of JoinTable. The backward pass simply passes the original gradOutput back to the next layers without splitting it, so this layer may only work in Wide&Deep-like models.

Parameters:dimension – to be join in this dimension
>>> joinTable = SparseJoinTable(1)
creating: createSparseJoinTable
class bigdl.nn.layer.SparseLinear(input_size, output_size, with_bias=True, backwardStart=-1, backwardLength=-1, wRegularizer=None, bRegularizer=None, init_weight=None, init_bias=None, init_grad_weight=None, init_grad_bias=None, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

SparseLinear is the sparse version of the Linear module. SparseLinear differs from Linear in two ways: first, SparseLinear’s input Tensor is a SparseTensor. Second, SparseLinear doesn’t backward the gradient to the next layer in backpropagation by default, as the gradInput of SparseLinear is useless and very big in most cases.

However, for models like Wide&Deep, backwardStart and backwardLength are provided to backward part of the gradient to the next layer.

Parameters:
  • input_size – the size of each input sample
  • output_size – the size of the module output of each sample
  • with_bias – whether to include a bias
  • backwardStart – backwardStart index, counting from 1
  • backwardLength – backward length
  • wRegularizer – instance of [[Regularizer]](eg. L1 or L2 regularization), applied to the input weights matrices.
  • bRegularizer – instance of [[Regularizer]] applied to the bias.
  • init_weight – the optional initial value for the weight
  • init_bias – the optional initial value for the bias
  • init_grad_weight – the optional initial value for the grad_weight
  • init_grad_bias – the optional initial value for the grad_bias

>>> sparselinear = SparseLinear(100, 10, True, wRegularizer=L1Regularizer(0.5), bRegularizer=L1Regularizer(0.5))
creating: createL1Regularizer
creating: createL1Regularizer
creating: createSparseLinear
>>> import numpy as np
>>> init_weight = np.random.randn(10, 100)
>>> init_bias = np.random.randn(10)
>>> init_grad_weight = np.zeros([10, 100])
>>> init_grad_bias = np.zeros([10])
>>> sparselinear = SparseLinear(100, 10, True, 1, 5, L1Regularizer(0.5), L1Regularizer(0.5), init_weight, init_bias, init_grad_weight, init_grad_bias)
creating: createL1Regularizer
creating: createL1Regularizer
creating: createSparseLinear
>>> np.random.seed(123)
>>> init_weight = np.random.randn(5, 1000)
>>> init_bias = np.random.randn(5)
>>> sparselinear = SparseLinear(1000, 5, init_weight=init_weight, init_bias=init_bias)
creating: createSparseLinear
>>> input = JTensor.sparse(np.array([1, 3, 5, 2, 4, 6]), np.array([0, 0, 0, 1, 1, 1, 1, 5, 300, 2, 100, 500]), np.array([2, 1000]))
>>> print(sparselinear.forward(input))
[[ 10.09569263 -10.94844246  -4.1086688    1.02527523  11.80737209]
[  7.9651413    9.7131443  -10.22719955   0.02345783  -3.74368906]]
set_init_method(weight_init_method=None, bias_init_method=None)[source]
class bigdl.nn.layer.SpatialAveragePooling(kw, kh, dw=1, dh=1, pad_w=0, pad_h=0, global_pooling=False, ceil_mode=False, count_include_pad=True, divide=True, format='NCHW', bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Applies 2D average-pooling operation in kWxkH regions by step size dWxdH steps. The number of output features is equal to the number of input planes.

When padW and padH are both -1, we use a padding algorithm similar to the “SAME” padding of tensorflow. That is

outHeight = Math.ceil(inHeight.toFloat/strideH.toFloat) outWidth = Math.ceil(inWidth.toFloat/strideW.toFloat)

padAlongHeight = Math.max(0, (outHeight - 1) * strideH + kernelH - inHeight) padAlongWidth = Math.max(0, (outWidth - 1) * strideW + kernelW - inWidth)

padTop = padAlongHeight / 2 padLeft = padAlongWidth / 2

Parameters:
  • kW – kernel width
  • kH – kernel height
  • dW – step width
  • dH – step height
  • padW – padding width
  • padH – padding height
  • global_pooling – if globalPooling, pool over the whole input by setting kH = input height and kW = input width
  • ceil_mode – whether the output size is to be ceiled or floored
  • count_include_pad – whether to include padding when dividing the number of elements in the pooling region
  • divide – whether to do the averaging
  • format – “NCHW” or “NHWC”, indicating the input data format

>>> spatialAveragePooling = SpatialAveragePooling(7,7)
creating: createSpatialAveragePooling
>>> spatialAveragePooling = SpatialAveragePooling(2, 2, 2, 2, -1, -1, True, format="NHWC")
creating: createSpatialAveragePooling
set_weights(weights)[source]
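The “SAME” padding arithmetic above can be transcribed into a small helper (a hypothetical `same_pad`, following the formulas in the docstring):

```python
import math

def same_pad(in_h, in_w, k_h, k_w, stride_h, stride_w):
    # Output sizes and top/left padding for "SAME"-style pooling.
    out_h = math.ceil(in_h / stride_h)
    out_w = math.ceil(in_w / stride_w)
    pad_along_h = max(0, (out_h - 1) * stride_h + k_h - in_h)
    pad_along_w = max(0, (out_w - 1) * stride_w + k_w - in_w)
    return out_h, out_w, pad_along_h // 2, pad_along_w // 2

same_pad(5, 5, 2, 2, 2, 2)  # (3, 3, 0, 0)
```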
class bigdl.nn.layer.SpatialBatchNormalization(n_output, eps=1e-05, momentum=0.1, affine=True, init_weight=None, init_bias=None, init_grad_weight=None, init_grad_bias=None, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

This layer implements Batch Normalization as described in the paper: “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift” by Sergey Ioffe and Christian Szegedy. This implementation is useful for inputs coming from convolution layers. For non-convolutional layers, see [[BatchNormalization]]. The operation implemented is:

      ( x - mean(x) )
y = -------------------- * gamma + beta
   standard-deviation(x)

where gamma and beta are learnable parameters. The learning of gamma and beta is optional.

>>> spatialBatchNormalization = SpatialBatchNormalization(1)
creating: createSpatialBatchNormalization
>>> import numpy as np
>>> init_weight = np.array([1.0])
>>> init_grad_weight = np.array([0.0])
>>> init_bias = np.array([0.0])
>>> init_grad_bias = np.array([0.0])
>>> spatialBatchNormalization = SpatialBatchNormalization(1, 1e-5, 0.1, True, init_weight, init_bias, init_grad_weight, init_grad_bias)
creating: createSpatialBatchNormalization
set_init_method(weight_init_method=None, bias_init_method=None)[source]
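The normalization above can be sketched with NumPy for NCHW inputs (a simplified version with fixed, non-learned gamma and beta):

```python
import numpy as np

def spatial_batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    # Normalize each channel over the batch and spatial dimensions (NCHW).
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps) * gamma + beta

np.random.seed(42)
y = spatial_batch_norm(np.random.randn(2, 1, 4, 4))
# per channel, y now has roughly zero mean and unit variance
```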
class bigdl.nn.layer.SpatialContrastiveNormalization(n_input_plane=1, kernel=None, threshold=0.0001, thresval=0.0001, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Subtractive + divisive contrast normalization.

Parameters:
  • n_input_plane
  • kernel
  • threshold
  • thresval
>>> kernel = np.ones([9,9]).astype("float32")
>>> spatialContrastiveNormalization = SpatialContrastiveNormalization(1, kernel)
creating: createSpatialContrastiveNormalization
>>> spatialContrastiveNormalization = SpatialContrastiveNormalization()
creating: createSpatialContrastiveNormalization
class bigdl.nn.layer.SpatialConvolution(n_input_plane, n_output_plane, kernel_w, kernel_h, stride_w=1, stride_h=1, pad_w=0, pad_h=0, n_group=1, propagate_back=True, wRegularizer=None, bRegularizer=None, init_weight=None, init_bias=None, init_grad_weight=None, init_grad_bias=None, with_bias=True, data_format='NCHW', bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Applies a 2D convolution over an input image composed of several input planes. The input tensor in forward(input) is expected to be a 3D tensor (nInputPlane x height x width).

Parameters:
  • n_input_plane – the number of expected input planes in the image given into forward()
  • n_output_plane – the number of output planes the convolution layer will produce
  • kernel_w – the kernel width of the convolution
  • kernel_h – the kernel height of the convolution
  • stride_w – the step of the convolution in the width dimension
  • stride_h – the step of the convolution in the height dimension
  • pad_w – the additional zeros added per width to the input planes
  • pad_h – the additional zeros added per height to the input planes
  • n_group – kernel group number
  • propagate_back – whether to propagate gradient back
  • wRegularizer – instance of [[Regularizer]](eg. L1 or L2 regularization), applied to the input weights matrices.
  • bRegularizer – instance of [[Regularizer]] applied to the bias.
  • init_weight – the optional initial value for the weight
  • init_bias – the optional initial value for the bias
  • init_grad_weight – the optional initial value for the grad_weight
  • init_grad_bias – the optional initial value for the grad_bias
  • with_bias – whether a bias is needed
  • data_format – a string value of “NHWC” or “NCHW” to specify the input data format of this layer. In “NHWC” format data is stored in the order of [batch_size, height, width, channels]; in “NCHW” format data is stored in the order of [batch_size, channels, height, width].

>>> spatialConvolution = SpatialConvolution(6, 12, 5, 5)
creating: createSpatialConvolution
>>> spatialConvolution.setWRegularizer(L1Regularizer(0.5))
creating: createL1Regularizer
>>> spatialConvolution.setBRegularizer(L1Regularizer(0.5))
creating: createL1Regularizer
>>> import numpy as np
>>> init_weight = np.random.randn(1, 12, 6, 5, 5)
>>> init_bias = np.random.randn(12)
>>> init_grad_weight = np.zeros([1, 12, 6, 5, 5])
>>> init_grad_bias = np.zeros([12])
>>> spatialConvolution = SpatialConvolution(6, 12, 5, 5, 1, 1, 0, 0, 1, True, L1Regularizer(0.5), L1Regularizer(0.5), init_weight, init_bias, init_grad_weight, init_grad_bias, True, "NCHW")
creating: createL1Regularizer
creating: createL1Regularizer
creating: createSpatialConvolution
set_init_method(weight_init_method=None, bias_init_method=None)[source]
class bigdl.nn.layer.SpatialConvolutionMap(conn_table, kw, kh, dw=1, dh=1, pad_w=0, pad_h=0, wRegularizer=None, bRegularizer=None, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

This class is a generalization of SpatialConvolution. It uses a generic connection table between input and output features. The SpatialConvolution is equivalent to using a full connection table.

When padW and padH are both -1, we use a padding algorithm similar to the “SAME” padding of tensorflow. That is

outHeight = Math.ceil(inHeight.toFloat/strideH.toFloat) outWidth = Math.ceil(inWidth.toFloat/strideW.toFloat)

padAlongHeight = Math.max(0, (outHeight - 1) * strideH + kernelH - inHeight) padAlongWidth = Math.max(0, (outWidth - 1) * strideW + kernelW - inWidth)

padTop = padAlongHeight / 2 padLeft = padAlongWidth / 2

Parameters:
  • wRegularizer – instance of [[Regularizer]](eg. L1 or L2 regularization), applied to the input weights matrices.
  • bRegularizer – instance of [[Regularizer]]applied to the bias.
>>> ct = np.ones([9,9]).astype("float32")
>>> spatialConvolutionMap = SpatialConvolutionMap(ct, 9, 9)
creating: createSpatialConvolutionMap
class bigdl.nn.layer.SpatialCrossMapLRN(size=5, alpha=1.0, beta=0.75, k=1.0, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Applies Spatial Local Response Normalization between different feature maps. The operation implemented is:

                             x_f
y_f =  -------------------------------------------------
        (k+(alpha/size)* sum_{l=l1 to l2} (x_l^2^))^beta^

where x_f is the input at spatial locations h,w (not shown for simplicity) and feature map f, l1 corresponds to max(0,f-ceil(size/2)) and l2 to min(F, f-ceil(size/2) + size). Here, F is the number of feature maps.

Parameters:
  • size – the number of channels to sum over
  • alpha – the scaling parameter
  • beta – the exponent
  • k – a constant
>>> spatialCrossMapLRN = SpatialCrossMapLRN()
creating: createSpatialCrossMapLRN
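The normalization above can be sketched with NumPy for a single sample (following the window definition l1 = max(0, f - ceil(size/2)), l2 = min(F, f - ceil(size/2) + size) given in the docstring):

```python
import math
import numpy as np

def spatial_cross_map_lrn(x, size=5, alpha=1.0, beta=0.75, k=1.0):
    # x: (F, H, W). Normalize each feature map f by a sum of squares
    # over `size` neighbouring channels.
    F = x.shape[0]
    out = np.empty_like(x)
    for f in range(F):
        l1 = max(0, f - math.ceil(size / 2))
        l2 = min(F, f - math.ceil(size / 2) + size)
        denom = (k + (alpha / size) * np.sum(x[l1:l2] ** 2, axis=0)) ** beta
        out[f] = x[f] / denom
    return out
```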
class bigdl.nn.layer.SpatialDilatedConvolution(n_input_plane, n_output_plane, kw, kh, dw=1, dh=1, pad_w=0, pad_h=0, dilation_w=1, dilation_h=1, wRegularizer=None, bRegularizer=None, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Apply a 2D dilated convolution over an input image.

The input tensor is expected to be a 3D or 4D(with batch) tensor.

If input is a 3D tensor nInputPlane x height x width, owidth = floor(width + 2 * padW - dilationW * (kW-1) - 1) / dW + 1 oheight = floor(height + 2 * padH - dilationH * (kH-1) - 1) / dH + 1

Reference Paper: Yu F, Koltun V. Multi-scale context aggregation by dilated convolutions[J]. arXiv preprint arXiv:1511.07122, 2015.

Parameters:
  • n_input_plane – The number of expected input planes in the image given into forward().
  • n_output_plane – The number of output planes the convolution layer will produce.
  • kw – The kernel width of the convolution.
  • kh – The kernel height of the convolution.
  • dw – The step of the convolution in the width dimension. Default is 1.
  • dh – The step of the convolution in the height dimension. Default is 1.
  • pad_w – The additional zeros added per width to the input planes. Default is 0.
  • pad_h – The additional zeros added per height to the input planes. Default is 0.
  • dilation_w – The number of pixels to skip. Default is 1.
  • dilation_h – The number of pixels to skip. Default is 1.
  • init_method – Init method, Default, Xavier.
  • wRegularizer – instance of [[Regularizer]](eg. L1 or L2 regularization), applied to the input weights matrices.
  • bRegularizer – instance of [[Regularizer]]applied to the bias.
>>> spatialDilatedConvolution = SpatialDilatedConvolution(1, 1, 1, 1)
creating: createSpatialDilatedConvolution
set_init_method(weight_init_method=None, bias_init_method=None)[source]
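The output-size formula above can be checked with a small helper; a dilated kernel of size k spans dilation * (k - 1) + 1 input positions:

```python
import math

def dilated_conv_out_size(size, pad, dilation, k, stride):
    # owidth/oheight formula from the docstring above.
    return math.floor((size + 2 * pad - dilation * (k - 1) - 1) / stride) + 1

dilated_conv_out_size(7, 0, 2, 3, 1)  # 3: a dilation-2 3-kernel spans 5 pixels
```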
class bigdl.nn.layer.SpatialDivisiveNormalization(n_input_plane=1, kernel=None, threshold=0.0001, thresval=0.0001, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Applies a spatial division operation on a series of 2D inputs using kernel for computing the weighted average in a neighborhood. The neighborhood is defined for a local spatial region that is the size as kernel and across all features. For an input image, since there is only one feature, the region is only spatial. For an RGB image, the weighted average is taken over RGB channels and a spatial region.

If the kernel is 1D, then it will be used for constructing a separable 2D kernel. The operations will be much more efficient in this case.

The kernel is generally chosen as a gaussian when it is believed that the correlation of two pixel locations decrease with increasing distance. On the feature dimension, a uniform average is used since the weighting across features is not known.

Parameters:
  • nInputPlane – number of input plane, default is 1.
  • kernel – kernel tensor, default is a 9 x 9 tensor.
  • threshold – threshold
  • thresval – threshold value to replace with if data is smaller than threshold
>>> kernel = np.ones([9,9]).astype("float32")
>>> spatialDivisiveNormalization = SpatialDivisiveNormalization(2,kernel)
creating: createSpatialDivisiveNormalization
>>> spatialDivisiveNormalization = SpatialDivisiveNormalization()
creating: createSpatialDivisiveNormalization
class bigdl.nn.layer.SpatialFullConvolution(n_input_plane, n_output_plane, kw, kh, dw=1, dh=1, pad_w=0, pad_h=0, adj_w=0, adj_h=0, n_group=1, no_bias=False, wRegularizer=None, bRegularizer=None, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Apply a 2D full convolution over an input image. The input tensor is expected to be a 3D or 4D(with batch) tensor. Note that instead of setting adjW and adjH, SpatialFullConvolution[Table, T] also accepts a table input with two tensors: T(convInput, sizeTensor) where convInput is the standard input tensor, and the size of sizeTensor is used to set the size of the output (will ignore the adjW and adjH values used to construct the module). This module can be used without a bias by setting parameter noBias = true while constructing the module.

If input is a 3D tensor nInputPlane x height x width, owidth = (width - 1) * dW - 2*padW + kW + adjW oheight = (height - 1) * dH - 2*padH + kH + adjH

Other frameworks call this operation “In-network Upsampling”, “Fractionally-strided convolution”, “Backwards Convolution,” “Deconvolution”, or “Upconvolution.”

Reference Paper: Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 3431-3440.

Parameters:
  • nInputPlane – the number of expected input planes in the image given into forward()
  • nOutputPlane – the number of output planes the convolution layer will produce
  • kW – the kernel width of the convolution
  • kH – the kernel height of the convolution
  • dW – the step of the convolution in the width dimension. Default is 1.
  • dH – the step of the convolution in the height dimension. Default is 1.
  • padW – the additional zeros added per width to the input planes. Default is 0.
  • padH – the additional zeros added per height to the input planes. Default is 0.
  • adjW – extra width to add to the output image. Default is 0.
  • adjH – extra height to add to the output image. Default is 0.
  • nGroup – kernel group number
  • noBias – whether a bias is needed
  • initMethod – init method: Default, Xavier, or Bilinear
  • wRegularizer – instance of [[Regularizer]] (eg. L1 or L2 regularization), applied to the input weights matrices
  • bRegularizer – instance of [[Regularizer]] applied to the bias

>>> spatialFullConvolution = SpatialFullConvolution(1, 1, 1, 1)
creating: createSpatialFullConvolution
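The output-size formula above can be checked with a small helper (plain Python, not part of the BigDL API; the function name is illustrative):

```python
def full_conv_out_size(size, k, d=1, pad=0, adj=0):
    # owidth/oheight = (size - 1) * stride - 2*pad + kernel + adj
    return (size - 1) * d - 2 * pad + k + adj

# A 4x4 input with a 3x3 kernel and stride 2 produces a 9x9 output.
out = full_conv_out_size(4, k=3, d=2)  # 9
```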
set_init_method(weight_init_method=None, bias_init_method=None)[source]
class bigdl.nn.layer.SpatialMaxPooling(kw, kh, dw, dh, pad_w=0, pad_h=0, to_ceil=False, format='NCHW', bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Applies a 2D max-pooling operation in kWxkH regions by step size dWxdH. The number of output features is equal to the number of input planes. If the input image is a 3D tensor nInputPlane x height x width, the output image size will be nOutputPlane x oheight x owidth, where
owidth = op((width + 2*padW - kW) / dW + 1)
oheight = op((height + 2*padH - kH) / dH + 1)
op is a rounding operator; by default it is floor. It can be changed by calling the :ceil() or :floor() methods.

When padW and padH are both -1, we use a padding algorithm similar to the “SAME” padding of tensorflow. That is

outHeight = Math.ceil(inHeight.toFloat/strideH.toFloat)
outWidth = Math.ceil(inWidth.toFloat/strideW.toFloat)

padAlongHeight = Math.max(0, (outHeight - 1) * strideH + kernelH - inHeight)
padAlongWidth = Math.max(0, (outWidth - 1) * strideW + kernelW - inWidth)

padTop = padAlongHeight / 2
padLeft = padAlongWidth / 2
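The “SAME” padding computation above can be sketched in plain Python (illustrative only, not the BigDL implementation):

```python
import math

def same_pad(in_size, kernel, stride):
    # outSize = ceil(inSize / stride), then pad so the windows fit,
    # following the formulas above
    out = math.ceil(in_size / stride)
    pad_along = max(0, (out - 1) * stride + kernel - in_size)
    return out, pad_along // 2  # (output size, padding on the leading side)

out_h, pad_top = same_pad(7, kernel=3, stride=2)  # (4, 1)
```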

Parameters:
  • kW – kernel width
  • kH – kernel height
  • dW – step size in width
  • dH – step size in height
  • padW – padding in width
  • padH – padding in height
  • format – “NCHW” or “NHWC”, indicating the input data format
>>> spatialMaxPooling = SpatialMaxPooling(2, 2, 2, 2)
creating: createSpatialMaxPooling
>>> spatialMaxPooling = SpatialMaxPooling(2, 2, 2, 2, -1, -1, True, "NHWC")
creating: createSpatialMaxPooling
class bigdl.nn.layer.SpatialShareConvolution(n_input_plane, n_output_plane, kernel_w, kernel_h, stride_w=1, stride_h=1, pad_w=0, pad_h=0, n_group=1, propagate_back=True, wRegularizer=None, bRegularizer=None, init_weight=None, init_bias=None, init_grad_weight=None, init_grad_bias=None, with_bias=True, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

>>> spatialShareConvolution = SpatialShareConvolution(1, 1, 1, 1)
creating: createSpatialShareConvolution
>>> import numpy as np
>>> init_weight = np.random.randn(1, 12, 6, 5, 5)
>>> init_bias = np.random.randn(12)
>>> init_grad_weight = np.zeros([1, 12, 6, 5, 5])
>>> init_grad_bias = np.zeros([12])
>>> conv = SpatialShareConvolution(6, 12, 5, 5, 1, 1, 0, 0, 1, True, L1Regularizer(0.5), L1Regularizer(0.5), init_weight, init_bias, init_grad_weight, init_grad_bias)
creating: createL1Regularizer
creating: createL1Regularizer
creating: createSpatialShareConvolution
set_init_method(weight_init_method=None, bias_init_method=None)[source]
class bigdl.nn.layer.SpatialSubtractiveNormalization(n_input_plane=1, kernel=None, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Applies a spatial subtraction operation on a series of 2D inputs, using a kernel to compute the weighted average in a neighborhood. The neighborhood is defined for a local spatial region that is the same size as the kernel, and across all features. For an input image with only one feature, the region is only spatial. For an RGB image, the weighted average is taken over the RGB channels and a spatial region.

If the kernel is 1D, it will be used to construct a separable 2D kernel. The operations are much more efficient in this case.

The kernel is generally chosen as a Gaussian when it is believed that the correlation of two pixel locations decreases with increasing distance. On the feature dimension, a uniform average is used since the weighting across features is not known.

Parameters:
  • n_input_plane – number of input plane, default is 1.
  • kernel – kernel tensor, default is a 9 x 9 tensor.
>>> kernel = np.ones([9,9]).astype("float32")
>>> spatialSubtractiveNormalization = SpatialSubtractiveNormalization(2,kernel)
creating: createSpatialSubtractiveNormalization
>>> spatialSubtractiveNormalization = SpatialSubtractiveNormalization()
creating: createSpatialSubtractiveNormalization
class bigdl.nn.layer.SpatialWithinChannelLRN(size=5, alpha=1.0, beta=0.75, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

The local response normalization layer performs a kind of lateral inhibition by normalizing over local input regions. The local regions extend spatially in separate channels (i.e., they have shape 1 x local_size x local_size).

Parameters:
  • size – the side length of the square region to sum over
  • alpha – the scaling parameter
  • beta – the exponent

>>> layer = SpatialWithinChannelLRN()
creating: createSpatialWithinChannelLRN
class bigdl.nn.layer.SpatialZeroPadding(pad_left, pad_right, pad_top, pad_bottom, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Each feature map of a given input is padded with the specified number of zeros. If a padding value is negative, the input is cropped instead.

Parameters:
  • padLeft – pad left position
  • padRight – pad right position
  • padTop – pad top position
  • padBottom – pad bottom position
>>> spatialZeroPadding = SpatialZeroPadding(1, 1, 1, 1)
creating: createSpatialZeroPadding
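For positive padding values, the effect on a single feature map can be sketched with NumPy (illustrative; negative values would crop instead):

```python
import numpy as np

fmap = np.arange(4.0).reshape(2, 2)
# ((top, bottom), (left, right)) zero padding of 1 on every side
padded = np.pad(fmap, ((1, 1), (1, 1)), mode="constant")
# padded.shape == (4, 4), with the original map in the interior
```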
class bigdl.nn.layer.SplitTable(dimension, n_input_dims=-1, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Creates a module that takes a Tensor as input and outputs several tables, splitting the Tensor along the specified dimension. Note that dimension indices start from 1.

The input to this layer is expected to be a tensor, or a batch of tensors; when using mini-batch, a batch of sample tensors will be passed to the layer and the user needs to specify the number of dimensions of each sample tensor in a batch using nInputDims.

Parameters:
  • dimension – to be split along this dimension
  • n_input_dims – specify the number of dimensions that this module will receive. If it is more than the dimension of the input tensors, the first dimension would be considered as the batch size
>>> splitTable = SplitTable(1, 1)
creating: createSplitTable
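What SplitTable does can be sketched with NumPy (note NumPy axes are 0-based, while the layer's dimension argument is 1-based):

```python
import numpy as np

x = np.arange(6).reshape(3, 2)
# Splitting along the first dimension yields three 1-D tensors of length 2
parts = [np.squeeze(p, axis=0) for p in np.split(x, 3, axis=0)]
```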
class bigdl.nn.layer.Sqrt(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Apply an element-wise sqrt operation.

>>> sqrt = Sqrt()
creating: createSqrt
class bigdl.nn.layer.Square(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Apply an element-wise square operation.

>>> square = Square()
creating: createSquare
class bigdl.nn.layer.Squeeze(dim, num_input_dims=-2147483648, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Delete all singleton dimensions, or a specific singleton dimension.

Parameters:
  • dim – Optional. The dimension to be deleted. Default: delete all singleton dimensions.
  • num_input_dims – Optional. If in batch mode, set to the inputDims.
>>> squeeze = Squeeze(1)
creating: createSqueeze
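The squeeze semantics can be sketched with NumPy (0-based axes here, versus the layer's 1-based dim):

```python
import numpy as np

x = np.zeros((1, 3, 1))
all_squeezed = np.squeeze(x)          # all singleton dims removed -> shape (3,)
one_squeezed = np.squeeze(x, axis=0)  # only the first dim, like Squeeze(1) -> (3, 1)
```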
class bigdl.nn.layer.Sum(dimension=1, n_input_dims=-1, size_average=False, squeeze=True, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

It is a simple layer that applies a sum operation over the given dimension. When nInputDims is provided, the input will be treated as batches, and the sum operation will be applied along (dimension + 1). The input to this layer is expected to be a tensor, or a batch of tensors; when using mini-batch, a batch of sample tensors will be passed to the layer and the user needs to specify the number of dimensions of each sample tensor in the batch using nInputDims.

Parameters:
  • dimension – the dimension to be applied sum operation
  • n_input_dims – specify the number of dimensions that this module will receive. If it is more than the dimension of the input tensors, the first dimension would be considered as the batch size
  • size_average – default is false, if it is true, it will return the mean instead
  • squeeze – default is true, which will squeeze the sum dimension; set it to false to keep the sum dimension
>>> sum = Sum(1, 1, True, True)
creating: createSum
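The effect of the dimension, size_average, and squeeze arguments can be sketched with NumPy (0-based axes here):

```python
import numpy as np

x = np.arange(6.0).reshape(2, 3)
summed = x.sum(axis=0)               # like Sum(1): shape (3,)
kept = x.sum(axis=0, keepdims=True)  # squeeze=False keeps the dim: shape (1, 3)
mean = x.mean(axis=0)                # size_average=True returns the mean
```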
class bigdl.nn.layer.Tanh(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Applies the Tanh function element-wise to the input Tensor, thus outputting a Tensor of the same dimension. Tanh is defined as f(x) = (exp(x)-exp(-x))/(exp(x)+exp(-x)).

>>> tanh = Tanh()
creating: createTanh
class bigdl.nn.layer.TanhShrink(bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

A simple layer that, for each element of the input tensor, applies the following operation during the forward pass: f(x) = x - tanh(x).

>>> tanhShrink = TanhShrink()
creating: createTanhShrink
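TanhShrink is conventionally defined (as in Torch) as f(x) = x - tanh(x); a NumPy sketch:

```python
import numpy as np

x = np.array([-1.0, 0.0, 1.0])
y = x - np.tanh(x)  # element-wise tanh-shrink
# y is zero at x = 0, and y is an odd function of x
```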
class bigdl.nn.layer.TemporalConvolution(input_frame_size, output_frame_size, kernel_w, stride_w=1, propagate_back=True, weight_regularizer=None, bias_regularizer=None, init_weight=None, init_bias=None, init_grad_weight=None, init_grad_bias=None, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Applies a 1D convolution over an input sequence composed of nInputFrame frames. The input tensor in forward(input) is expected to be a 2D tensor (nInputFrame x inputFrameSize) or a 3D tensor (nBatchFrame x nInputFrame x inputFrameSize).

Parameters:
  • input_frame_size – the input frame size expected in sequences given into forward()
  • output_frame_size – the output frame size the convolution layer will produce
  • kernel_w – the kernel width of the convolution
  • stride_w – the step of the convolution in the width dimension
  • propagate_back – whether to propagate gradients back; default is True
  • weight_regularizer – instance of [[Regularizer]] (eg. L1 or L2 regularization), applied to the input weights matrices
  • bias_regularizer – instance of [[Regularizer]] applied to the bias
  • init_weight – initial weight
  • init_bias – initial bias
  • init_grad_weight – initial gradient weight
  • init_grad_bias – initial gradient bias

>>> temporalConvolution = TemporalConvolution(6, 12, 5, 5)
creating: createTemporalConvolution
>>> temporalConvolution.setWRegularizer(L1Regularizer(0.5))
creating: createL1Regularizer
>>> temporalConvolution.setBRegularizer(L1Regularizer(0.5))
creating: createL1Regularizer
set_init_method(weight_init_method=None, bias_init_method=None)[source]
class bigdl.nn.layer.TemporalMaxPooling(k_w, d_w, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Applies 1D max-pooling operation in kW regions by step size dW steps. Input sequence composed of nInputFrame frames. The input tensor in forward(input) is expected to be a 2D tensor (nInputFrame x inputFrameSize) or a 3D tensor (nBatchFrame x nInputFrame x inputFrameSize).

If the input sequence is a 2D tensor of dimension nInputFrame x inputFrameSize, the output sequence will be nOutputFrame x inputFrameSize where

nOutputFrame = (nInputFrame - k_w) / d_w + 1

Parameters:
  • k_w – kernel width
  • d_w – step size in width
>>> temporalMaxPooling = TemporalMaxPooling(2, 2)
creating: createTemporalMaxPooling
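The output-frame formula above in plain Python (an illustrative helper, not part of the API):

```python
def n_output_frame(n_input_frame, k_w, d_w):
    # nOutputFrame = (nInputFrame - k_w) / d_w + 1, with integer division
    return (n_input_frame - k_w) // d_w + 1

frames = n_output_frame(10, k_w=2, d_w=2)  # 5
```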
class bigdl.nn.layer.Threshold(th=1e-06, v=0.0, ip=False, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Thresholds the input Tensor. If a value in the Tensor is smaller than th, it is replaced with v.

Parameters:
  • th – the threshold to compare with
  • v – the value to replace with
  • ip – inplace mode
>>> threshold = Threshold(1e-5, 1e-5, True)
creating: createThreshold
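The (out-of-place) threshold semantics sketched with NumPy:

```python
import numpy as np

x = np.array([-2.0, 0.5, 3.0])
th, v = 1.0, 0.0
y = np.where(x < th, v, x)  # values below th are replaced with v
```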
class bigdl.nn.layer.TimeDistributed(model, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

This layer applies the contained layer to each time slice of the input tensor.

For instance, the TimeDistributed layer can feed each time slice of the input tensor to a Linear layer.

The input data format is [Batch, Time, Other dims]. The contained layer must not change the length of the other dims.

>>> td = TimeDistributed(Linear(2, 3))
creating: createLinear
creating: createTimeDistributed
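The usual trick behind a time-distributed layer is to merge the batch and time dimensions before applying the inner operation, then restore them; a NumPy sketch with a hypothetical Linear(2, 3) weight:

```python
import numpy as np

batch, time, feat = 4, 5, 2
x = np.random.randn(batch, time, feat)
w = np.random.randn(feat, 3)  # stand-in for a Linear(2, 3) weight
y = (x.reshape(-1, feat) @ w).reshape(batch, time, 3)
# the other dims changed from 2 to 3, but Batch and Time are preserved
```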
class bigdl.nn.layer.Transpose(permutations, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Transpose input along specified dimensions

Parameters: permutations – dimension pairs to be swapped
>>> transpose = Transpose([(1,2)])
creating: createTranspose
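A NumPy sketch of a single swap (the layer's dimensions are 1-based, NumPy's 0-based, so Transpose([(1,2)]) corresponds to swapping axes 0 and 1):

```python
import numpy as np

x = np.zeros((2, 3, 4))
y = np.swapaxes(x, 0, 1)  # shape becomes (3, 2, 4)
```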
class bigdl.nn.layer.Unsqueeze(pos, num_input_dims=-2147483648, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Create an Unsqueeze layer. Insert singleton dim (i.e., dimension 1) at position pos. For an input with dim = input.dim(), there are dim + 1 possible positions to insert the singleton dimension.

Parameters:
  • pos – the position at which to insert the singleton dimension
  • num_input_dims – Optional. If in batch mode, set to the inputDim.
>>> unsqueeze = Unsqueeze(1, 1)
creating: createUnsqueeze
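A NumPy sketch (Unsqueeze(1) corresponds to inserting at 0-based axis 0):

```python
import numpy as np

x = np.zeros((3, 4))
y = np.expand_dims(x, axis=0)  # shape (1, 3, 4)
```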
class bigdl.nn.layer.View(sizes, num_input_dims=0, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

This module creates a new view of the input tensor using the sizes passed to the constructor. The method setNumInputDims() allows specifying the expected number of dimensions of the inputs of the module. This makes it possible to use minibatch inputs by using a size of -1 for one of the dimensions.

Parameters: sizes – the sizes used to create the new view
>>> view = View([1024,2])
creating: createView
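The view semantics can be sketched with NumPy's reshape, where -1 lets one dimension be inferred, which is what makes minibatch inputs work:

```python
import numpy as np

x = np.zeros((4, 1024, 2))   # a batch of 4 samples
y = x.reshape(-1, 1024 * 2)  # batch dim inferred: shape (4, 2048)
```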
class bigdl.nn.layer.VolumetricConvolution(n_input_plane, n_output_plane, k_t, k_w, k_h, d_t=1, d_w=1, d_h=1, pad_t=0, pad_w=0, pad_h=0, with_bias=True, wRegularizer=None, bRegularizer=None, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Applies a 3D convolution over an input image composed of several input planes. The input tensor in forward(input) is expected to be a 4D tensor (nInputPlane x time x height x width).

Parameters:
  • n_input_plane – The number of expected input planes in the image given into forward()
  • n_output_plane – The number of output planes the convolution layer will produce.
  • k_t – The kernel size of the convolution in time
  • k_w – The kernel width of the convolution
  • k_h – The kernel height of the convolution
  • d_t – The step of the convolution in the time dimension. Default is 1
  • d_w – The step of the convolution in the width dimension. Default is 1
  • d_h – The step of the convolution in the height dimension. Default is 1
  • pad_t – Additional zeros added to the input plane data on both sides of the time axis. Default is 0. (kT-1)/2 is often used here.
  • pad_w – The additional zeros added per width to the input planes.
  • pad_h – The additional zeros added per height to the input planes.
  • with_bias – whether with bias
  • wRegularizer – instance of [[Regularizer]] (eg. L1 or L2 regularization), applied to the input weights matrices.
  • bRegularizer – instance of [[Regularizer]] applied to the bias.
>>> volumetricConvolution = VolumetricConvolution(6, 12, 5, 5, 5, 1, 1, 1)
creating: createVolumetricConvolution
set_init_method(weight_init_method=None, bias_init_method=None)[source]
class bigdl.nn.layer.VolumetricFullConvolution(n_input_plane, n_output_plane, kt, kw, kh, dt=1, dw=1, dh=1, pad_t=0, pad_w=0, pad_h=0, adj_t=0, adj_w=0, adj_h=0, n_group=1, no_bias=False, wRegularizer=None, bRegularizer=None, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Apply a 3D full convolution over an 3D input image, a sequence of images, or a video etc. The input tensor is expected to be a 4D or 5D(with batch) tensor. Note that instead of setting adjT, adjW and adjH, VolumetricFullConvolution also accepts a table input with two tensors: T(convInput, sizeTensor) where convInput is the standard input tensor, and the size of sizeTensor is used to set the size of the output (will ignore the adjT, adjW and adjH values used to construct the module). This module can be used without a bias by setting parameter noBias = true while constructing the module.

If the input is a 4D tensor nInputPlane x depth x height x width:
odepth = (depth - 1) * dT - 2*padT + kT + adjT
owidth = (width - 1) * dW - 2*padW + kW + adjW
oheight = (height - 1) * dH - 2*padH + kH + adjH

Other frameworks call this operation “In-network Upsampling”, “Fractionally-strided convolution”, “Backwards Convolution”, “Deconvolution”, or “Upconvolution”.

Reference Paper: Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 3431-3440.

Parameters:
  • nInputPlane – the number of expected input planes in the image given into forward()
  • nOutputPlane – the number of output planes the convolution layer will produce
  • kT – the kernel depth of the convolution
  • kW – the kernel width of the convolution
  • kH – the kernel height of the convolution
  • dT – the step of the convolution in the depth dimension. Default is 1.
  • dW – the step of the convolution in the width dimension. Default is 1.
  • dH – the step of the convolution in the height dimension. Default is 1.
  • padT – the additional zeros added per depth to the input planes. Default is 0.
  • padW – the additional zeros added per width to the input planes. Default is 0.
  • padH – the additional zeros added per height to the input planes. Default is 0.
  • adjT – extra depth to add to the output image. Default is 0.
  • adjW – extra width to add to the output image. Default is 0.
  • adjH – extra height to add to the output image. Default is 0.
  • nGroup – kernel group number
  • noBias – whether a bias is needed
  • wRegularizer – instance of [[Regularizer]] (eg. L1 or L2 regularization), applied to the input weights matrices
  • bRegularizer – instance of [[Regularizer]] applied to the bias

>>> volumetricFullConvolution = VolumetricFullConvolution(1, 1, 1, 1, 1, 1)
creating: createVolumetricFullConvolution
set_init_method(weight_init_method=None, bias_init_method=None)[source]
class bigdl.nn.layer.VolumetricMaxPooling(k_t, k_w, k_h, d_t, d_w, d_h, pad_t=0, pad_w=0, pad_h=0, bigdl_type='float')[source]

Bases: bigdl.nn.layer.Layer

Applies 3D max-pooling operation in kTxkWxkH regions by step size dTxdWxdH. The number of output features is equal to the number of input planes / dT. The input can optionally be padded with zeros. Padding should be smaller than half of kernel size. That is, padT < kT/2, padW < kW/2 and padH < kH/2

Parameters:
  • k_t – The kernel size in the time dimension
  • k_w – The kernel width
  • k_h – The kernel height
  • d_t – The step in the time dimension
  • d_w – The step in the width dimension
  • d_h – The step in the height dimension
  • pad_t – The padding in the time dimension
  • pad_w – The padding in the width dimension
  • pad_h – The padding in the height dimension
>>> volumetricMaxPooling = VolumetricMaxPooling(5, 5, 5, 1, 1, 1)
creating: createVolumetricMaxPooling

Module contents