bigdl.nn package¶
Submodules¶
bigdl.nn.criterion module¶
-
class
bigdl.nn.criterion.
AbsCriterion
(size_average=True, bigdl_type='float')[source]¶ Bases:
bigdl.nn.criterion.Criterion
Measures the mean absolute value of the element-wise difference between input and target.
>>> absCriterion = AbsCriterion(True)
creating: createAbsCriterion
-
class
bigdl.nn.criterion.
BCECriterion
(weights=None, size_average=True, bigdl_type='float')[source]¶ Bases:
bigdl.nn.criterion.Criterion
Creates a criterion that measures the Binary Cross Entropy between the target and the output
Parameters: - weights – weights for each class
- size_average – whether to average the loss or not
>>> np.random.seed(123)
>>> weights = np.random.uniform(0, 1, (2,)).astype("float32")
>>> bCECriterion = BCECriterion(weights)
creating: createBCECriterion
>>> bCECriterion = BCECriterion()
creating: createBCECriterion
-
class
bigdl.nn.criterion.
ClassNLLCriterion
(weights=None, size_average=True, bigdl_type='float')[source]¶ Bases:
bigdl.nn.criterion.Criterion
The negative log likelihood criterion. It is useful to train a classification problem with n classes. If provided, the optional argument weights should be a 1D Tensor assigning weight to each of the classes. This is particularly useful when you have an unbalanced training set.
The input given through a forward() is expected to contain log-probabilities of each class: input has to be a 1D Tensor of size n. Obtaining log-probabilities in a neural network is easily achieved by adding a LogSoftMax layer as the last layer of your neural network. You may use CrossEntropyCriterion instead if you prefer not to add an extra layer to your network. This criterion expects a class index (from 1 to the number of classes) as target when calling forward(input, target) and backward(input, target).
The loss can be described as:
loss(x, class) = -x[class]
or, when the weights argument is given:
loss(x, class) = -weights[class] * x[class]
Due to the behaviour of the backend code, it is necessary to set size_average to False when calculating losses in non-batch mode.
Note that if the target is -1, the training process will skip this sample. In other words, the forward process will return zero output and the backward process will also return zero gradInput.
By default, the losses are averaged over observations for each minibatch. However, if size_average is set to False, the losses are instead summed for each minibatch.
Parameters: - weights – weights of each class
- size_average – whether to average or not
>>> np.random.seed(123)
>>> weights = np.random.uniform(0, 1, (2,)).astype("float32")
>>> classNLLCriterion = ClassNLLCriterion(weights,True)
creating: createClassNLLCriterion
>>> classNLLCriterion = ClassNLLCriterion()
creating: createClassNLLCriterion
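The loss described for ClassNLLCriterion can be sketched in plain NumPy. This is an illustration, not the BigDL implementation: the function name `class_nll` is hypothetical, and dividing by the sum of the selected class weights when averaging is an assumption borrowed from Torch's ClassNLLCriterion.

```python
import numpy as np

def class_nll(log_probs, targets, weights=None, size_average=True):
    """Negative log likelihood over a batch.

    log_probs: (batch, n_classes) array of log-probabilities
    targets:   (batch,) 1-based class indices; -1 marks a sample to skip
    """
    log_probs = np.atleast_2d(np.asarray(log_probs, dtype=float))
    targets = np.atleast_1d(np.asarray(targets, dtype=int))
    n_classes = log_probs.shape[1]
    w = np.ones(n_classes) if weights is None else np.asarray(weights, dtype=float)

    total, total_weight = 0.0, 0.0
    for row, t in zip(log_probs, targets):
        if t == -1:
            continue                 # skipped sample contributes zero loss
        c = t - 1                    # 1-based target -> 0-based index
        total += -w[c] * row[c]      # loss(x, class) = -weights[class] * x[class]
        total_weight += w[c]
    if size_average and total_weight > 0:
        total /= total_weight        # assumption: weighted mean (Torch convention)
    return total

log_probs = np.log([[0.7, 0.3], [0.2, 0.8]])
print(class_nll(log_probs, [1, 2]))    # mean of -log(0.7) and -log(0.8)
print(class_nll(log_probs, [1, -1]))   # second sample is skipped
```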
-
class
bigdl.nn.criterion.
ClassSimplexCriterion
(n_classes, bigdl_type='float')[source]¶ Bases:
bigdl.nn.criterion.Criterion
ClassSimplexCriterion implements a criterion for classification. It learns an embedding per class, where each class’ embedding is a point on an (N-1)-dimensional simplex, where N is the number of classes.
Parameters: n_classes – the number of classes.
>>> classSimplexCriterion = ClassSimplexCriterion(2)
creating: createClassSimplexCriterion
-
class
bigdl.nn.criterion.
CosineDistanceCriterion
(size_average=True, bigdl_type='float')[source]¶ Bases:
bigdl.nn.criterion.Criterion
Creates a criterion that measures the loss given an input x and a target y: loss = 1 - cos(x, y)
>>> cosineDistanceCriterion = CosineDistanceCriterion(True)
creating: createCosineDistanceCriterion
>>> cosineDistanceCriterion.forward(np.array([1.0, 2.0, 3.0, 4.0, 5.0]),
...                                 np.array([5.0, 4.0, 3.0, 2.0, 1.0]))
0.07272728
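As a rough check of the formula, the cosine distance for the two vectors in the doctest can be computed with NumPy. The helper name `cosine_distance` is hypothetical. The raw value 1 - cos(x, y) is 20/55; the doctest result 0.07272728 equals that value divided by 5, the number of elements. This is an observation from the numbers only and says nothing definite about how the library averages.

```python
import numpy as np

def cosine_distance(x, y):
    """Return 1 - cos(x, y) for two 1D vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cos = x.dot(y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return 1.0 - cos

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([5.0, 4.0, 3.0, 2.0, 1.0])
raw = cosine_distance(x, y)      # 1 - 35/55
print(raw, raw / x.size)         # the second value matches the doctest to float32 precision
```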
-
class
bigdl.nn.criterion.
CosineEmbeddingCriterion
(margin=0.0, size_average=True, bigdl_type='float')[source]¶ Bases:
bigdl.nn.criterion.Criterion
Creates a criterion that measures the loss given an input x = {x1, x2}, a table of two Tensors, and a Tensor label y with values 1 or -1.
Parameters: margin – a number from -1 to 1; a value between 0 and 0.5 is suggested
>>> cosineEmbeddingCriterion = CosineEmbeddingCriterion(1e-5, True)
creating: createCosineEmbeddingCriterion
>>> cosineEmbeddingCriterion.forward([np.array([1.0, 2.0, 3.0, 4.0, 5.0]),
...                                   np.array([5.0, 4.0, 3.0, 2.0, 1.0])],
...                                  [np.ones(5)])
0.0
-
class
bigdl.nn.criterion.
Criterion
(jvalue, bigdl_type, *args)[source]¶ Bases:
bigdl.util.common.JavaValue
Criterions are helpful for training a neural network. Given an input and a target, they compute a gradient according to a given loss function.
-
backward
(input, target)[source]¶ NB: This method is for debugging only; please use optimizer.optimize() in production. Performs a back-propagation step through the criterion, with respect to the given input.
Parameters: - input – ndarray or list of ndarray
- target – ndarray or list of ndarray
Returns: ndarray
-
forward
(input, target)[source]¶ NB: This method is for debugging only; please use optimizer.optimize() in production. Takes an input object and computes the corresponding loss of the criterion, compared with target.
Parameters: - input – ndarray or list of ndarray
- target – ndarray or list of ndarray
Returns: value of loss
-
-
class
bigdl.nn.criterion.
CrossEntropyCriterion
(weights=None, size_average=True, bigdl_type='float')[source]¶ Bases:
bigdl.nn.criterion.Criterion
This criterion combines LogSoftMax and ClassNLLCriterion in one single class.
Parameters: weights – A tensor assigning weight to each of the classes
>>> np.random.seed(123)
>>> weights = np.random.uniform(0, 1, (2,)).astype("float32")
>>> cec = CrossEntropyCriterion(weights)
creating: createCrossEntropyCriterion
>>> cec = CrossEntropyCriterion()
creating: createCrossEntropyCriterion
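The combination of LogSoftMax and ClassNLLCriterion can be illustrated with a small NumPy sketch. The names `log_softmax` and `cross_entropy` are hypothetical helpers, class weights are left out for brevity, and targets are taken to be 1-based as in ClassNLLCriterion.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax along the last axis."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def cross_entropy(logits, targets):
    """Log-softmax followed by the negative log likelihood, averaged over the batch."""
    lp = log_softmax(np.atleast_2d(logits))
    targets = np.atleast_1d(targets).astype(int)
    picked = lp[np.arange(len(targets)), targets - 1]   # 1-based -> 0-based
    return -picked.mean()

# Two equal logits give probability 0.5 per class, so the loss is log(2).
print(cross_entropy([[0.0, 0.0]], [1]))
```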
-
class
bigdl.nn.criterion.
DiceCoefficientCriterion
(size_average=True, epsilon=1.0, bigdl_type='float')[source]¶ Bases:
bigdl.nn.criterion.Criterion
The Dice-Coefficient criterion.
input: Tensor, target: Tensor
return: 1 - 2 * (input intersection target) / (input union target)
>>> diceCoefficientCriterion = DiceCoefficientCriterion(size_average = True, epsilon = 1.0)
creating: createDiceCoefficientCriterion
>>> diceCoefficientCriterion = DiceCoefficientCriterion()
creating: createDiceCoefficientCriterion
-
class
bigdl.nn.criterion.
DistKLDivCriterion
(size_average=True, bigdl_type='float')[source]¶ Bases:
bigdl.nn.criterion.Criterion
The Kullback-Leibler divergence criterion
Parameters: size_average –
>>> distKLDivCriterion = DistKLDivCriterion(True)
creating: createDistKLDivCriterion
-
class
bigdl.nn.criterion.
HingeEmbeddingCriterion
(margin=1.0, size_average=True, bigdl_type='float')[source]¶ Bases:
bigdl.nn.criterion.Criterion
Creates a criterion that measures the loss given an input x which is a 1-dimensional vector and a label y (1 or -1). This is usually used for measuring whether two inputs are similar or dissimilar, e.g. using the L1 pairwise distance, and is typically used for learning nonlinear embeddings or semi-supervised learning.
If x and y are n-dimensional Tensors, the sum operation still operates over all the elements, and divides by n (this can be avoided by setting size_average to False). The margin has a default value of 1, or can be set in the constructor.
>>> hingeEmbeddingCriterion = HingeEmbeddingCriterion(1e-5, True)
creating: createHingeEmbeddingCriterion
-
class
bigdl.nn.criterion.
L1Cost
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.criterion.Criterion
compute L1 norm for input, and sign of input
>>> l1Cost = L1Cost()
creating: createL1Cost
-
class
bigdl.nn.criterion.
L1HingeEmbeddingCriterion
(margin=1.0, bigdl_type='float')[source]¶ Bases:
bigdl.nn.criterion.Criterion
Creates a criterion that measures the loss given an input x = {x1, x2}, a table of two Tensors, and a label y (1 or -1):
Parameters: margin –
>>> l1HingeEmbeddingCriterion = L1HingeEmbeddingCriterion(1e-5)
creating: createL1HingeEmbeddingCriterion
>>> l1HingeEmbeddingCriterion = L1HingeEmbeddingCriterion()
creating: createL1HingeEmbeddingCriterion
>>> input1 = np.array([2.1, -2.2])
>>> input2 = np.array([-0.55, 0.298])
>>> input = [input1, input2]
>>> target = np.array([1.0])
>>> result = l1HingeEmbeddingCriterion.forward(input, target)
>>> (result == 5.148)
True
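The doctest value 5.148 is the L1 distance between the two inputs, which a short NumPy sketch can reproduce. The function name `l1_hinge_embedding` is hypothetical, and the branch for y = -1 follows the Torch definition of this criterion, which is an assumption here because the formula is not given in this page.

```python
import numpy as np

def l1_hinge_embedding(x1, x2, y, margin=1.0):
    """L1 distance for y == 1, hinge on (margin - distance) for y == -1."""
    dist = np.abs(np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)).sum()
    if y == 1:
        return dist
    return max(0.0, margin - dist)   # assumption: Torch-style definition

# |2.1 - (-0.55)| + |-2.2 - 0.298| = 2.65 + 2.498 = 5.148
print(l1_hinge_embedding([2.1, -2.2], [-0.55, 0.298], 1))
```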
-
class
bigdl.nn.criterion.
MSECriterion
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.criterion.Criterion
Creates a criterion that measures the mean squared error between n elements in the input x and output y:
loss(x, y) = 1/n \sum |x_i - y_i|^2
If x and y are d-dimensional Tensors with a total of n elements, the sum operation still operates over all the elements, and divides by n. The two Tensors must have the same number of elements (but their sizes might be different). The division by n can be avoided if one sets the internal variable sizeAverage to false. By default, the losses are averaged over observations for each minibatch. However, if the field sizeAverage is set to false, the losses are instead summed.
>>> mSECriterion = MSECriterion()
creating: createMSECriterion
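A minimal NumPy sketch of the formula above. The helper `mse` and its `size_average` argument are hypothetical and only mirror the description. Both arrays are flattened first, so they may have different shapes as long as the element counts agree.

```python
import numpy as np

def mse(x, y, size_average=True):
    """Mean (or sum) of squared element-wise differences."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    if x.size != y.size:
        raise ValueError("x and y must have the same number of elements")
    sq = (x - y) ** 2
    return sq.mean() if size_average else sq.sum()

# A 2x2 input against a flat target with the same four elements.
print(mse([[1.0, 2.0], [3.0, 4.0]], [1.0, 2.0, 3.0, 6.0]))                      # 1.0
print(mse([[1.0, 2.0], [3.0, 4.0]], [1.0, 2.0, 3.0, 6.0], size_average=False))  # 4.0
```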
-
class
bigdl.nn.criterion.
MarginCriterion
(margin=1.0, size_average=True, bigdl_type='float')[source]¶ Bases:
bigdl.nn.criterion.Criterion
Creates a criterion that optimizes a two-class classification hinge loss (margin-based loss) between input x (a Tensor of dimension 1) and output y.
Parameters: - margin – if unspecified, is by default 1.
- size_average – size average in a mini-batch
>>> marginCriterion = MarginCriterion(1e-5, True)
creating: createMarginCriterion
-
class
bigdl.nn.criterion.
MarginRankingCriterion
(margin=1.0, size_average=True, bigdl_type='float')[source]¶ Bases:
bigdl.nn.criterion.Criterion
Creates a criterion that measures the loss given an input x = {x1, x2}, a table of two Tensors of size 1 (they contain only scalars), and a label y (1 or -1). In batch mode, x is a table of two Tensors of size batchsize, and y is a Tensor of size batchsize containing 1 or -1 for each corresponding pair of elements in the input Tensor. If y == 1 then it is assumed that the first input should be ranked higher (have a larger value) than the second input, and vice versa for y == -1.
Parameters: margin –
>>> marginRankingCriterion = MarginRankingCriterion(1e-5, True)
creating: createMarginRankingCriterion
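The ranking behaviour can be sketched with NumPy. The formula max(0, -y * (x1 - x2) + margin) is taken from the Torch criterion of the same name and is an assumption here, since this page does not state it; `margin_ranking` is a hypothetical helper name.

```python
import numpy as np

def margin_ranking(x1, x2, y, margin=1.0, size_average=True):
    """Penalise pairs whose ordering disagrees with the label y (1 or -1)."""
    x1 = np.atleast_1d(np.asarray(x1, dtype=float))
    x2 = np.atleast_1d(np.asarray(x2, dtype=float))
    y = np.atleast_1d(np.asarray(y, dtype=float))
    losses = np.maximum(0.0, -y * (x1 - x2) + margin)
    return losses.mean() if size_average else losses.sum()

print(margin_ranking(2.0, 1.0, 1))    # first input ranked higher by the margin: no loss
print(margin_ranking(1.0, 2.0, 1))    # wrong order: loss of 2.0
```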
-
class
bigdl.nn.criterion.
MultiCriterion
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.criterion.Criterion
A weighted sum of other criterions, each applied to the same input and target.
>>> multiCriterion = MultiCriterion()
creating: createMultiCriterion
>>> mSECriterion = MSECriterion()
creating: createMSECriterion
>>> multiCriterion = multiCriterion.add(mSECriterion)
>>> multiCriterion = multiCriterion.add(mSECriterion)
-
class
bigdl.nn.criterion.
MultiLabelMarginCriterion
(size_average=True, bigdl_type='float')[source]¶ Bases:
bigdl.nn.criterion.Criterion
Creates a criterion that optimizes a multi-class multi-classification hinge loss (margin-based loss) between input x and output y (which is a Tensor of target class indices).
Parameters: size_average – size average in a mini-batch
>>> multiLabelMarginCriterion = MultiLabelMarginCriterion(True)
creating: createMultiLabelMarginCriterion
-
class
bigdl.nn.criterion.
MultiLabelSoftMarginCriterion
(weights=None, size_average=True, bigdl_type='float')[source]¶ Bases:
bigdl.nn.criterion.Criterion
A multi-label, multi-class criterion based on sigmoid. The loss is:
l(x,y) = - sum_i (y[i] * log(p[i]) + (1 - y[i]) * log(1 - p[i]))
where p[i] = exp(x[i]) / (1 + exp(x[i])), and with weights:
l(x,y) = - sum_i weights[i] * (y[i] * log(p[i]) + (1 - y[i]) * log(1 - p[i]))
>>> np.random.seed(123)
>>> weights = np.random.uniform(0, 1, (2,)).astype("float32")
>>> multiLabelSoftMarginCriterion = MultiLabelSoftMarginCriterion(weights)
creating: createMultiLabelSoftMarginCriterion
>>> multiLabelSoftMarginCriterion = MultiLabelSoftMarginCriterion()
creating: createMultiLabelSoftMarginCriterion
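The sigmoid-based loss above translates directly to NumPy. This sketch uses the hypothetical name `multilabel_soft_margin`, follows the summed form of the formula exactly as written, and ignores the size_average option.

```python
import numpy as np

def multilabel_soft_margin(x, y, weights=None):
    """-sum_i w[i] * (y[i] * log(p[i]) + (1 - y[i]) * log(1 - p[i]))."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    p = 1.0 / (1.0 + np.exp(-x))          # same as exp(x) / (1 + exp(x))
    terms = y * np.log(p) + (1.0 - y) * np.log(1.0 - p)
    if weights is not None:
        terms = np.asarray(weights, dtype=float) * terms
    return -terms.sum()

# With zero logits every p[i] is 0.5, so each label costs log(2).
print(multilabel_soft_margin([0.0, 0.0], [1.0, 0.0]))
```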
-
class
bigdl.nn.criterion.
MultiMarginCriterion
(p=1, weights=None, margin=1.0, size_average=True, bigdl_type='float')[source]¶ Bases:
bigdl.nn.criterion.Criterion
Creates a criterion that optimizes a multi-class classification hinge loss (margin-based loss) between input x and output y (which is a target class index).
Parameters: - p –
- weights –
- margin –
- size_average –
>>> np.random.seed(123)
>>> weights = np.random.uniform(0, 1, (2,)).astype("float32")
>>> multiMarginCriterion = MultiMarginCriterion(1,weights)
creating: createMultiMarginCriterion
>>> multiMarginCriterion = MultiMarginCriterion()
creating: createMultiMarginCriterion
-
class
bigdl.nn.criterion.
ParallelCriterion
(repeat_target=False, bigdl_type='float')[source]¶ Bases:
bigdl.nn.criterion.Criterion
ParallelCriterion is a weighted sum of other criterions, each applied to a different input and target. Set repeat_target=True to share the target among all criterions.
Use the add(criterion[, weight]) method to add a criterion, where weight is a scalar (default 1).
Parameters: repeat_target – Whether to share the target for all criterions.
>>> parallelCriterion = ParallelCriterion(True)
creating: createParallelCriterion
>>> mSECriterion = MSECriterion()
creating: createMSECriterion
>>> parallelCriterion = parallelCriterion.add(mSECriterion)
>>> parallelCriterion = parallelCriterion.add(mSECriterion)
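The weighted-sum behaviour, including target sharing, can be sketched without BigDL. Everything in this example (`parallel_criterion`, the `(loss_fn, weight)` pairs, the `mse` helper) is hypothetical and only illustrates the description above.

```python
import numpy as np

def mse(x, y):
    """Mean squared error between two arrays of equal size."""
    return float(np.mean((np.asarray(x, dtype=float) - np.asarray(y, dtype=float)) ** 2))

def parallel_criterion(criterions, inputs, targets, repeat_target=False):
    """Sum of weight * loss_fn(inputs[i], target_i) over all criterions.

    criterions: list of (loss_fn, weight) pairs, one per input
    targets:    one target per input, or a single shared target when
                repeat_target is True
    """
    total = 0.0
    for i, (loss_fn, weight) in enumerate(criterions):
        target = targets if repeat_target else targets[i]
        total += weight * loss_fn(inputs[i], target)
    return total

inputs = [[1.0, 2.0], [3.0, 5.0]]
shared_target = [1.0, 1.0]
# 1.0 * 0.5 + 0.5 * 10.0 = 5.5
print(parallel_criterion([(mse, 1.0), (mse, 0.5)], inputs, shared_target, repeat_target=True))
```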
-
class
bigdl.nn.criterion.
SmoothL1Criterion
(size_average=True, bigdl_type='float')[source]¶ Bases:
bigdl.nn.criterion.Criterion
Creates a criterion that can be thought of as a smooth version of the AbsCriterion. It uses a squared term if the absolute element-wise error falls below 1. It is less sensitive to outliers than the MSECriterion and in some cases prevents exploding gradients (e.g. see “Fast R-CNN” paper by Ross Girshick).
loss(x, y) = 1/n \sum_i z_i, where
    z_i = 0.5 * (x_i - y_i)^2,   if |x_i - y_i| < 1
    z_i = |x_i - y_i| - 0.5,     otherwise
If x and y are d-dimensional Tensors with a total of n elements, the sum operation still operates over all the elements, and divides by n. The division by n can be avoided by setting size_average to False.
Parameters: size_average – whether to average the loss
>>> smoothL1Criterion = SmoothL1Criterion(True)
creating: createSmoothL1Criterion
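The piecewise definition is easy to check numerically. The following NumPy sketch uses the hypothetical name `smooth_l1` and is meant as an illustration of the formula, not as the library code.

```python
import numpy as np

def smooth_l1(x, y, size_average=True):
    """Quadratic below an absolute error of 1, linear above it."""
    d = np.abs(np.asarray(x, dtype=float) - np.asarray(y, dtype=float))
    z = np.where(d < 1.0, 0.5 * d ** 2, d - 0.5)
    return z.mean() if size_average else z.sum()

# Errors 0.5 and 3.0 give 0.125 and 2.5; their mean is 1.3125.
print(smooth_l1([0.0, 0.0], [0.5, 3.0]))
```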
-
class
bigdl.nn.criterion.
SmoothL1CriterionWithWeights
(sigma, num=0, bigdl_type='float')[source]¶ Bases:
bigdl.nn.criterion.Criterion
A smooth version of the AbsCriterion. It uses a squared term if the absolute element-wise error falls below 1. It is less sensitive to outliers than the MSECriterion and in some cases prevents exploding gradients (e.g. see “Fast R-CNN” paper by Ross Girshick).
d = (x - y) * w_in
loss(x, y, w_in, w_out) = 1/n \sum_i z_i, where
    z_i = 0.5 * (sigma * d_i)^2 * w_out,           if |d_i| < 1 / sigma / sigma
    z_i = (|d_i| - 0.5 / sigma / sigma) * w_out,   otherwise
>>> smoothL1CriterionWithWeights = SmoothL1CriterionWithWeights(1e-5, 1)
creating: createSmoothL1CriterionWithWeights
-
class
bigdl.nn.criterion.
SoftMarginCriterion
(size_average=True, bigdl_type='float')[source]¶ Bases:
bigdl.nn.criterion.Criterion
Creates a criterion that optimizes a two-class classification logistic loss between input x (a Tensor of dimension 1) and output y (which is a tensor containing either 1s or -1s).
loss(x, y) = sum_i (log(1 + exp(-y[i]*x[i]))) / x:nElement()
Parameters: size_average – The normalization by the number of elements in the input can be disabled by setting size_average to False.
>>> softMarginCriterion = SoftMarginCriterion(False)
creating: createSoftMarginCriterion
>>> softMarginCriterion = SoftMarginCriterion()
creating: createSoftMarginCriterion
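The logistic loss above can be written in a few lines of NumPy. The helper name `soft_margin` is hypothetical, and `np.log1p` is used only as a numerically convenient way to compute log(1 + exp(...)).

```python
import numpy as np

def soft_margin(x, y, size_average=True):
    """sum_i log(1 + exp(-y[i] * x[i])), optionally divided by the element count."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    terms = np.log1p(np.exp(-y * x))
    return terms.mean() if size_average else terms.sum()

# With zero inputs every term is log(2), whatever the labels are.
print(soft_margin([0.0, 0.0], [1.0, -1.0]))
```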
-
class
bigdl.nn.criterion.
SoftmaxWithCriterion
(ignore_label=None, normalize_mode='VALID', bigdl_type='float')[source]¶ Bases:
bigdl.nn.criterion.Criterion
Computes the multinomial logistic loss for a one-of-many classification task, passing real-valued predictions through a softmax to get a probability distribution over classes. It should be preferred over separate SoftmaxLayer + MultinomialLogisticLossLayer as its gradient computation is more numerically stable.
Parameters: - ignore_label – (optional) Specify a label value that should be ignored when computing the loss.
- normalize_mode – How to normalize the output loss.
>>> softmaxWithCriterion = SoftmaxWithCriterion()
creating: createSoftmaxWithCriterion
>>> softmaxWithCriterion = SoftmaxWithCriterion(1, "FULL")
creating: createSoftmaxWithCriterion
-
class
bigdl.nn.criterion.
TimeDistributedCriterion
(criterion, size_average=False, bigdl_type='float')[source]¶ Bases:
bigdl.nn.criterion.Criterion
This class is intended to support inputs with 3 or more dimensions. It applies any provided criterion to every temporal slice of an input.
Parameters: - criterion – embedded criterion
- size_average – whether to divide by the sequence length
>>> td = TimeDistributedCriterion(ClassNLLCriterion())
creating: createClassNLLCriterion
creating: createTimeDistributedCriterion
bigdl.nn.initialization_method module¶
-
class
bigdl.nn.initialization_method.
BilinearFiller
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.initialization_method.InitializationMethod
Initialize the weight with coefficients for bilinear interpolation.
A common use case is with the DeconvolutionLayer acting as upsampling. The variable tensor passed in the init function should have 5 dimensions of format [nGroup, nInput, nOutput, kH, kW], and kH should be equal to kW
-
class
bigdl.nn.initialization_method.
ConstInitMethod
(value, bigdl_type='float')[source]¶ Bases:
bigdl.nn.initialization_method.InitializationMethod
Initializer that generates tensors filled with a given constant value.
-
class
bigdl.nn.initialization_method.
InitializationMethod
(jvalue, bigdl_type, *args)[source]¶ Bases:
bigdl.util.common.JavaValue
Initialization method to initialize bias and weight. The init method will be called in Module.reset()
-
class
bigdl.nn.initialization_method.
Ones
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.initialization_method.InitializationMethod
Initializer that generates tensors with ones.
-
class
bigdl.nn.initialization_method.
RandomNormal
(mean, stdv, bigdl_type='float')[source]¶ Bases:
bigdl.nn.initialization_method.InitializationMethod
Initializer that generates tensors with a normal distribution.
-
class
bigdl.nn.initialization_method.
RandomUniform
(upper=None, lower=None, bigdl_type='float')[source]¶ Bases:
bigdl.nn.initialization_method.InitializationMethod
Initializer that generates tensors with a uniform distribution. It draws samples from a uniform distribution within [lower, upper]. If lower and upper are not specified, it draws samples from a uniform distribution within [-limit, limit], where “limit” is “1/sqrt(fan_in)”.
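The default range can be illustrated with NumPy. The function `random_uniform` and its `fan_in` argument are hypothetical stand-ins. In the library the fan-in would come from the layer being initialised, which this sketch does not model.

```python
import numpy as np

def random_uniform(shape, fan_in, lower=None, upper=None, rng=None):
    """Draw from U[lower, upper], or from U[-limit, limit] with limit = 1/sqrt(fan_in)."""
    rng = np.random.default_rng() if rng is None else rng
    if lower is None or upper is None:
        limit = 1.0 / np.sqrt(fan_in)
        lower, upper = -limit, limit
    return rng.uniform(lower, upper, size=shape)

# With fan_in = 4 the default range is [-0.5, 0.5].
w = random_uniform((3, 4), fan_in=4, rng=np.random.default_rng(0))
print(w.min() >= -0.5, w.max() <= 0.5)
```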
-
class
bigdl.nn.initialization_method.
Xavier
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.initialization_method.InitializationMethod
Xavier Initializer. See http://jmlr.org/proceedings/papers/v9/glorot10a/glorot10a.pdf
-
class
bigdl.nn.initialization_method.
Zeros
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.initialization_method.InitializationMethod
Initializer that generates tensors with zeros.
bigdl.nn.layer module¶
-
class
bigdl.nn.layer.
Abs
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
an element-wise abs operation
>>> abs = Abs()
creating: createAbs
-
class
bigdl.nn.layer.
Add
(input_size, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Adds a bias term to the input data.
Parameters: input_size – size of input data
>>> add = Add(1)
creating: createAdd
-
class
bigdl.nn.layer.
AddConstant
(constant_scalar, inplace=False, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
adding a constant
Parameters: - constant_scalar – constant value
- inplace – Can optionally do its operation in-place without using extra state memory
>>> addConstant = AddConstant(1e-5, True)
creating: createAddConstant
-
class
bigdl.nn.layer.
BatchNormalization
(n_output, eps=1e-05, momentum=0.1, affine=True, init_weight=None, init_bias=None, init_grad_weight=None, init_grad_bias=None, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
This layer implements Batch Normalization as described in the paper: “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift” by Sergey Ioffe, Christian Szegedy https://arxiv.org/abs/1502.03167
This implementation is useful for inputs NOT coming from convolution layers. For convolution layers, use nn.SpatialBatchNormalization.
The operation implemented is:
y = ( x - mean(x) ) / standard-deviation(x) * gamma + beta
where gamma and beta are learnable parameters. The learning of gamma and beta is optional.
Parameters: - n_output – output feature map number
- eps – small value used to avoid division by zero
- momentum – momentum for weight update
- affine – affine operation on output or not
>>> batchNormalization = BatchNormalization(1, 1e-5, 1e-5, True)
creating: createBatchNormalization
>>> import numpy as np
>>> init_weight = np.random.randn(2)
>>> init_grad_weight = np.zeros([2])
>>> init_bias = np.zeros([2])
>>> init_grad_bias = np.zeros([2])
>>> batchNormalization = BatchNormalization(2, 1e-5, 1e-5, True, init_weight, init_bias, init_grad_weight, init_grad_bias)
creating: createBatchNormalization
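The normalisation step can be sketched for a 2D batch with NumPy. This is a training-mode illustration only: the name `batch_norm` is hypothetical, running statistics and momentum are left out, and placing eps inside the square root is an assumption about where the constant is applied.

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalise each feature (column) over the batch, then scale and shift."""
    x = np.asarray(x, dtype=float)
    mean = x.mean(axis=0)
    var = x.var(axis=0)                       # biased variance over the batch
    return (x - mean) / np.sqrt(var + eps) * gamma + beta

x = np.array([[1.0, 2.0], [3.0, 6.0]])
print(batch_norm(x))                          # each column becomes roughly [-1, 1]
```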
-
class
bigdl.nn.layer.
BiRecurrent
(merge=None, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Container
Create a Bidirectional recurrent layer
Parameters: merge – merge layer
>>> biRecurrent = BiRecurrent(CAddTable())
creating: createCAddTable
creating: createBiRecurrent
>>> biRecurrent = BiRecurrent()
creating: createBiRecurrent
-
class
bigdl.nn.layer.
Bilinear
(input_size1, input_size2, output_size, bias_res=True, wRegularizer=None, bRegularizer=None, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
A bilinear transformation with sparse inputs. The input tensor given in forward(input) is a table containing both inputs x_1 and x_2, which are tensors of size N x inputDimension1 and N x inputDimension2, respectively.
Parameters: - input_size1 – input dimension of x_1
- input_size2 – input dimension of x_2
- output_size – output dimension
- bias_res – whether to use a bias
- wRegularizer – instance of [[Regularizer]] (e.g. L1 or L2 regularization), applied to the input weights matrices.
- bRegularizer – instance of [[Regularizer]] applied to the bias.
>>> bilinear = Bilinear(1, 1, 1, True, L1Regularizer(0.5))
creating: createL1Regularizer
creating: createBilinear
-
class
bigdl.nn.layer.
BinaryTreeLSTM
(input_size, hidden_size, gate_output=True, with_graph=True, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
This class is an implementation of Binary TreeLSTM (Constituency Tree LSTM).
Parameters: - input_size – input units size
- hidden_size – hidden units size
- gate_output – whether to gate the output
- with_graph – whether to create the LSTMs with [[Graph]]; the default value is True.
>>> treeLSTM = BinaryTreeLSTM(100, 200)
creating: createBinaryTreeLSTM
-
class
bigdl.nn.layer.
Bottle
(module, n_input_dim=2, n_output_dim1=2147483647, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Container
Bottle allows varying dimensionality input to be forwarded through any module that accepts input of nInputDim dimensions, and generates output of nOutputDim dimensions.
Parameters: - module – transform module
- n_input_dim – number of input dimensions (nInputDim) the module accepts
- n_output_dim1 – number of output dimensions (nOutputDim) the module generates
>>> bottle = Bottle(Linear(100,10), 1, 1)
creating: createLinear
creating: createBottle
-
class
bigdl.nn.layer.
CAdd
(size, bRegularizer=None, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
This layer has a bias tensor with the given size. The bias is added element-wise to the input tensor. If the number of elements in the bias tensor matches the input tensor, a simple element-wise addition is done. Otherwise, the bias is expanded to the same size as the input. Expanding means repeating along unmatched singleton dimensions (if an unmatched dimension isn’t a singleton dimension, an error is reported). If the input is a batch, a singleton dimension is added as the first dimension before the expansion.
Parameters: - size – the size of the bias
- bRegularizer – instance of [[Regularizer]] applied to the bias.
>>> cAdd = CAdd([1,2])
creating: createCAdd
-
class
bigdl.nn.layer.
CAddTable
(inplace=False, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Merges the input tensors in the input table by adding them together element-wise. The input table is actually an array of tensors with the same size.
Parameters: inplace – reuse the input memory
>>> cAddTable = CAddTable(True)
creating: createCAddTable
-
class
bigdl.nn.layer.
CDivTable
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Takes a table with two Tensors and returns the component-wise division between them.
>>> cDivTable = CDivTable()
creating: createCDivTable
-
class
bigdl.nn.layer.
CMaxTable
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Takes a table of Tensors and outputs the max of all of them.
>>> cMaxTable = CMaxTable()
creating: createCMaxTable
-
class
bigdl.nn.layer.
CMinTable
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Takes a table of Tensors and outputs the min of all of them.
>>> cMinTable = CMinTable()
creating: createCMinTable
-
class
bigdl.nn.layer.
CMul
(size, wRegularizer=None, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Applies a component-wise multiplication to the incoming data
Parameters: size – size of the data
>>> cMul = CMul([1,2])
creating: createCMul
-
class
bigdl.nn.layer.
CMulTable
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Takes a table of Tensors and outputs the multiplication of all of them.
>>> cMulTable = CMulTable()
creating: createCMulTable
-
class
bigdl.nn.layer.
CSubTable
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Takes a table with two Tensors and returns the component-wise subtraction between them.
>>> cSubTable = CSubTable()
creating: createCSubTable
-
class
bigdl.nn.layer.
Clamp
(min, max, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Clamps all elements into the range [min_value, max_value]. Output is identical to input in the range, otherwise elements less than min_value (or greater than max_value) are saturated to min_value (or max_value).
Parameters: - min –
- max –
>>> clamp = Clamp(1, 3)
creating: createClamp
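The saturation behaviour matches what NumPy's `np.clip` does, which makes for a quick illustration. The wrapper name `clamp` is hypothetical.

```python
import numpy as np

def clamp(x, min_value, max_value):
    """Saturate elements outside [min_value, max_value]; leave the rest unchanged."""
    return np.clip(np.asarray(x, dtype=float), min_value, max_value)

print(clamp([0.0, 2.0, 5.0], 1, 3))   # values below 1 become 1, values above 3 become 3
```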
-
class
bigdl.nn.layer.
Concat
(dimension, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Container
Concat concatenates the output of one layer of “parallel” modules along the provided dimension: they take the same inputs, and their output is concatenated.
                +-----------+
           +----> module1 -----+
           |    |           |  |
input -----+----> module2 -----+----> output
           |    |           |  |
           +----> module3 -----+
                +-----------+
Parameters: dimension – dimension
>>> concat = Concat(2)
creating: createConcat
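The data flow in the diagram can be mimicked with plain functions standing in for modules. The helper `concat_forward` is hypothetical, and it uses NumPy's 0-based `axis`, which may not match the numbering BigDL expects for dimension.

```python
import numpy as np

def concat_forward(modules, x, axis):
    """Feed the same input to every module and concatenate the outputs along axis."""
    outputs = [module(x) for module in modules]
    return np.concatenate(outputs, axis=axis)

x = np.arange(6.0).reshape(2, 3)
modules = [lambda t: t * 2.0, lambda t: t + 1.0]   # stand-ins for parallel layers
out = concat_forward(modules, x, axis=1)
print(out.shape)                                    # (2, 6): two (2, 3) outputs side by side
```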
-
class
bigdl.nn.layer.
ConcatTable
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Container
ConcatTable is a container module like Concat. It applies an input to each member module; the input can be a tensor or a table.
ConcatTable usually works with CAddTable and CMulTable to implement element-wise add/multiply on the outputs of two modules.
>>> concatTable = ConcatTable()
creating: createConcatTable
-
class
bigdl.nn.layer.
Container
(jvalue, bigdl_type, *args)[source]¶ Bases:
bigdl.nn.layer.Layer
Container is a sub-class of Model that declares methods defined in all containers. A container usually contains some other modules, which can be added through the “add” method.
-
class
bigdl.nn.layer.
Contiguous
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
used to make input, grad_output both contiguous
>>> contiguous = Contiguous()
creating: createContiguous
-
class
bigdl.nn.layer.
ConvLSTMPeephole
(input_size, output_size, kernel_i, kernel_c, stride, wRegularizer=None, uRegularizer=None, bRegularizer=None, with_peephole=True, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Convolution Long Short Term Memory architecture with peephole.
Ref. A.: https://arxiv.org/abs/1506.04214 (blueprint for this module)
Parameters: - input_size – number of input planes in the image given into forward()
- output_size – number of output planes the convolution layer will produce
- kernel_i – convolutional filter size to convolve input
- kernel_c – convolutional filter size to convolve cell
- stride – the step of the convolution
- wRegularizer – instance of [[Regularizer]] (e.g. L1 or L2 regularization), applied to the input weights matrices
- uRegularizer – instance of [[Regularizer]] (e.g. L1 or L2 regularization), applied to the recurrent weights matrices
- bRegularizer – instance of [[Regularizer]] applied to the bias.
- with_peephole – whether to use the last cell status to control a gate.
>>> convlstm = ConvLSTMPeephole(4, 3, 3, 3, 1, L1Regularizer(0.5), L1Regularizer(0.5), L1Regularizer(0.5))
creating: createL1Regularizer
creating: createL1Regularizer
creating: createL1Regularizer
creating: createConvLSTMPeephole
-
class
bigdl.nn.layer.
Cosine
(input_size, output_size, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Cosine calculates the cosine similarity of the input to k mean centers. The input given in forward(input) must be either a vector (1D tensor) or matrix (2D tensor). If the input is a vector, it must have the size of inputSize. If it is a matrix, then each row is assumed to be an input sample of given batch (the number of rows means the batch size and the number of columns should be equal to the inputSize).
Parameters: - input_size – the size of each input sample
- output_size – the size of the module output of each sample
>>> cosine = Cosine(2,3)
creating: createCosine
-
class
bigdl.nn.layer.
CosineDistance
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Outputs the cosine distance between inputs
>>> cosineDistance = CosineDistance()
creating: createCosineDistance
-
class
bigdl.nn.layer.
DotProduct
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
This is a simple table layer which takes a table of two tensors as input and calculates the dot product between them as output.
>>> dotProduct = DotProduct()
creating: createDotProduct
-
class
bigdl.nn.layer.
Dropout
(init_p=0.5, inplace=False, scale=True, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Dropout masks (sets to zero) parts of the input using a Bernoulli distribution. Each input element has a probability init_p of being dropped. If scale is set, the outputs are scaled by a factor of 1/(1-init_p) during training. During evaluation, the output is the same as the input.
Parameters: - init_p – probability to be dropped
- inplace – in-place mode
- scale – whether to scale by a factor of 1/(1-init_p)
>>> dropout = Dropout(0.4)
creating: createDropout
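The masking and rescaling can be sketched with NumPy. The function `dropout` and its `training` flag are hypothetical. The sketch only illustrates the behaviour described above and does not model in-place operation.

```python
import numpy as np

def dropout(x, p=0.5, scale=True, training=True, rng=None):
    """Zero each element with probability p; optionally rescale the survivors."""
    x = np.asarray(x, dtype=float)
    if not training:
        return x                              # evaluation: output equals input
    rng = np.random.default_rng() if rng is None else rng
    keep = rng.random(x.shape) >= p           # True where the element survives
    out = x * keep
    if scale:
        out = out / (1.0 - p)                 # keeps the expected value unchanged
    return out

x = np.ones(8)
print(dropout(x, p=0.4, rng=np.random.default_rng(0)))   # entries are 0 or 1/(1-0.4)
print(dropout(x, p=0.4, training=False))                  # unchanged
```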
-
class
bigdl.nn.layer.
ELU
(alpha=1.0, inplace=False, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Exponential Linear Unit (ELU) activation. See D-A Clevert, Thomas Unterthiner, Sepp Hochreiter: “Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)” [http://arxiv.org/pdf/1511.07289.pdf]
>>> eLU = ELU(1e-5, True)
creating: createELU
-
class
bigdl.nn.layer.
Echo
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
This module is for debugging purposes; it can print the activation and gradient in your model topology.
>>> echo = Echo()
creating: createEcho
-
class
bigdl.nn.layer.
Euclidean
(input_size, output_size, fast_backward=True, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Outputs the Euclidean distance of the input to outputSize centers
Parameters: - input_size – inputSize
- output_size – outputSize
- T – Numeric type. Only float/double are supported now
>>> euclidean = Euclidean(1, 1, True)
creating: createEuclidean
-
class
bigdl.nn.layer.
Exp
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Applies element-wise exp to input tensor.
>>> exp = Exp()
creating: createExp
-
class
bigdl.nn.layer.
FlattenTable
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
This is a table layer which takes an arbitrarily deep table of Tensors (potentially nested) as input and produces a table of Tensors without any nested table.
>>> flattenTable = FlattenTable()
creating: createFlattenTable
-
class
bigdl.nn.layer.
GRU
(input_size, hidden_size, p=0.0, wRegularizer=None, uRegularizer=None, bRegularizer=None, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Gated Recurrent Units architecture. The first input in sequence uses zero value for cell and hidden state.
Parameters: - input_size – the size of each input vector
- hidden_size – hidden unit size in the GRU
- p – [[Dropout]] probability. For more details about RNN dropouts, please refer to [RnnDrop: A Novel Dropout for RNNs in ASR](http://www.stat.berkeley.edu/~tsmoon/files/Conference/asru2015.pdf) and [A Theoretically Grounded Application of Dropout in Recurrent Neural Networks](https://arxiv.org/pdf/1512.05287.pdf)
- wRegularizer – instance of [[Regularizer]] (e.g. L1 or L2 regularization), applied to the input weights matrices.
- uRegularizer – instance of [[Regularizer]] (e.g. L1 or L2 regularization), applied to the recurrent weights matrices.
- bRegularizer – instance of [[Regularizer]], applied to the bias.
>>> gru = GRU(4, 3, 0.5, L1Regularizer(0.5), L1Regularizer(0.5), L1Regularizer(0.5)) creating: createL1Regularizer creating: createL1Regularizer creating: createL1Regularizer creating: createGRU
-
class
bigdl.nn.layer.
GradientReversal
(the_lambda=1.0, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
It is a simple module that preserves the input in the forward pass, but takes the gradient from the subsequent layer, multiplies it by -lambda and passes it to the preceding layer. This can be used to maximise an objective function whilst using gradient descent, as described in [“Domain-Adversarial Training of Neural Networks” (http://arxiv.org/abs/1505.07818)]
Parameters: lambda – hyper-parameter lambda, which can be set dynamically during training
>>> gradientReversal = GradientReversal(1e-5) creating: createGradientReversal
>>> gradientReversal = GradientReversal() creating: createGradientReversal
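The forward and backward rules described above amount to an identity followed by a sign-flipped, scaled gradient. The class below is a hypothetical NumPy stand-in that shows only that arithmetic; it is not the BigDL layer.

```python
import numpy as np

class GradientReversalSketch:
    """Identity in forward; gradient multiplied by -lambda in backward."""

    def __init__(self, the_lambda=1.0):
        self.the_lambda = the_lambda

    def forward(self, x):
        return x  # the input passes through unchanged

    def backward(self, x, grad_output):
        # reverse (and scale) the gradient coming from the subsequent layer
        return -self.the_lambda * grad_output

grl = GradientReversalSketch(0.5)
x = np.array([1.0, -2.0])
y = grl.forward(x)
g = grl.backward(x, np.array([4.0, 6.0]))
```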
-
class
bigdl.nn.layer.
HardShrink
(the_lambda=0.5, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
This is a transfer layer which applies the hard shrinkage function element-wise to the input Tensor. The parameter lambda is set to 0.5 by default.
f(x) = x, if x > lambda
f(x) = x, if x < -lambda
f(x) = 0, otherwise
Parameters: the_lambda – a threshold value whose default value is 0.5
>>> hardShrink = HardShrink(1e-5) creating: createHardShrink
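The piecewise definition above maps directly onto a single NumPy expression. The helper name `hard_shrink` is hypothetical and the code is only a sketch of the formula.

```python
import numpy as np

def hard_shrink(x, the_lambda=0.5):
    """Keep x where |x| > lambda, otherwise output 0."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) > the_lambda, x, 0.0)

shrunk = hard_shrink([-1.0, -0.5, 0.2, 0.5, 0.8])
```

Values exactly at the threshold (here -0.5 and 0.5) fall into the "otherwise" branch and become 0.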
-
class
bigdl.nn.layer.
HardTanh
(min_value=-1.0, max_value=1.0, inplace=False, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Applies HardTanh to each element of the input. HardTanh is defined as:
f(x) = maxValue, if x > maxValue
f(x) = minValue, if x < minValue
f(x) = x, otherwise
Parameters: - min_value – minValue in f(x), default is -1.
- max_value – maxValue in f(x), default is 1.
- inplace – whether to enable in-place mode.
>>> hardTanh = HardTanh(1e-5, 1e5, True) creating: createHardTanh >>> hardTanh = HardTanh() creating: createHardTanh
-
class
bigdl.nn.layer.
Identity
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Identity simply returns its input as output. It is useful inside a parallel container to pass the original input through.
>>> identity = Identity() creating: createIdentity
-
class
bigdl.nn.layer.
Index
(dimension, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Applies the Tensor index operation along the given dimension.
Parameters: dimension – the dimension to be indexed
>>> index = Index(1) creating: createIndex
-
class
bigdl.nn.layer.
InferReshape
(size, batch_mode=False, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Reshape the input tensor with automatic size inference support. Positive numbers in the size argument are used to reshape the input to the corresponding dimension size. There are also two special values allowed in size:
a. 0 means keep the corresponding dimension size of the input unchanged, i.e., if the 1st dimension size of the input is 2, the 1st dimension size of the output will be set to 2 as well.
b. -1 means infer this dimension size from the other dimensions. This dimension size is calculated by keeping the number of output elements consistent with the input. Only one -1 is allowed in size.
For example, an input tensor with size (4, 5, 6, 7) passed through InferReshape([4, 0, 3, -1]) gives an output tensor with size (4, 5, 3, 14): the 1st and 3rd dims are set to the given sizes, the 2nd dim is kept unchanged, and the last dim is inferred as 14.
Parameters: - size – the target tensor size
- batch_mode – whether in batch mode
>>> inferReshape = InferReshape([4, 0, 3, -1], False) creating: createInferReshape
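The size-resolution rules for 0 and -1 can be sketched as a small shape calculation. The function `infer_reshape_size` is a hypothetical helper written for illustration; it ignores batch_mode and is not taken from BigDL.

```python
import numpy as np

def infer_reshape_size(input_shape, size):
    """Resolve 0 (copy the input dim) and -1 (infer) entries of `size`."""
    if list(size).count(-1) > 1:
        raise ValueError("only one -1 is allowed in size")
    # 0 keeps the input's dimension size at the same position
    resolved = [input_shape[i] if s == 0 else s for i, s in enumerate(size)]
    total = int(np.prod(input_shape))
    known = int(np.prod([s for s in resolved if s != -1]))
    if -1 in resolved:
        if total % known != 0:
            raise ValueError("cannot infer the -1 dimension")
        # keep the number of elements consistent with the input
        resolved[resolved.index(-1)] = total // known
    elif known != total:
        raise ValueError("element count does not match the input")
    return tuple(resolved)

out_shape = infer_reshape_size((4, 5, 6, 7), [4, 0, 3, -1])
reshaped = np.zeros((4, 5, 6, 7)).reshape(out_shape)
```

This reproduces the worked example: 4 * 5 * 6 * 7 = 840 elements, and 840 / (4 * 5 * 3) = 14 for the inferred dimension.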
-
class
bigdl.nn.layer.
Input
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Node
Input layer does nothing to the input tensors; it just passes them through. It is used as input to the Graph container when the first layer of the graph container accepts multiple tensors as inputs.
Each input node of the graph container should accept one tensor as input. If you want a module accepting multiple tensors as input, you should add Input modules before it and connect the outputs of the Input nodes to it.
Please note that the return value is not a layer but a Node containing the input layer.
>>> input = Input() creating: createInput
-
class
bigdl.nn.layer.
JoinTable
(dimension, n_input_dims, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
It is a table module which takes a table of Tensors as input and outputs a Tensor by joining them together along the given dimension.
The input to this layer is expected to be a tensor, or a batch of tensors; when using mini-batch, a batch of sample tensors will be passed to the layer and the user needs to specify the number of dimensions of each sample tensor in the batch using nInputDims.
Parameters: - dimension – the dimension along which to join
- nInputDims – the number of dimensions that this module will receive. If it is more than the dimension of input tensors, the first dimension would be considered as batch size
>>> joinTable = JoinTable(1, 1) creating: createJoinTable
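Joining a table of tensors along one dimension corresponds to a concatenation. The sketch below assumes that `dimension` is 1-based, as in Torch-style indexing, which the docstring does not state; `join_table` is a hypothetical helper and the batch handling via nInputDims is not modelled.

```python
import numpy as np

def join_table(tensors, dimension):
    """Concatenate tensors along `dimension` (assumed 1-based here)."""
    return np.concatenate(tensors, axis=dimension - 1)

a = np.zeros((2, 3))
b = np.ones((4, 3))
joined = join_table([a, b], 1)
```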
-
class
bigdl.nn.layer.
L1Penalty
(l1weight, size_average=False, provide_output=True, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Adds an L1 penalty to an input (for sparsity). L1Penalty is an inline module that, in its forward propagation, copies the input Tensor directly to the output, computes an L1 loss of the latent state (input), and stores it in the module’s loss field. During backward propagation: gradInput = gradOutput + gradLoss.
Parameters: - l1weight –
- sizeAverage –
- provideOutput –
>>> l1Penalty = L1Penalty(1, True, True) creating: createL1Penalty
-
class
bigdl.nn.layer.
LSTM
(input_size, hidden_size, p=0.0, wRegularizer=None, uRegularizer=None, bRegularizer=None, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Long Short Term Memory architecture.
Ref. A.: http://arxiv.org/pdf/1303.5778v1 (blueprint for this module)
Parameters: - inputSize – the size of each input vector
- hiddenSize – hidden unit size in the LSTM
- p – [[Dropout]] probability. For more details about RNN dropouts, please refer to [RnnDrop: A Novel Dropout for RNNs in ASR](http://www.stat.berkeley.edu/~tsmoon/files/Conference/asru2015.pdf) and [A Theoretically Grounded Application of Dropout in Recurrent Neural Networks](https://arxiv.org/pdf/1512.05287.pdf)
- wRegularizer – instance of [[Regularizer]] (e.g. L1 or L2 regularization), applied to the input weights matrices.
- uRegularizer – instance of [[Regularizer]] (e.g. L1 or L2 regularization), applied to the recurrent weights matrices.
- bRegularizer – instance of [[Regularizer]], applied to the bias.
>>> lstm = LSTM(4, 3, 0.5, L1Regularizer(0.5), L1Regularizer(0.5), L1Regularizer(0.5)) creating: createL1Regularizer creating: createL1Regularizer creating: createL1Regularizer creating: createLSTM
-
class
bigdl.nn.layer.
LSTMPeephole
(input_size=4, hidden_size=3, p=0.0, wRegularizer=None, uRegularizer=None, bRegularizer=None, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Long Short Term Memory architecture with peephole.
Ref. A.: http://arxiv.org/pdf/1303.5778v1 (blueprint for this module)
Parameters: - input_size – the size of each input vector
- hidden_size – hidden unit size in the LSTM
- p – [[Dropout]] probability. For more details about RNN dropouts, please refer to [RnnDrop: A Novel Dropout for RNNs in ASR](http://www.stat.berkeley.edu/~tsmoon/files/Conference/asru2015.pdf) and [A Theoretically Grounded Application of Dropout in Recurrent Neural Networks](https://arxiv.org/pdf/1512.05287.pdf)
- wRegularizer – instance of [[Regularizer]] (e.g. L1 or L2 regularization), applied to the input weights matrices.
- uRegularizer – instance of [[Regularizer]] (e.g. L1 or L2 regularization), applied to the recurrent weights matrices.
- bRegularizer – instance of [[Regularizer]], applied to the bias.
>>> lstm = LSTMPeephole(4, 3, 0.5, L1Regularizer(0.5), L1Regularizer(0.5), L1Regularizer(0.5)) creating: createL1Regularizer creating: createL1Regularizer creating: createL1Regularizer creating: createLSTMPeephole
-
class
bigdl.nn.layer.
Layer
(jvalue, bigdl_type, *args)[source]¶ Bases:
bigdl.util.common.JavaValue
Layer is the basic component of a neural network and it’s also the base class of layers. Layer can connect to others to construct a complex neural network.
-
backward
(input, grad_output)[source]¶ NB: It’s for debug only, please use optimizer.optimize() in production. Performs a back-propagation step through the module, with respect to the given input. In general this method makes the assumption forward(input) has been called before, with the same input. This is necessary for optimization reasons. If you do not respect this rule, backward() will compute incorrect gradients.
Parameters: - input – ndarray or list of ndarray
- grad_output – ndarray or list of ndarray
Returns: ndarray or list of ndarray
-
static
check_input
(input)[source]¶ Parameters: input – ndarray or list of ndarray Returns: (list of JTensor, isTable)
-
forward
(input)[source]¶ NB: It’s for debug only, please use optimizer.optimize() in production. Takes an input object, and computes the corresponding output of the module
Parameters: input – ndarray or list of ndarray Returns: ndarray or list of ndarray
-
get_weights
()[source]¶ Get weights for this layer
Returns: list of numpy arrays which represent weight and bias
-
classmethod
of
(jvalue, bigdl_type='float')[source]¶ Create a Python Layer based on the given Java value.
Parameters: jvalue – Java object created by Py4j Returns: A Python Layer
-
parameters
()[source]¶ Get the model parameters, which contain: weight, bias, gradBias, gradWeight
Returns: dict(layername -> dict(parametername -> ndarray))
-
predict
(data_rdd)[source]¶ Model inference based on the given data. The returned result is an RDD, so you need to invoke collect() to trigger the computation.
Parameters: data_rdd – the data to predict on. Returns: An RDD representing the prediction result.
-
predict_class
(data_rdd)[source]¶ Model prediction that returns the predicted label.
Parameters: data_rdd – the data to predict on. Returns: An RDD representing the predicted label.
-
save_tensorflow
(inputs, path, byte_order='little_endian', data_format='nhwc')[source]¶ Save a model to protobuf files so that it can be used in tensorflow inference.
When saving the model, placeholders will be added to the tf model as input nodes, so you need to pass in the names and shapes of the placeholders; a BigDL model doesn’t have such information. The order of the placeholder information should be the same as the inputs of the graph model.
Parameters: - inputs – placeholder information, should be an array of tuples (input_name, shape) where ‘input_name’ is a string and shape is an array of integers
- path – the path to be saved to
- byte_order – model byte order
- data_format – model data format, should be “nhwc” or “nchw”
-
setBRegularizer
(bRegularizer)[source]¶ Set the bias regularizer.
Parameters: bRegularizer – bias regularizer
-
setWRegularizer
(wRegularizer)[source]¶ Set the weight regularizer.
Parameters: wRegularizer – weight regularizer
-
set_name
(name)[source]¶ Give this model a name. If the user doesn’t set one, a name consisting of the class name and a UUID is generated.
-
set_seed
(seed=123)[source]¶ Set the random seed used to initialize the weights of this model.
Parameters: seed – random seed Returns: Model itself.
-
set_weights
(weights)[source]¶ Set weights for this layer
Parameters: weights – a list of numpy arrays which represent weight and bias
Returns:
>>> linear = Linear(3,2)
creating: createLinear
>>> linear.set_weights([np.array([[1,2,3],[4,5,6]]), np.array([7,8])])
>>> weights = linear.get_weights()
>>> weights[0].shape == (2,3)
True
>>> weights[0][0]
array([ 1., 2., 3.], dtype=float32)
>>> weights[1]
array([ 7., 8.], dtype=float32)
>>> relu = ReLU()
creating: createReLU
>>> from py4j.protocol import Py4JJavaError
>>> try:
...     relu.set_weights([np.array([[1,2,3],[4,5,6]]), np.array([7,8])])
... except Py4JJavaError as err:
...     print(err.java_exception)
...
java.lang.IllegalArgumentException: requirement failed: this layer does not have weight/bias
>>> relu.get_weights()
The layer does not have weight/bias
>>> add = Add(2)
creating: createAdd
>>> try:
...     add.set_weights([np.array([7,8]), np.array([1,2])])
... except Py4JJavaError as err:
...     print(err.java_exception)
...
java.lang.IllegalArgumentException: requirement failed: the number of input weight/bias is not consistant with number of weight/bias of this layer
>>> cAdd = CAdd([4, 1])
creating: createCAdd
>>> cAdd.set_weights(np.ones([4, 1]))
>>> (cAdd.get_weights()[0] == np.ones([4, 1])).all()
True
-
test
(val_rdd, batch_size, val_methods)[source]¶ A method to benchmark the model quality.
Parameters: - val_rdd – the input data
- batch_size – batch size
- val_methods – a list of validation methods, e.g. Top1Accuracy, Top5Accuracy and Loss.
Returns:
-
-
class
bigdl.nn.layer.
LeakyReLU
(negval=0.01, inplace=False, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
It is a transfer module that applies LeakyReLU, whose parameter negval sets the slope of the negative part. LeakyReLU is defined as: f(x) = max(0, x) + negval * min(0, x)
Parameters: - negval – sets the slope of the negative part
- inplace – if it is true, the operation is done in-place without using extra state memory
>>> leakyReLU = LeakyReLU(1e-5, True) creating: createLeakyReLU
-
class
bigdl.nn.layer.
Linear
(input_size, output_size, with_bias=True, wRegularizer=None, bRegularizer=None, init_weight=None, init_bias=None, init_grad_weight=None, init_grad_bias=None, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
The [[Linear]] module applies a linear transformation to the input data, i.e. y = Wx + b. The input given in forward(input) must be either a vector (1D tensor) or matrix (2D tensor). If the input is a vector, it must have the size of inputSize. If it is a matrix, then each row is assumed to be an input sample of the given batch (the number of rows is the batch size and the number of columns should be equal to inputSize).
Parameters: - input_size – the size of each input sample
- output_size – the size of the module output of each sample
- wRegularizer – instance of [[Regularizer]] (e.g. L1 or L2 regularization), applied to the input weights matrices.
- bRegularizer – instance of [[Regularizer]], applied to the bias.
- init_weight – the optional initial value for the weight
- init_bias – the optional initial value for the bias
- init_grad_weight – the optional initial value for the grad_weight
- init_grad_bias – the optional initial value for the grad_bias
>>> linear = Linear(100, 10, True, L1Regularizer(0.5), L1Regularizer(0.5))
creating: createL1Regularizer
creating: createL1Regularizer
creating: createLinear
>>> import numpy as np
>>> init_weight = np.random.randn(10, 100)
>>> init_bias = np.random.randn(10)
>>> init_grad_weight = np.zeros([10, 100])
>>> init_grad_bias = np.zeros([10])
>>> linear = Linear(100, 10, True, L1Regularizer(0.5), L1Regularizer(0.5), init_weight, init_bias, init_grad_weight, init_grad_bias)
creating: createL1Regularizer
creating: createL1Regularizer
creating: createLinear
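The transformation y = Wx + b and the vector/batch input rule can be checked with a NumPy sketch. The weight is given shape (output_size, input_size), matching the init_weight used in the example above; the function `linear_forward` is a hypothetical illustration, not the BigDL forward pass.

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, output_size = 100, 10
weight = rng.standard_normal((output_size, input_size))  # same shape as init_weight
bias = rng.standard_normal(output_size)

def linear_forward(x, weight, bias):
    """y = Wx + b for a vector (input_size,) or a batch (batch, input_size)."""
    return x @ weight.T + bias

single = linear_forward(rng.standard_normal(input_size), weight, bias)
batch = linear_forward(rng.standard_normal((32, input_size)), weight, bias)
```

A vector input yields a vector of size output_size, and a matrix with 32 rows yields one output row per sample.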
-
class
bigdl.nn.layer.
Log
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Applies the log function element-wise to the input Tensor, thus outputting a Tensor of the same dimension.
>>> log = Log() creating: createLog
-
class
bigdl.nn.layer.
LogSigmoid
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
This class is a transform layer corresponding to the log-sigmoid function: f(x) = log(1 / (1 + e^(-x)))
>>> logSigmoid = LogSigmoid() creating: createLogSigmoid
-
class
bigdl.nn.layer.
LogSoftMax
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Applies the LogSoftMax function to an n-dimensional input Tensor. LogSoftmax is defined as: f_i(x) = log(exp(x_i) / a), where a = sum_j[exp(x_j)].
>>> logSoftMax = LogSoftMax() creating: createLogSoftMax
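The definition above can be evaluated in NumPy. The sketch subtracts the maximum before exponentiating, a common numerical-stability step that is an addition here and not something the docstring describes; `log_softmax` is a hypothetical helper.

```python
import numpy as np

def log_softmax(x):
    """log(exp(x_i) / sum_j exp(x_j)) along the last axis."""
    x = np.asarray(x, dtype=float)
    # shifting by the max leaves the result unchanged but avoids overflow
    shifted = x - np.max(x, axis=-1, keepdims=True)
    return shifted - np.log(np.sum(np.exp(shifted), axis=-1, keepdims=True))

log_probs = log_softmax([1.0, 2.0, 3.0])
```

Exponentiating `log_probs` gives probabilities that sum to 1, which is the form ClassNLLCriterion expects as input.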
-
class
bigdl.nn.layer.
LookupTable
(n_index, n_output, padding_value=0.0, max_norm=1.7976931348623157e+308, norm_type=2.0, should_scale_grad_by_freq=False, wRegularizer=None, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
A convolution of width 1, commonly used for word embeddings.
Parameters: wRegularizer – instance of [[Regularizer]] (e.g. L1 or L2 regularization), applied to the input weights matrices.
>>> lookupTable = LookupTable(1, 1, 1e-5, 1e-5, 1e-5, True, L1Regularizer(0.5)) creating: createL1Regularizer creating: createLookupTable
-
class
bigdl.nn.layer.
MM
(trans_a=False, trans_b=False, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Module to perform matrix multiplication on two mini-batch inputs, producing a mini-batch.
Parameters: - trans_a – whether to transpose the first input matrix
- trans_b – whether to transpose the second input matrix
>>> mM = MM(True, True) creating: createMM
-
class
bigdl.nn.layer.
MV
(trans=False, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
It is a module to perform matrix-vector multiplication on two mini-batch inputs, producing a mini-batch.
Parameters: trans – whether to transpose the matrix before multiplication
>>> mV = MV(True) creating: createMV
-
class
bigdl.nn.layer.
MapTable
(module=None, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Container
This class is a container for a single module which will be applied to all input elements. The member module is cloned as necessary to process all input elements.
>>> mapTable = MapTable(Linear(100,10)) creating: createLinear creating: createMapTable
-
class
bigdl.nn.layer.
MaskedSelect
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Performs a torch.MaskedSelect on a Tensor. The mask is supplied as a tabular argument with the input on the forward and backward passes.
>>> maskedSelect = MaskedSelect() creating: createMaskedSelect
-
class
bigdl.nn.layer.
Max
(dim, num_input_dims=-2147483648, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Applies a max operation over dimension dim
Parameters: - dim – max along this dimension
- num_input_dims – Optional. In batch mode, set to the inputDims.
>>> max = Max(1) creating: createMax
-
class
bigdl.nn.layer.
Mean
(dimension=1, n_input_dims=-1, squeeze=True, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
It is a simple layer which applies a mean operation over the given dimension. When nInputDims is provided, the input will be considered as batches, and the mean operation will be applied in (dimension + 1). The input to this layer is expected to be a tensor, or a batch of tensors; when using mini-batch, a batch of sample tensors will be passed to the layer and the user needs to specify the number of dimensions of each sample tensor in the batch using nInputDims.
Parameters: - dimension – the dimension over which the mean is applied
- n_input_dims – the number of dimensions that this module will receive. If it is more than the dimension of input tensors, the first dimension would be considered as batch size
- squeeze – default is true, which will squeeze the reduced dimension; set it to false to keep the reduced dimension
>>> mean = Mean(1, 1, True) creating: createMean
-
class
bigdl.nn.layer.
Min
(dim=1, num_input_dims=-2147483648, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Applies a min operation over dimension dim.
Parameters: - dim – min along this dimension
- num_input_dims – Optional. In batch mode, set to the input_dim.
>>> min = Min(1) creating: createMin
-
class
bigdl.nn.layer.
MixtureTable
(dim=2147483647, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Creates a module that takes a table {gater, experts} as input and outputs the mixture of experts (a Tensor or table of Tensors) using a gater Tensor. When dim is provided, it specifies the dimension of the experts Tensor that will be interpolated (or mixed). Otherwise, the experts should take the form of a table of Tensors. This Module works for experts of dimension 1D or more, and for a 1D or 2D gater, i.e. for single examples or mini-batches.
>>> mixtureTable = MixtureTable() creating: createMixtureTable >>> mixtureTable = MixtureTable(10) creating: createMixtureTable
-
class
bigdl.nn.layer.
Model
(inputs, outputs, bigdl_type='float', byte_order='little_endian', model_type='bigdl')[source]¶ Bases:
bigdl.nn.layer.Container
A graph container. Each node can have multiple inputs. The output of the node should be a tensor. The output tensor can be connected to multiple nodes. So the module in each node can have a tensor or table input, and should have a tensor output.
The graph container can have multiple inputs and multiple outputs. If there’s one input, the input data fed to the graph module should be a tensor. If there are multiple inputs, the input data fed to the graph module should be a table, which is actually a sequence of tensors. The order of the input tensors should be the same as the order of the input nodes. This also applies to the gradient from the module in the back propagation.
If there’s one output, the module output is a tensor. If there are multiple outputs, the module output is a table, which is actually a sequence of tensors. The order of the output tensors is the same as the order of the output modules. This also applies to the gradient passed to the module in the back propagation.
All inputs should be able to connect to outputs through some paths in the graph. It is allowed that some successors of the input nodes are not connected to outputs. If so, these nodes will be excluded from the computation.
We also support initializing a Graph directly from a tensorflow module. In this case, you should pass your tensorflow nodes as inputs and outputs and also specify the byte_order parameter (“little_endian” or “big_endian”) and the model_type parameter (“bigdl” or “tensorflow”).
-
static
load
(path, bigdl_type='float')[source]¶ Load a pre-trained BigDL model.
Parameters: path – The path containing the pre-trained model. Returns: A pre-trained model.
-
static
load_caffe
(model, defPath, modelPath, match_all=True, bigdl_type='float')[source]¶ Load a pre-trained Caffe model.
Parameters: - model – A BigDL model definition which is equivalent to the pre-trained caffe model.
- defPath – The path containing the caffe model definition.
- modelPath – The path containing the pre-trained caffe model.
Returns: A pre-trained model.
-
static
load_caffe_model
(defPath, modelPath, bigdl_type='float')[source]¶ Load a pre-trained Caffe model.
Parameters: - defPath – The path containing the caffe model definition.
- modelPath – The path containing the pre-trained caffe model.
Returns: A pre-trained model.
-
static
-
class
bigdl.nn.layer.
Mul
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Multiplies the incoming data by a single scalar factor.
>>> mul = Mul() creating: createMul
-
class
bigdl.nn.layer.
MulConstant
(scalar, inplace=False, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Multiplies input Tensor by a (non-learnable) scalar constant. This module is sometimes useful for debugging purposes.
Parameters: - scalar – scalar constant
- inplace – Can optionally do its operation in-place without using extra state memory
>>> mulConstant = MulConstant(2.5) creating: createMulConstant
-
class
bigdl.nn.layer.
Narrow
(dimension, offset, length=1, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Narrow applies the narrow operation as a module. The module further supports a negative length in order to handle inputs with an unknown size.
>>> narrow = Narrow(1, 1, 1) creating: createNarrow
-
class
bigdl.nn.layer.
NarrowTable
(offset, length=1, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Creates a module that takes a table as input and outputs the subtable starting at index offset having length elements (defaults to 1 element). The elements can be either a table or a Tensor. If length is negative, the elements from offset up to the element located abs(length) from the last element of the input are selected.
Parameters: - offset – the start index of the table
- length – the number of elements to select
>>> narrowTable = NarrowTable(1, 1) creating: createNarrowTable
-
class
bigdl.nn.layer.
Node
(jvalue, bigdl_type, *args)[source]¶ Bases:
bigdl.util.common.JavaValue
Represent a node in a graph. The connections between nodes are directed.
-
class
bigdl.nn.layer.
Normalize
(p, eps=1e-10, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Normalizes the input Tensor to have unit L_p norm. The smoothing parameter eps prevents division by zero when the input contains all zero elements (default = 1e-10). p can be the max value of double
>>> normalize = Normalize(1e-5, 1e-5) creating: createNormalize
-
class
bigdl.nn.layer.
PReLU
(n_output_plane=0, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Applies parametric ReLU, whose parameter varies the slope of the negative part.
PReLU: f(x) = max(0, x) + a * min(0, x)
nOutputPlane’s default value is 0, which means PReLU is used in its shared version and has only one parameter.
Notice: Please don’t use weight decay on this.
Parameters: n_output_plane – input map number. Default is 0.
>>> pReLU = PReLU(1) creating: createPReLU
-
class
bigdl.nn.layer.
Pack
(dimension, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Stacks a list of n-dimensional tensors into one (n+1)-dimensional tensor.
>>> layer = Pack(1) creating: createPack
-
class
bigdl.nn.layer.
Padding
(dim, pad, n_input_dim, value=0.0, n_index=1, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
This module adds pad units of padding to dimension dim of the input. If pad is negative, padding is added to the left; otherwise, it is added to the right of the dimension.
The input to this layer is expected to be a tensor, or a batch of tensors; when using mini-batch, a batch of sample tensors will be passed to the layer and the user needs to specify the number of dimensions of each sample tensor in the batch using n_input_dim.
Parameters: - dim – the dimension to which padding is applied
- pad – number of pad units
- n_input_dim – the number of dimensions that this module will receive. If it is more than the dimension of input tensors, the first dimension would be considered as batch size
- value – padding value
>>> padding = Padding(1, 1, 1, 1e-5, 1) creating: createPadding
-
class
bigdl.nn.layer.
PairwiseDistance
(norm=2, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
It is a module that takes a table of two vectors as input and outputs the distance between them using the p-norm. The input given in forward(input) is a [[Table]] that contains two tensors which must be either a vector (1D tensor) or matrix (2D tensor). If the input is a vector, it must have the size of inputSize. If it is a matrix, then each row is assumed to be an input sample of the given batch (the number of rows means the batch size and the number of columns should be equal to the inputSize).
Parameters: norm – the norm of distance
>>> pairwiseDistance = PairwiseDistance(2) creating: createPairwiseDistance
-
class
bigdl.nn.layer.
ParallelTable
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Container
It is a container module that applies the i-th member module to the i-th input, and outputs an output in the form of Table
>>> parallelTable = ParallelTable() creating: createParallelTable
-
class
bigdl.nn.layer.
Power
(power, scale=1.0, shift=0.0, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Applies an element-wise power operation with scale and shift: f(x) = (shift + scale * x)^power
Parameters: - power – the exponent.
- scale – Default is 1.
- shift – Default is 0.
>>> power = Power(1e-5) creating: createPower
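The formula f(x) = (shift + scale * x)^power can be written out in NumPy as a quick check. `power_forward` is a hypothetical helper that mirrors the parameter names and defaults of the signature above; it is not the BigDL code.

```python
import numpy as np

def power_forward(x, power, scale=1.0, shift=0.0):
    """Element-wise (shift + scale * x) ** power."""
    return (shift + scale * np.asarray(x, dtype=float)) ** power

power_out = power_forward([1.0, 2.0, 3.0], power=2.0, scale=2.0, shift=1.0)
```

For these inputs the result is (1 + 2*1)^2, (1 + 2*2)^2, (1 + 2*3)^2, i.e. 9, 25 and 49.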
-
class
bigdl.nn.layer.
RReLU
(lower=0.125, upper=0.3333333333333333, inplace=False, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Applies the randomized leaky rectified linear unit (RReLU) element-wise to the input Tensor, thus outputting a Tensor of the same dimension. Informally, RReLU is also known as the ‘insanity’ layer. RReLU is defined as:
f(x) = max(0,x) + a * min(0, x) where a ~ U(l, u).
In training mode negative inputs are multiplied by a factor drawn from a uniform random distribution U(l, u).
In evaluation mode a RReLU behaves like a LeakyReLU with a constant mean factor a = (l + u) / 2.
By default, l = 1/8 and u = 1/3. If l == u a RReLU effectively becomes a LeakyReLU.
Regardless of operating in in-place mode a RReLU will internally allocate an input-sized noise tensor to store random factors for negative inputs.
The backward() operation assumes that forward() has been called before.
For reference see [Empirical Evaluation of Rectified Activations in Convolutional Network](http://arxiv.org/abs/1505.00853).
Parameters: - lower – lower boundary of uniform random distribution
- upper – upper boundary of uniform random distribution
- inplace – optionally do its operation in-place without using extra state memory
>>> rReLU = RReLU(1e-5, 1e5, True) creating: createRReLU
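The training and evaluation behaviour described above can be sketched in NumPy. The function `rrelu_forward` and its `training` and `rng` arguments are hypothetical, and the sketch draws one random factor per element, which follows the per-input noise tensor mentioned above but is not taken from the BigDL source.

```python
import numpy as np

def rrelu_forward(x, lower=1/8, upper=1/3, training=True, rng=None):
    """f(x) = max(0, x) + a * min(0, x), with a random (train) or mean (eval)."""
    x = np.asarray(x, dtype=float)
    if training:
        rng = np.random.default_rng() if rng is None else rng
        # one factor per element, drawn from U(lower, upper)
        a = rng.uniform(lower, upper, size=x.shape)
    else:
        # evaluation: constant mean factor, i.e. a LeakyReLU
        a = (lower + upper) / 2.0
    return np.maximum(0, x) + a * np.minimum(0, x)

x = np.array([-2.0, -1.0, 0.0, 3.0])
eval_out = rrelu_forward(x, training=False)
train_out = rrelu_forward(x, rng=np.random.default_rng(0))
```

With the default bounds, the evaluation factor is (1/8 + 1/3) / 2 = 11/48, and positive inputs pass through unchanged in both modes.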
-
class
bigdl.nn.layer.
ReLU
(ip=False, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Applies the rectified linear unit (ReLU) function element-wise to the input Tensor, thus outputting a Tensor of the same dimension.
ReLU is defined as: f(x) = max(0, x). It can optionally do its operation in-place without using extra state memory.
>>> relu = ReLU() creating: createReLU
-
class
bigdl.nn.layer.
ReLU6
(inplace=False, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Same as ReLU except that the rectifying function f(x) saturates at x = 6
Parameters: inplace – either True = in-place or False = keeping separate state
>>> reLU6 = ReLU6(True) creating: createReLU6
-
class
bigdl.nn.layer.
Recurrent
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Container
The Recurrent module is a container of RNN cells. Different types of RNN cells can be added using the add() function.
>>> recurrent = Recurrent() creating: createRecurrent
-
class
bigdl.nn.layer.
Replicate
(n_features, dim=1, n_dim=2147483647, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Replicate repeats the input nFeatures times along its dim dimension. Notice: no memory is copied; the stride along the dim-th dimension is set to zero.
Parameters: - n_features – replicate times.
- dim – dimension to be replicated.
- n_dim – specify the number of non-batch dimensions.
>>> replicate = Replicate(2) creating: createReplicate
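The zero-stride trick mentioned above has a direct NumPy analogue in `np.broadcast_to`, which returns a read-only view instead of copying. The sketch assumes `dim` is 1-based and ignores n_dim; `replicate` is a hypothetical helper, not the BigDL layer.

```python
import numpy as np

def replicate(x, n_features, dim=1):
    """Repeat x n_features times along a new axis without copying memory."""
    x = np.asarray(x)
    expanded = np.expand_dims(x, axis=dim - 1)  # `dim` assumed 1-based
    target = list(expanded.shape)
    target[dim - 1] = n_features
    # broadcast_to gives a view whose stride along the new axis is 0
    return np.broadcast_to(expanded, tuple(target))

x = np.arange(3.0)
rep = replicate(x, 2)
```

Here `rep` has shape (2, 3), both rows alias the same memory as `x`, and `rep.strides[0]` is 0.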
-
class
bigdl.nn.layer.
Reshape
(size, batch_mode=None, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
forward(input) reshapes the input tensor into a size(0) * size(1) * … tensor, taking the elements row-wise.
Parameters: size – the reshape size
>>> reshape = Reshape([1, 28, 28]) creating: createReshape
>>> reshape = Reshape([1, 28, 28], False) creating: createReshape
-
class
bigdl.nn.layer.
Reverse
(dimension=1, is_inplace=False, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Reverse the input w.r.t given dimension. The input can be a Tensor or Table.
Parameters: dim – >>> reverse = Reverse() creating: createReverse >>> reverse = Reverse(1, False) creating: createReverse
-
class
bigdl.nn.layer.
RnnCell
(input_size, hidden_size, activation, wRegularizer=None, uRegularizer=None, bRegularizer=None, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
It is a simple RNN. User can pass an activation function to the RNN.
Parameters: - input_size – the size of each input vector
- hidden_size – Hidden unit size in simple RNN
- activation – activation function
- wRegularizer – instance of Regularizer (e.g. L1 or L2 regularization), applied to the input weights matrices.
- uRegularizer – instance of Regularizer (e.g. L1 or L2 regularization), applied to the recurrent weights matrices.
- bRegularizer – instance of Regularizer, applied to the bias.
>>> reshape = RnnCell(4, 3, Tanh(), L1Regularizer(0.5), L1Regularizer(0.5), L1Regularizer(0.5)) creating: createTanh creating: createL1Regularizer creating: createL1Regularizer creating: createL1Regularizer creating: createRnnCell
-
class
bigdl.nn.layer.
RoiPooling
(pooled_w, pooled_h, spatial_scale, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Region of interest pooling. RoiPooling uses max pooling to convert the features inside any valid region of interest (RoI) into a small feature map with a fixed spatial extent of pooledH * pooledW (e.g., 7 * 7). An RoI is a rectangular window into a conv feature map. Each RoI is defined by a four-tuple (x1, y1, x2, y2) that specifies its top-left corner (x1, y1) and its bottom-right corner (x2, y2). RoI max pooling works by dividing the h * w RoI window into a pooledH * pooledW grid of sub-windows of approximate size h/pooledH * w/pooledW and then max-pooling the values in each sub-window into the corresponding output grid cell. Pooling is applied independently to each feature map channel.
Parameters: - pooled_w – spatial extent in width
- pooled_h – spatial extent in height
- spatial_scale – spatial scale
>>> import numpy as np >>> input_data = np.random.rand(2,2,6,8) >>> input_rois = np.array([0, 0, 0, 7, 5, 1, 6, 2, 7, 5, 1, 3, 1, 6, 4, 0, 3, 3, 3, 3],dtype='float64').reshape(4,5) >>> m = RoiPooling(3,2,1.0) creating: createRoiPooling >>> out = m.forward([input_data,input_rois])
-
class
bigdl.nn.layer.
Scale
(size, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Scale is the combination of CMul and CAdd. It computes the element-wise product of the input and the weight, with the shape of the weight expanded to match the shape of the input. Similarly, the bias is expanded to match the shape of the input and added element-wise.
Parameters: size – size of weight and bias >>> scale = Scale([1,2]) creating: createScale
-
class
bigdl.nn.layer.
Select
(dim, index, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
A simple layer that selects an index of the input tensor in the given dimension.
Parameters: - dim – the dimension to select
- index – the index of the dimension to be selected
>>> select = Select(1, 1) creating: createSelect
-
class
bigdl.nn.layer.
SelectTable
(index, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Creates a module that takes a table as input and outputs the element at index index (positive or negative). This can be either a table or a Tensor. The gradients of the non-index elements are zeroed Tensors of the same size. This is true regardless of the depth of the encapsulated Tensor as the function used internally to do so is recursive.
Parameters: index – the index to be selected >>> selectTable = SelectTable(1) creating: createSelectTable
-
class
bigdl.nn.layer.
Sequential
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Container
Sequential provides a means to plug layers together in a feed-forward fully connected manner.
>>> echo = Echo() creating: createEcho >>> s = Sequential() creating: createSequential >>> s = s.add(echo) >>> s = s.add(s) >>> s = s.add(echo)
-
class
bigdl.nn.layer.
Sigmoid
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Applies the Sigmoid function element-wise to the input Tensor, thus outputting a Tensor of the same dimension.
>>> sigmoid = Sigmoid() creating: createSigmoid
-
class
bigdl.nn.layer.
SoftMax
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Applies the SoftMax function to an n-dimensional input Tensor, rescaling them so that the elements of the n-dimensional output Tensor lie in the range (0, 1) and sum to 1. Softmax is defined as: f_i(x) = exp(x_i - shift) / sum_j exp(x_j - shift) where shift = max_i(x_i).
>>> softMax = SoftMax() creating: createSoftMax
-
class
bigdl.nn.layer.
SoftMin
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Applies the SoftMin function to an n-dimensional input Tensor, rescaling them so that the elements of the n-dimensional output Tensor lie in the range (0,1) and sum to 1. Softmin is defined as: f_i(x) = exp(-x_i - shift) / sum_j exp(-x_j - shift) where shift = max_i(-x_i).
>>> softMin = SoftMin() creating: createSoftMin
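The SoftMax and SoftMin definitions above can be sketched in plain Python for a 1D list of values. The helper names softmax and softmin are hypothetical; the sketch only mirrors the formulas, including the shift used for numerical stability, and does not reproduce how the layers treat higher-dimensional input.

```python
import math


def softmax(xs):
    """f_i(x) = exp(x_i - shift) / sum_j exp(x_j - shift), shift = max_i(x_i)."""
    shift = max(xs)
    exps = [math.exp(x - shift) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def softmin(xs):
    """SoftMin is SoftMax applied to the negated input."""
    return softmax([-x for x in xs])


print(softmax([1.0, 2.0, 3.0]))
print(softmin([1.0, 2.0, 3.0]))
```

Subtracting the shift keeps every exponent at or below zero, so large inputs such as 1000.0 do not overflow.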
-
class
bigdl.nn.layer.
SoftPlus
(beta=1.0, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Apply the SoftPlus function to an n-dimensional input tensor. SoftPlus function: f_i(x) = 1/beta * log(1 + exp(beta * x_i))
Parameters: beta – Controls sharpness of transfer function >>> softPlus = SoftPlus(1e-5) creating: createSoftPlus
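As an illustration of the SoftPlus formula above, here is a minimal scalar sketch. The function name softplus is hypothetical, and the sketch is not hardened against overflow for very large beta * x.

```python
import math


def softplus(x, beta=1.0):
    """f(x) = 1/beta * log(1 + exp(beta * x))."""
    return math.log(1.0 + math.exp(beta * x)) / beta


for x in (-5.0, 0.0, 5.0):
    print(x, softplus(x), softplus(x, beta=4.0))
```

Larger beta values make the curve approach max(0, x) more closely, which is what "sharpness" refers to.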
-
class
bigdl.nn.layer.
SoftShrink
(the_lambda=0.5, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Apply the soft shrinkage function element-wise to the input Tensor
SoftShrinkage operator:
f(x) = x - lambda, if x > lambda
f(x) = x + lambda, if x < -lambda
f(x) = 0, otherwise
Parameters: the_lambda – lambda, default is 0.5 >>> softShrink = SoftShrink(1e-5) creating: createSoftShrink
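The soft shrinkage operator can be sketched for a single value as follows. The function name soft_shrink is hypothetical; the parameter name the_lambda follows the constructor signature above.

```python
def soft_shrink(x, the_lambda=0.5):
    """Shrink x toward zero by the_lambda; values within [-lambda, lambda] become 0."""
    if x > the_lambda:
        return x - the_lambda
    if x < -the_lambda:
        return x + the_lambda
    return 0.0


print([soft_shrink(v) for v in (-2.0, -0.3, 0.0, 0.5, 2.0)])
```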
-
class
bigdl.nn.layer.
SoftSign
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Apply SoftSign function to an n-dimensional input Tensor.
SoftSign function: f_i(x) = x_i / (1+|x_i|)
>>> softSign = SoftSign() creating: createSoftSign
-
class
bigdl.nn.layer.
SpatialAveragePooling
(kw, kh, dw=1, dh=1, pad_w=0, pad_h=0, ceil_mode=False, count_include_pad=True, divide=True, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Applies a 2D average-pooling operation in kW x kH regions with a step size of dW x dH. The number of output features is equal to the number of input planes.
Parameters: - kw – kernel width
- kh – kernel height
- dw – step width
- dh – step height
- pad_w – padding width
- pad_h – padding height
- ceil_mode – whether the output size is to be ceiled or floored
- count_include_pad – whether to include padding when dividing by the number of elements in the pooling region
- divide – whether to do the averaging
>>> spatialAveragePooling = SpatialAveragePooling(7,7) creating: createSpatialAveragePooling
-
class
bigdl.nn.layer.
SpatialBatchNormalization
(n_output, eps=1e-05, momentum=0.1, affine=True, init_weight=None, init_bias=None, init_grad_weight=None, init_grad_bias=None, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Implements Batch Normalization as described in the paper “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift” by Sergey Ioffe and Christian Szegedy. This implementation is useful for inputs coming from convolution layers. For non-convolutional layers, see BatchNormalization. The operation implemented is:
y = (x - mean(x)) / standard-deviation(x) * gamma + beta
where gamma and beta are learnable parameters. The learning of gamma and beta is optional.
>>> spatialBatchNormalization = SpatialBatchNormalization(1) creating: createSpatialBatchNormalization >>> import numpy as np >>> init_weight = np.array([1.0]) >>> init_grad_weight = np.array([0.0]) >>> init_bias = np.array([0.0]) >>> init_grad_bias = np.array([0.0]) >>> spatialBatchNormalization = SpatialBatchNormalization(1, 1e-5, 0.1, True, init_weight, init_bias, init_grad_weight, init_grad_bias) creating: createSpatialBatchNormalization
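The normalization formula can be sketched for the values of a single channel. This is a simplified illustration, not the layer's implementation: the helper name batch_norm_channel is hypothetical, xs is assumed to hold every value of one feature map across the batch and spatial positions, the variance is assumed to be the biased (population) variance, and eps is assumed to be added to the variance before the square root.

```python
import math


def batch_norm_channel(xs, gamma=1.0, beta=0.0, eps=1e-5):
    """y = (x - mean(x)) / standard-deviation(x) * gamma + beta for one channel."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n  # biased variance (assumption)
    std = math.sqrt(var + eps)                  # eps placement is an assumption
    return [(x - mean) / std * gamma + beta for x in xs]


print(batch_norm_channel([1.0, 2.0, 3.0, 4.0]))
print(batch_norm_channel([1.0, 2.0, 3.0, 4.0], gamma=2.0, beta=3.0))
```

With gamma = 1 and beta = 0 the output has approximately zero mean and unit variance; gamma and beta then rescale and shift it.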
-
class
bigdl.nn.layer.
SpatialContrastiveNormalization
(n_input_plane=1, kernel=None, threshold=0.0001, thresval=0.0001, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Subtractive + divisive contrast normalization.
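Parameters: - n_input_plane – number of input planes, default is 1.
- kernel – kernel tensor, default is a 9 x 9 tensor.
- threshold – threshold
- thresval – value to replace with if data is smaller than the threshold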
>>> kernel = np.ones([9,9]).astype("float32") >>> spatialContrastiveNormalization = SpatialContrastiveNormalization(1, kernel) creating: createSpatialContrastiveNormalization >>> spatialContrastiveNormalization = SpatialContrastiveNormalization() creating: createSpatialContrastiveNormalization
-
class
bigdl.nn.layer.
SpatialConvolution
(n_input_plane, n_output_plane, kernel_w, kernel_h, stride_w=1, stride_h=1, pad_w=0, pad_h=0, n_group=1, propagate_back=True, wRegularizer=None, bRegularizer=None, init_weight=None, init_bias=None, init_grad_weight=None, init_grad_bias=None, with_bias=True, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Applies a 2D convolution over an input image composed of several input planes. The input tensor in forward(input) is expected to be a 3D tensor (nInputPlane x height x width).
Parameters: - n_input_plane – The number of expected input planes in the image given into forward().
- n_output_plane – The number of output planes the convolution layer will produce.
- kernel_w – The kernel width of the convolution.
- kernel_h – The kernel height of the convolution.
- stride_w – The step of the convolution in the width dimension.
- stride_h – The step of the convolution in the height dimension.
- pad_w – The additional zeros added per width to the input planes.
- pad_h – The additional zeros added per height to the input planes.
- n_group – Kernel group number.
- propagate_back – Whether to propagate the gradient back.
- wRegularizer – instance of Regularizer (e.g. L1 or L2 regularization), applied to the input weights matrices.
- bRegularizer – instance of Regularizer, applied to the bias.
- init_weight – the optional initial value for the weight
- init_bias – the optional initial value for the bias
- init_grad_weight – the optional initial value for the grad_weight
- init_grad_bias – the optional initial value for the grad_bias
- with_bias – whether to use a bias
>>> spatialConvolution = SpatialConvolution(6, 12, 5, 5) creating: createSpatialConvolution >>> spatialConvolution.setWRegularizer(L1Regularizer(0.5)) creating: createL1Regularizer >>> spatialConvolution.setBRegularizer(L1Regularizer(0.5)) creating: createL1Regularizer >>> import numpy as np >>> init_weight = np.random.randn(1, 12, 6, 5, 5) >>> init_bias = np.random.randn(12) >>> init_grad_weight = np.zeros([1, 12, 6, 5, 5]) >>> init_grad_bias = np.zeros([12]) >>> spatialConvolution = SpatialConvolution(6, 12, 5, 5, 1, 1, 0, 0, 1, True, L1Regularizer(0.5), L1Regularizer(0.5), init_weight, init_bias, init_grad_weight, init_grad_bias) creating: createL1Regularizer creating: createL1Regularizer creating: createSpatialConvolution
-
class
bigdl.nn.layer.
SpatialConvolutionMap
(conn_table, kw, kh, dw=1, dh=1, pad_w=0, pad_h=0, wRegularizer=None, bRegularizer=None, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
This class is a generalization of SpatialConvolution. It uses a generic connection table between input and output features. The SpatialConvolution is equivalent to using a full connection table.
Parameters: - wRegularizer – instance of Regularizer (e.g. L1 or L2 regularization), applied to the input weights matrices.
- bRegularizer – instance of Regularizer, applied to the bias.
>>> ct = np.ones([9,9]).astype("float32") >>> spatialConvolutionMap = SpatialConvolutionMap(ct, 9, 9) creating: createSpatialConvolutionMap
-
class
bigdl.nn.layer.
SpatialCrossMapLRN
(size=5, alpha=1.0, beta=0.75, k=1.0, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Applies Spatial Local Response Normalization between different feature maps. The operation implemented is:
y_f = x_f / (k + (alpha/size) * sum_{l=l1 to l2} (x_l^2))^beta
where x_f is the input at spatial locations h,w (not shown for simplicity) and feature map f, l1 corresponds to max(0,f-ceil(size/2)) and l2 to min(F, f-ceil(size/2) + size). Here, F is the number of feature maps.
Parameters: - size – the number of channels to sum over (for cross channel LRN) or the side length of the square region to sum over (for within channel LRN)
- alpha – the scaling parameter
- beta – the exponent
- k – a constant
>>> spatialCrossMapLRN = SpatialCrossMapLRN() creating: createSpatialCrossMapLRN
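A rough sketch of the cross-map normalization at one spatial location is given below. The function name cross_map_lrn is hypothetical. The window bounds are an assumption: channels are 0-based and the window is centered on f, covering up to size channels and clipped at the first and last feature map. The layer's exact boundary handling may differ.

```python
def cross_map_lrn(xs, size=5, alpha=1.0, beta=0.75, k=1.0):
    """xs holds the values of all F feature maps at one spatial location (h, w)."""
    n_maps = len(xs)
    half = size // 2
    out = []
    for f in range(n_maps):
        lo = max(0, f - half)            # first channel in the window (assumed centering)
        hi = min(n_maps, f + half + 1)   # one past the last channel in the window
        sq_sum = sum(x * x for x in xs[lo:hi])
        out.append(xs[f] / (k + (alpha / size) * sq_sum) ** beta)
    return out


print(cross_map_lrn([1.0, 2.0, 3.0], size=3, beta=1.0))
```

Each output is the input divided by a term that grows with the squared activations of neighboring feature maps, so strongly activated neighbors suppress the response.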
-
class
bigdl.nn.layer.
SpatialDilatedConvolution
(n_input_plane, n_output_plane, kw, kh, dw=1, dh=1, pad_w=0, pad_h=0, dilation_w=1, dilation_h=1, wRegularizer=None, bRegularizer=None, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Apply a 2D dilated convolution over an input image.
The input tensor is expected to be a 3D or 4D (with batch) tensor.
If the input is a 3D tensor nInputPlane x height x width, the output size is:
owidth = floor((width + 2 * padW - dilationW * (kW-1) - 1) / dW) + 1
oheight = floor((height + 2 * padH - dilationH * (kH-1) - 1) / dH) + 1
Reference Paper: Yu F, Koltun V. Multi-scale context aggregation by dilated convolutions[J]. arXiv preprint arXiv:1511.07122, 2015.
Parameters: - n_input_plane – The number of expected input planes in the image given into forward().
- n_output_plane – The number of output planes the convolution layer will produce.
- kw – The kernel width of the convolution.
- kh – The kernel height of the convolution.
- dw – The step of the convolution in the width dimension. Default is 1.
- dh – The step of the convolution in the height dimension. Default is 1.
- pad_w – The additional zeros added per width to the input planes. Default is 0.
- pad_h – The additional zeros added per height to the input planes. Default is 0.
- dilation_w – The number of pixels to skip. Default is 1.
- dilation_h – The number of pixels to skip. Default is 1.
- init_method – Init method, Default, Xavier.
- wRegularizer – instance of Regularizer (e.g. L1 or L2 regularization), applied to the input weights matrices.
- bRegularizer – instance of Regularizer, applied to the bias.
>>> spatialDilatedConvolution = SpatialDilatedConvolution(1, 1, 1, 1) creating: createSpatialDilatedConvolution
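The output-size formula for the dilated convolution can be checked with a small helper. The function name dilated_conv_output_size is hypothetical, and the floor is read as applying to the whole quotient before 1 is added, which is the usual convention for convolution output sizes. The same function serves for width and height.

```python
def dilated_conv_output_size(size, kernel, stride=1, pad=0, dilation=1):
    """Output extent along one axis: floor((size + 2*pad - dilation*(kernel-1) - 1) / stride) + 1."""
    return (size + 2 * pad - dilation * (kernel - 1) - 1) // stride + 1


# A 3-wide kernel with dilation 2 spans 5 input pixels.
print(dilated_conv_output_size(7, 3))
print(dilated_conv_output_size(7, 3, dilation=2))
print(dilated_conv_output_size(8, 3, stride=2, pad=1))
```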
-
class
bigdl.nn.layer.
SpatialDivisiveNormalization
(n_input_plane=1, kernel=None, threshold=0.0001, thresval=0.0001, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Applies a spatial division operation on a series of 2D inputs, using kernel to compute the weighted average in a neighborhood. The neighborhood is defined as a local spatial region that is the same size as the kernel and spans all features. For an input image with only one feature, the region is only spatial. For an RGB image, the weighted average is taken over the RGB channels and a spatial region.
If the kernel is 1D, it will be used for constructing a separable 2D kernel. The operations will be much more efficient in this case.
The kernel is generally chosen as a Gaussian when it is believed that the correlation of two pixel locations decreases with increasing distance. On the feature dimension, a uniform average is used since the weighting across features is not known.
Parameters: - n_input_plane – number of input planes, default is 1.
- kernel – kernel tensor, default is a 9 x 9 tensor.
- threshold – threshold
- thresval – threshold value to replace with if data is smaller than the threshold
>>> kernel = np.ones([9,9]).astype("float32") >>> spatialDivisiveNormalization = SpatialDivisiveNormalization(2,kernel) creating: createSpatialDivisiveNormalization >>> spatialDivisiveNormalization = SpatialDivisiveNormalization() creating: createSpatialDivisiveNormalization
-
class
bigdl.nn.layer.
SpatialFullConvolution
(n_input_plane, n_output_plane, kw, kh, dw=1, dh=1, pad_w=0, pad_h=0, adj_w=0, adj_h=0, n_group=1, no_bias=False, wRegularizer=None, bRegularizer=None, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Apply a 2D full convolution over an input image. The input tensor is expected to be a 3D or 4D (with batch) tensor. Note that instead of setting adjW and adjH, SpatialFullConvolution also accepts a table input with two tensors: T(convInput, sizeTensor), where convInput is the standard input tensor and the size of sizeTensor is used to set the size of the output (the adjW and adjH values used to construct the module are then ignored). This module can be used without a bias by setting the parameter no_bias to True while constructing the module.
If the input is a 3D tensor nInputPlane x height x width, the output size is:
owidth = (width - 1) * dW - 2*padW + kW + adjW
oheight = (height - 1) * dH - 2*padH + kH + adjH
Other frameworks call this operation “In-network Upsampling”, “Fractionally-strided convolution”, “Backwards Convolution,” “Deconvolution”, or “Upconvolution.”
Reference Paper: Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 3431-3440.
Parameters: - n_input_plane – The number of expected input planes in the image given into forward().
- n_output_plane – The number of output planes the convolution layer will produce.
- kw – The kernel width of the convolution.
- kh – The kernel height of the convolution.
- dw – The step of the convolution in the width dimension. Default is 1.
- dh – The step of the convolution in the height dimension. Default is 1.
- pad_w – The additional zeros added per width to the input planes. Default is 0.
- pad_h – The additional zeros added per height to the input planes. Default is 0.
- adj_w – Extra width to add to the output image. Default is 0.
- adj_h – Extra height to add to the output image. Default is 0.
- n_group – Kernel group number.
- no_bias – Whether to omit the bias.
- init_method – Init method, Default, Xavier, Bilinear.
- wRegularizer – instance of Regularizer (e.g. L1 or L2 regularization), applied to the input weights matrices.
- bRegularizer – instance of Regularizer, applied to the bias.
>>> spatialFullConvolution = SpatialFullConvolution(1, 1, 1, 1) creating: createSpatialFullConvolution
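The output-size formula for the full convolution can be evaluated with a small helper. The function name full_conv_output_size is hypothetical; it applies the owidth/oheight expression above to one axis.

```python
def full_conv_output_size(size, kernel, stride=1, pad=0, adj=0):
    """Output extent along one axis: (size - 1) * stride - 2*pad + kernel + adj."""
    return (size - 1) * stride - 2 * pad + kernel + adj


# Upsampling a 4-pixel axis by a factor of 2 with a 3-wide kernel.
print(full_conv_output_size(4, 3, stride=2, pad=1, adj=1))
```

With stride 2, padding 1 and adj 1, a 3-wide kernel exactly doubles the input extent, which is a common choice when the layer is used for upsampling.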
-
class
bigdl.nn.layer.
SpatialMaxPooling
(kw, kh, dw, dh, pad_w=0, pad_h=0, to_ceil=False, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Applies a 2D max-pooling operation in kW x kH regions with a step size of dW x dH. The number of output features is equal to the number of input planes. If the input image is a 3D tensor nInputPlane x height x width, the output image size will be nOutputPlane x oheight x owidth, where
owidth = op((width + 2*padW - kW) / dW + 1)
oheight = op((height + 2*padH - kH) / dH + 1)
op is a rounding operator. By default, it is floor. It can be changed by calling the ceil() or floor() methods.
Parameters: - kw – kernel width
- kh – kernel height
- dw – step size in width
- dh – step size in height
- pad_w – padding in width
- pad_h – padding in height
>>> spatialMaxPooling = SpatialMaxPooling(2, 2, 2, 2) creating: createSpatialMaxPooling
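The pooling output size can be computed along one axis as sketched below. The function name pool_output_size is hypothetical, and treating the to_ceil flag as the switch between floor and ceil rounding is an assumption based on the constructor signature. Any extra boundary adjustment a real implementation may apply in ceil mode is not modeled.

```python
import math


def pool_output_size(size, kernel, stride, pad=0, to_ceil=False):
    """Output extent along one axis: op((size + 2*pad - kernel) / stride + 1)."""
    op = math.ceil if to_ceil else math.floor  # op is the rounding operator
    return int(op((size + 2 * pad - kernel) / stride + 1))


print(pool_output_size(28, 2, 2))
print(pool_output_size(7, 2, 2))
print(pool_output_size(7, 2, 2, to_ceil=True))
```

The two rounding modes differ only when the stride does not evenly divide the padded extent minus the kernel size.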
Bases:
bigdl.nn.layer.Layer
>>> spatialShareConvolution = SpatialShareConvolution(1, 1, 1, 1) creating: createSpatialShareConvolution >>> import numpy as np >>> init_weight = np.random.randn(1, 12, 6, 5, 5) >>> init_bias = np.random.randn(12) >>> init_grad_weight = np.zeros([1, 12, 6, 5, 5]) >>> init_grad_bias = np.zeros([12]) >>> conv = SpatialShareConvolution(6, 12, 5, 5, 1, 1, 0, 0, 1, True, L1Regularizer(0.5), L1Regularizer(0.5), init_weight, init_bias, init_grad_weight, init_grad_bias) creating: createL1Regularizer creating: createL1Regularizer creating: createSpatialShareConvolution
-
class
bigdl.nn.layer.
SpatialSubtractiveNormalization
(n_input_plane=1, kernel=None, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Applies a spatial subtraction operation on a series of 2D inputs, using kernel to compute the weighted average in a neighborhood. The neighborhood is defined as a local spatial region that is the same size as the kernel and spans all features. For an input image with only one feature, the region is only spatial. For an RGB image, the weighted average is taken over the RGB channels and a spatial region.
If the kernel is 1D, it will be used for constructing a separable 2D kernel. The operations will be much more efficient in this case.
The kernel is generally chosen as a Gaussian when it is believed that the correlation of two pixel locations decreases with increasing distance. On the feature dimension, a uniform average is used since the weighting across features is not known.
Parameters: - n_input_plane – number of input plane, default is 1.
- kernel – kernel tensor, default is a 9 x 9 tensor.
>>> kernel = np.ones([9,9]).astype("float32") >>> spatialSubtractiveNormalization = SpatialSubtractiveNormalization(2,kernel) creating: createSpatialSubtractiveNormalization >>> spatialSubtractiveNormalization = SpatialSubtractiveNormalization() creating: createSpatialSubtractiveNormalization
-
class
bigdl.nn.layer.
SpatialZeroPadding
(pad_left, pad_right, pad_top, pad_bottom, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Each feature map of a given input is padded with specified number of zeros. If padding values are negative, then input is cropped.
Parameters: - pad_left – pad left position
- pad_right – pad right position
- pad_top – pad top position
- pad_bottom – pad bottom position
>>> spatialZeroPadding = SpatialZeroPadding(1, 1, 1, 1) creating: createSpatialZeroPadding
-
class
bigdl.nn.layer.
SplitTable
(dimension, n_input_dims=-1, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Creates a module that takes a Tensor as input and outputs a table of Tensors, splitting the input Tensor along the specified dimension. Please note that the dimension starts from 1.
The input to this layer is expected to be a tensor, or a batch of tensors; when using mini-batch, a batch of sample tensors will be passed to the layer and the user needs to specify the number of dimensions of each sample tensor in a batch using n_input_dims.
Parameters: - dimension – to be split along this dimension
- n_input_dims – specify the number of dimensions that this module will receive. If it is more than the dimension of input tensors, the first dimension would be considered as batch size
>>> splitTable = SplitTable(1, 1) creating: createSplitTable
-
class
bigdl.nn.layer.
Sqrt
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Apply an element-wise sqrt operation.
>>> sqrt = Sqrt() creating: createSqrt
-
class
bigdl.nn.layer.
Square
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Apply an element-wise square operation.
>>> square = Square() creating: createSquare
-
class
bigdl.nn.layer.
Squeeze
(dim, num_input_dims=-2147483648, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Delete all singleton dimensions or a specific singleton dimension.
Parameters: - dim – Optional. The dimension to be deleted. Default: delete all singleton dimensions.
- num_input_dims – Optional. If in batch mode, set to the inputDims.
>>> squeeze = Squeeze(1) creating: createSqueeze
-
class
bigdl.nn.layer.
Sum
(dimension=1, n_input_dims=-1, size_average=False, squeeze=True, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
A simple layer which applies a sum operation over the given dimension. When n_input_dims is provided, the input will be considered as a batch, and the sum operation will then be applied along (dimension + 1). The input to this layer is expected to be a tensor, or a batch of tensors; when using mini-batch, a batch of sample tensors will be passed to the layer and the user needs to specify the number of dimensions of each sample tensor in the batch using n_input_dims.
Parameters: - dimension – the dimension to be applied sum operation
- n_input_dims – specify the number of dimensions that this module will receive. If it is more than the dimension of input tensors, the first dimension would be considered as batch size
- size_average – default is false, if it is true, it will return the mean instead
- squeeze – default is true, which will squeeze the sum dimension; set it to false to keep the sum dimension
>>> sum = Sum(1, 1, True, True) creating: createSum
-
class
bigdl.nn.layer.
Tanh
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Applies the Tanh function element-wise to the input Tensor, thus outputting a Tensor of the same dimension. Tanh is defined as f(x) = (exp(x)-exp(-x))/(exp(x)+exp(-x)).
>>> tanh = Tanh() creating: createTanh
-
class
bigdl.nn.layer.
TanhShrink
(bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
A simple layer that applies the following operation to each element of the input tensor during the forward process: f(x) = x - tanh(x)
>>> tanhShrink = TanhShrink() creating: createTanhShrink
-
class
bigdl.nn.layer.
TemporalConvolution
(input_frame_size, output_frame_size, kernel_w, stride_w=1, propagate_back=True, weight_regularizer=None, bias_regularizer=None, init_weight=None, init_bias=None, init_grad_weight=None, init_grad_bias=None, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Applies a 1D convolution over an input sequence composed of nInputFrame frames. The input tensor in forward(input) is expected to be a 2D tensor (nInputFrame x inputFrameSize) or a 3D tensor (nBatchFrame x nInputFrame x inputFrameSize).
Parameters: - input_frame_size – The input frame size expected in sequences given into forward().
- output_frame_size – The output frame size the convolution layer will produce.
- kernel_w – The kernel width of the convolution.
- stride_w – The step of the convolution in the width dimension.
- propagate_back – Whether to propagate the gradient back, default is true.
- weight_regularizer – instance of Regularizer (e.g. L1 or L2 regularization), applied to the input weights matrices.
- bias_regularizer – instance of Regularizer, applied to the bias.
- init_weight – Initial weight
- init_bias – Initial bias
- init_grad_weight – Initial gradient weight
- init_grad_bias – Initial gradient bias
>>> temporalConvolution = TemporalConvolution(6, 12, 5, 5) creating: createTemporalConvolution >>> temporalConvolution.setWRegularizer(L1Regularizer(0.5)) creating: createL1Regularizer >>> temporalConvolution.setBRegularizer(L1Regularizer(0.5)) creating: createL1Regularizer
-
class
bigdl.nn.layer.
Threshold
(th=1e-06, v=0.0, ip=False, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Thresholds the input Tensor. Any value in the Tensor that is smaller than th is replaced with v.
Parameters: - th – the threshold to compare with
- v – the value to replace with
- ip – inplace mode
>>> threshold = Threshold(1e-5, 1e-5, True) creating: createThreshold
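The thresholding rule can be sketched on a plain list of values. The function name threshold is hypothetical, and reading "smaller than" as a strict comparison is an assumption; how the layer treats values exactly equal to th is not stated here.

```python
def threshold(xs, th=1e-6, v=0.0):
    """Replace every value smaller than th with v; keep the rest unchanged."""
    return [v if x < th else x for x in xs]


print(threshold([-1.0, 0.5, 2.0], th=1.0, v=0.0))
```

With th = 0 and v = 0 this reduces to the ReLU function.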
-
class
bigdl.nn.layer.
TimeDistributed
(model, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
This layer applies the contained layer to each temporal time slice of the input tensor.
For instance, the TimeDistributed layer can feed each time slice of the input tensor to a Linear layer.
The input data format is [Batch, Time, Other dims]. The contained layer must not change the length of the Other dims.
>>> td = TimeDistributed(Linear(2, 3)) creating: createLinear creating: createTimeDistributed
-
class
bigdl.nn.layer.
Transpose
(permutations, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Transpose input along specified dimensions
Parameters: permutations – dimension pairs that need to swap >>> transpose = Transpose([(1,2)]) creating: createTranspose
-
class
bigdl.nn.layer.
Unsqueeze
(pos, num_input_dims=-2147483648, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Create an Unsqueeze layer. Inserts a singleton dimension (i.e., a dimension of size 1) at position pos. For an input with dim = input.dim(), there are dim + 1 possible positions to insert the singleton dimension.
Parameters: - pos – The position at which the singleton dimension will be inserted.
- num_input_dims – Optional. If in batch mode, set to the inputDim
>>> unsqueeze = Unsqueeze(1, 1) creating: createUnsqueeze
-
class
bigdl.nn.layer.
View
(sizes, num_input_dims=0, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
This module creates a new view of the input tensor using the sizes passed to the constructor. The method setNumInputDims() allows you to specify the expected number of dimensions of the inputs of the module. This makes it possible to use minibatch inputs when using a size of -1 for one of the dimensions.
Parameters: sizes – sizes used to create the new view >>> view = View([1024,2]) creating: createView
-
class
bigdl.nn.layer.
VolumetricConvolution
(n_input_plane, n_output_plane, k_t, k_w, k_h, d_t=1, d_w=1, d_h=1, pad_t=0, pad_w=0, pad_h=0, with_bias=True, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Applies a 3D convolution over an input image composed of several input planes. The input tensor in forward(input) is expected to be a 4D tensor (nInputPlane x time x height x width).
Parameters: - n_input_plane – The number of expected input planes in the image given into forward()
- n_output_plane – The number of output planes the convolution layer will produce.
- k_t – The kernel size of the convolution in time
- k_w – The kernel width of the convolution
- k_h – The kernel height of the convolution
- d_t – The step of the convolution in the time dimension. Default is 1
- d_w – The step of the convolution in the width dimension. Default is 1
- d_h – The step of the convolution in the height dimension. Default is 1
- pad_t – Additional zeros added to the input plane data on both sides of the time axis. Default is 0. (kT-1)/2 is often used here.
- pad_w – The additional zeros added per width to the input planes.
- pad_h – The additional zeros added per height to the input planes.
- with_bias – whether with bias
- init_method – Init method, Default, Xavier, Bilinear.
>>> volumetricConvolution = VolumetricConvolution(6, 12, 5, 5, 5, 1, 1, 1) creating: createVolumetricConvolution
-
class
bigdl.nn.layer.
VolumetricMaxPooling
(k_t, k_w, k_h, d_t, d_w, d_h, pad_t=0, pad_w=0, pad_h=0, bigdl_type='float')[source]¶ Bases:
bigdl.nn.layer.Layer
Applies a 3D max-pooling operation in kT x kW x kH regions with a step size of dT x dW x dH. The number of output features is equal to the number of input planes / dT. The input can optionally be padded with zeros. Padding should be smaller than half of the kernel size. That is, padT < kT/2, padW < kW/2 and padH < kH/2.
Parameters: - k_t – The kernel size in the time dimension
- k_w – The kernel width
- k_h – The kernel height
- d_t – The step in the time dimension
- d_w – The step in the width dimension
- d_h – The step in the height dimension
- pad_t – The padding in the time dimension
- pad_w – The padding in the width dimension
- pad_h – The padding in the height dimension
>>> volumetricMaxPooling = VolumetricMaxPooling(5, 5, 5, 1, 1, 1) creating: createVolumetricMaxPooling