L1Cost
Scala:
val layer = L1Cost[Float]()
Python:
layer = L1Cost()
Computes the L1 norm of the input as the loss; the gradient with respect to the input is the sign of the input. The target is ignored.
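As a sanity check on this definition, the following numpy sketch (an illustration of the formula, not BigDL's implementation) reproduces the loss and gradient of the examples below:
import numpy as np
def l1_cost(x):
    # loss is the L1 norm of the input; the gradient is its sign; the target is unused
    return np.abs(x).sum(), np.sign(x)
x = np.array([[0.48145306, 0.476887], [0.23729686, 0.5169516]], dtype="float32")
loss, grad = l1_cost(x)
print(loss)  # ~1.7125885, as in the Scala example
print(grad)  # all ones, since every element is positive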
Scala example:
import com.intel.analytics.bigdl.nn.L1Cost
import com.intel.analytics.bigdl.tensor.Tensor
val layer = L1Cost[Float]()
val input = Tensor[Float](2, 2).rand
val target = Tensor[Float](2, 2).rand
val output = layer.forward(input, target)
val gradInput = layer.backward(input, target)
> println(input)
input: com.intel.analytics.bigdl.tensor.Tensor[Float] =
0.48145306 0.476887
0.23729686 0.5169516
[com.intel.analytics.bigdl.tensor.DenseTensor of size 2x2]
> println(target)
target: com.intel.analytics.bigdl.tensor.Tensor[Float] =
0.42999148 0.22272833
0.49723643 0.17884709
[com.intel.analytics.bigdl.tensor.DenseTensor of size 2x2]
> println(output)
output: Float = 1.7125885
> println(gradInput)
gradInput: com.intel.analytics.bigdl.tensor.Tensor[Float] =
1.0 1.0
1.0 1.0
[com.intel.analytics.bigdl.tensor.DenseTensor of size 2x2]
Python example:
from bigdl.nn.criterion import *
import numpy as np
layer = L1Cost()
input = np.random.uniform(0, 1, (2, 2)).astype("float32")
target = np.random.uniform(0, 1, (2, 2)).astype("float32")
output = layer.forward(input, target)
gradInput = layer.backward(input, target)
> output
2.522411
> gradInput
[array([[ 1., 1.],
[ 1., 1.]], dtype=float32)]
TimeDistributedCriterion
Scala:
val module = TimeDistributedCriterion(critrn, sizeAverage)
Python:
module = TimeDistributedCriterion(critrn, sizeAverage)
This class is intended to support inputs with 3 or more dimensions. It applies the provided criterion to every temporal slice of the input.
critrn
embedded criterion
sizeAverage
whether to divide by the sequence length. Default is false.
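To illustrate the semantics, here is a numpy sketch (not the BigDL implementation) that applies ClassNLLCriterion to each temporal slice of a (batch, time, nClasses) input of log-probabilities and, since sizeAverage = true here, divides by the sequence length; it reproduces the 0.8793 output of the examples below:
import numpy as np
vals = [1.0262627674932, -1.2412600935171, -1.0423174168648,
        -1.0262627674932, -1.2412600935171, -1.0423174168648,
        -0.90330565804228, -1.3686840144413, -1.0778380454479,
        -0.90330565804228, -1.3686840144413, -1.0778380454479,
        -0.99131220658219, -1.0559142847536, -1.2692712660404,
        -0.99131220658219, -1.0559142847536, -1.2692712660404]
log_probs = np.array(vals).reshape(3, 2, 3)   # (batch, time, nClasses)
target = np.array([[1, 1], [2, 2], [3, 3]])   # 1-based class index per time step
batch, time, _ = log_probs.shape
loss = 0.0
for t in range(time):  # apply the embedded criterion to every temporal slice
    loss += -np.mean([log_probs[b, t, target[b, t] - 1] for b in range(batch)])
print(loss / time)  # ~0.8793184, divided by the sequence length (sizeAverage = true)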
Scala example:
import com.intel.analytics.bigdl.nn._
import com.intel.analytics.bigdl.tensor.Tensor
import com.intel.analytics.bigdl.tensor.Storage
val criterion = ClassNLLCriterion[Double]()
val layer = TimeDistributedCriterion[Double](criterion, true)
val input = Tensor[Double](Storage(Array(
1.0262627674932,
-1.2412600935171,
-1.0423174168648,
-1.0262627674932,
-1.2412600935171,
-1.0423174168648,
-0.90330565804228,
-1.3686840144413,
-1.0778380454479,
-0.90330565804228,
-1.3686840144413,
-1.0778380454479,
-0.99131220658219,
-1.0559142847536,
-1.2692712660404,
-0.99131220658219,
-1.0559142847536,
-1.2692712660404))).resize(3, 2, 3)
val target = Tensor[Double](3, 2)
target(Array(1, 1)) = 1
target(Array(1, 2)) = 1
target(Array(2, 1)) = 2
target(Array(2, 2)) = 2
target(Array(3, 1)) = 3
target(Array(3, 2)) = 3
> print(layer.forward(input, target))
0.8793184268272332
Python example:
from bigdl.nn.criterion import *
import numpy as np
criterion = ClassNLLCriterion()
layer = TimeDistributedCriterion(criterion, True)
input = np.array([1.0262627674932,
-1.2412600935171,
-1.0423174168648,
-1.0262627674932,
-1.2412600935171,
-1.0423174168648,
-0.90330565804228,
-1.3686840144413,
-1.0778380454479,
-0.90330565804228,
-1.3686840144413,
-1.0778380454479,
-0.99131220658219,
-1.0559142847536,
-1.2692712660404,
-0.99131220658219,
-1.0559142847536,
-1.2692712660404]).reshape(3,2,3)
target = np.array([[1,1],[2,2],[3,3]])
>layer.forward(input, target)
0.8793184
MarginRankingCriterion
Scala:
val mse = new MarginRankingCriterion(margin=1.0, sizeAverage=true)
Python:
mse = MarginRankingCriterion(margin=1.0, size_average=True)
Creates a criterion that measures the loss given an input x = {x1, x2}, a table of two Tensors of size 1 (they contain only scalars), and a label y (1 or -1). In batch mode, x is a table of two Tensors of size batchsize, and y is a Tensor of size batchsize containing 1 or -1 for each corresponding pair of elements in the input Tensor.
If y == 1, it is assumed that the first input should be ranked higher (have a larger value) than the second input; vice versa for y == -1.
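The per-pair loss is the standard ranking hinge, max(0, -y * (x1 - x2) + margin); a numpy sketch (an assumption based on the documented behaviour, reproducing the 0.8 output of the examples below):
import numpy as np
def margin_ranking(x1, x2, y, margin=1.0, size_average=True):
    # hinge per pair, averaged when sizeAverage = true
    losses = np.maximum(0.0, -y * (x1 - x2) + margin)
    return losses.mean() if size_average else losses.sum()
x1 = np.array([1., 2., 3., 4., 5.])
x2 = np.array([5., 4., 3., 2., 1.])
y = np.array([-1., 1., -1., 1., 1.])
print(margin_ranking(x1, x2, y))  # 0.8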
Scala example:
import com.intel.analytics.bigdl.nn.MarginRankingCriterion
import com.intel.analytics.bigdl.tensor.Tensor
import com.intel.analytics.bigdl.tensor.Storage
import com.intel.analytics.bigdl.tensor.TensorNumericMath.TensorNumeric.NumericFloat
import com.intel.analytics.bigdl.utils.T
import scala.util.Random
val input1Arr = Array(1, 2, 3, 4, 5)
val input2Arr = Array(5, 4, 3, 2, 1)
val target1Arr = Array(-1, 1, -1, 1, 1)
val input1 = Tensor(Storage(input1Arr.map(x => x.toFloat)))
val input2 = Tensor(Storage(input2Arr.map(x => x.toFloat)))
val input = T((1.toFloat, input1), (2.toFloat, input2))
val target1 = Tensor(Storage(target1Arr.map(x => x.toFloat)))
val target = T((1.toFloat, target1))
val mse = new MarginRankingCriterion()
val output = mse.forward(input, target)
val gradInput = mse.backward(input, target)
println(output)
println(gradInput)
Gives the output
output: Float = 0.8
Gives the gradInput,
gradInput: com.intel.analytics.bigdl.utils.Table =
{
2: -0.0
0.2
-0.2
0.0
0.0
[com.intel.analytics.bigdl.tensor.DenseTensor of size 5]
1: 0.0
-0.2
0.2
-0.0
-0.0
[com.intel.analytics.bigdl.tensor.DenseTensor of size 5]
}
Python example:
from bigdl.nn.layer import *
from bigdl.nn.criterion import *
from bigdl.optim.optimizer import *
from bigdl.util.common import *
import numpy as np
mse = MarginRankingCriterion()
input1 = np.array([1, 2, 3, 4, 5]).astype("float32")
input2 = np.array([5, 4, 3, 2, 1]).astype("float32")
input = [input1, input2]
target1 = np.array([-1, 1, -1, 1, 1]).astype("float32")
target = [target1, target1]
output = mse.forward(input, target)
gradInput = mse.backward(input, target)
print(output)
print(gradInput)
Gives the output,
0.8
Gives the gradInput,
[array([ 0. , -0.2, 0.2, -0. , -0. ], dtype=float32), array([-0. , 0.2, -0.2, 0. , 0. ], dtype=float32)]
ClassNLLCriterion
Scala:
val criterion = ClassNLLCriterion(weights = null, sizeAverage = true)
Python:
criterion = ClassNLLCriterion(weights=None, size_average=True)
The negative log likelihood criterion. It is useful to train a classification problem with n classes. If provided, the optional argument weights should be a 1D Tensor assigning weight to each of the classes. This is particularly useful when you have an unbalanced training set.
The input given through a forward() is expected to contain log-probabilities of each class: input has to be a 1D Tensor of size n. Obtaining log-probabilities in a neural network is easily achieved by adding a LogSoftMax layer as the last layer of your network. You may use CrossEntropyCriterion instead if you prefer not to add an extra layer to your network. This criterion expects a class index (1 to the number of classes) as the target when calling forward(input, target) and backward(input, target).
The loss can be described as:
loss(x, class) = -x[class]
or in the case of the weights argument it is specified as follows:
loss(x, class) = -weights[class] * x[class]
Due to the behaviour of the backend code, it is necessary to set sizeAverage to false when
calculating losses in non-batch mode.
Note that if the target is -1, the training process will skip this sample. In other words, the forward process will return zero output and the backward process will also return zero gradInput.
By default, the losses are averaged over observations for each minibatch. However, if the field sizeAverage is set to false, the losses are instead summed for each minibatch.
Parameters:
weights
weights of each element of the input
sizeAverage
A boolean indicating whether to normalize by the number of elements in the input. Default: true
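A numpy sketch of the unweighted formula (illustration only); with sizeAverage = true the gradient is -1/batchSize at each target index, matching the examples below:
import numpy as np
def class_nll(log_probs, target):
    n = log_probs.shape[0]
    loss = -np.mean([log_probs[i, target[i] - 1] for i in range(n)])  # 1-based classes
    grad = np.zeros_like(log_probs)
    for i in range(n):
        grad[i, target[i] - 1] = -1.0 / n
    return loss, grad
x = np.array([[1., 2., 3.], [2., 3., 4.], [3., 4., 5.]])
loss, grad = class_nll(x, np.array([1, 2, 3]))
print(loss)  # -3.0
print(grad)  # -1/3 at each target position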
Scala example:
import com.intel.analytics.bigdl.nn.ClassNLLCriterion
import com.intel.analytics.bigdl.tensor.Tensor
import com.intel.analytics.bigdl.numeric.NumericFloat
import com.intel.analytics.bigdl.utils.T
val criterion = ClassNLLCriterion()
val input = Tensor(T(
T(1f, 2f, 3f),
T(2f, 3f, 4f),
T(3f, 4f, 5f)
))
val target = Tensor(T(1f, 2f, 3f))
val loss = criterion.forward(input, target)
val grad = criterion.backward(input, target)
print(loss)
-3.0
println(grad)
-0.33333334 0.0 0.0
0.0 -0.33333334 0.0
0.0 0.0 -0.33333334
[com.intel.analytics.bigdl.tensor.DenseTensor of size 3x3]
Python example:
from bigdl.nn.layer import *
from bigdl.nn.criterion import *
import numpy as np
criterion = ClassNLLCriterion()
input = np.array([
[1.0, 2.0, 3.0],
[2.0, 3.0, 4.0],
[3.0, 4.0, 5.0]
])
target = np.array([1.0, 2.0, 3.0])
loss = criterion.forward(input, target)
gradient= criterion.backward(input, target)
print(loss)
-3.0
print(gradient)
[[-0.33333334 0. 0. ]
[ 0. -0.33333334 0. ]
[ 0. 0. -0.33333334]]
SoftmaxWithCriterion
Scala:
val model = SoftmaxWithCriterion(ignoreLabel, normalizeMode)
Python:
model = SoftmaxWithCriterion(ignoreLabel, normalizeMode)
Computes the multinomial logistic loss for a one-of-many classification task, passing real-valued predictions through a softmax to get a probability distribution over classes. It should be preferred over separate SoftmaxLayer + MultinomialLogisticLossLayer as its gradient computation is more numerically stable.
ignoreLabel
(optional) Specify a label value that should be ignored when computing the loss.
normalizeMode
How to normalize the output loss.
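A numpy sketch of the computation (softmax followed by the negative log-likelihood); it assumes 1-based labels as elsewhere in BigDL and VALID-style normalization (averaging over the non-ignored positions), so other NormModes would change only the divisor:
import numpy as np
def softmax_with_loss(x, target, ignore_label=None):
    # x: (batch, nClasses, h, w) scores; target: (batch, 1, h, w) labels
    e = np.exp(x - x.max(axis=1, keepdims=True))   # numerically stable softmax
    prob = e / e.sum(axis=1, keepdims=True)
    loss, count = 0.0, 0
    nb, _, h, w = x.shape
    for b in range(nb):
        for i in range(h):
            for j in range(w):
                label = int(target[b, 0, i, j])
                if ignore_label is not None and label == ignore_label:
                    continue
                loss -= np.log(prob[b, label - 1, i, j])
                count += 1
    return loss / count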
Scala example:
import com.intel.analytics.bigdl.tensor.TensorNumericMath.TensorNumeric.NumericFloat
import com.intel.analytics.bigdl.nn._
import com.intel.analytics.bigdl.tensor.{Storage, Tensor}
val input = Tensor(1, 5, 2, 3).rand()
val target = Tensor(Storage(Array(2.0f, 4.0f, 2.0f, 4.0f, 1.0f, 2.0f))).resize(1, 1, 2, 3)
val model = SoftmaxWithCriterion[Float]()
val output = model.forward(input, target)
scala> print(input)
(1,1,.,.) =
0.65131104 0.9332143 0.5618989
0.9965054 0.9370902 0.108070895
(1,2,.,.) =
0.46066576 0.9636703 0.8123812
0.31076035 0.16386998 0.37894428
(1,3,.,.) =
0.49111295 0.3704862 0.9938375
0.87996656 0.8695406 0.53354675
(1,4,.,.) =
0.8502225 0.9033509 0.8518651
0.0692618 0.10121379 0.970959
(1,5,.,.) =
0.9397213 0.49688303 0.75739735
0.25074655 0.11416598 0.6594504
[com.intel.analytics.bigdl.tensor.DenseTensor$mcF$sp of size 1x5x2x3]
scala> print(output)
1.6689054
Python example:
from bigdl.nn.criterion import *
import numpy as np
input = np.random.randn(1, 5, 2, 3)
target = np.array([[[[2.0, 4.0, 2.0], [4.0, 1.0, 2.0]]]])
model = SoftmaxWithCriterion()
output = model.forward(input, target)
>>> print(input)
[[[[ 0.78455689 0.01402084 0.82539628]
[-1.06448238 2.58168413 0.60053703]]
[[-0.48617618 0.44538094 0.46611658]
[-1.41509329 0.40038991 -0.63505732]]
[[ 0.91266769 1.68667933 0.92423611]
[ 0.1465411 0.84637557 0.14917515]]
[[-0.7060493 -2.02544114 0.89070726]
[ 0.14535539 0.73980064 -0.33130613]]
[[ 0.64538791 -0.44384233 -0.40112523]
[ 0.44346658 -2.22303621 0.35715986]]]]
>>> print(output)
2.1002123
SmoothL1Criterion
Scala:
val slc = SmoothL1Criterion(sizeAverage=true)
Python:
slc = SmoothL1Criterion(size_average=True)
Creates a criterion that can be thought of as a smooth version of the AbsCriterion. It uses a squared term if the absolute element-wise error falls below 1. It is less sensitive to outliers than the MSECriterion and in some cases prevents exploding gradients (e.g. see "Fast R-CNN" paper by Ross Girshick).
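A numpy sketch of the piecewise definition (illustration only): per element, 0.5*d^2 if |d| < 1 else |d| - 0.5, with d = x - y; with the example data below it reproduces the documented 0.0447365:
import numpy as np
def smooth_l1(x, y, size_average=True):
    d = np.abs(x - y)
    per_elem = np.where(d < 1.0, 0.5 * d ** 2, d - 0.5)  # squared term below 1
    return per_elem.mean() if size_average else per_elem.sum()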
Scala example:
import com.intel.analytics.bigdl.tensor.{Tensor, Storage}
import com.intel.analytics.bigdl.tensor.TensorNumericMath.TensorNumeric.NumericFloat
import com.intel.analytics.bigdl.nn.SmoothL1Criterion
val slc = SmoothL1Criterion()
val inputArr = Array(
0.17503996845335,
0.83220188552514,
0.48450597329065,
0.64701424003579,
0.62694586534053,
0.34398410236463,
0.55356747563928,
0.20383032318205
)
val targetArr = Array(
0.69956525065936,
0.86074831243604,
0.54923197557218,
0.57388074393384,
0.63334444304928,
0.99680578662083,
0.49997645849362,
0.23869121982716
)
val input = Tensor(Storage(inputArr.map(x => x.toFloat))).reshape(Array(2, 2, 2))
val target = Tensor(Storage(targetArr.map(x => x.toFloat))).reshape(Array(2, 2, 2))
val output = slc.forward(input, target)
val gradInput = slc.backward(input, target)
Gives the output,
output: Float = 0.0447365
Gives the gradInput,
gradInput: com.intel.analytics.bigdl.tensor.Tensor[Float] =
(1,.,.) =
-0.06556566 -0.003568299
-0.008090746 0.009141691
(2,.,.) =
-7.998273E-4 -0.08160271
0.0066988766 -0.0043576136
Python example:
from bigdl.nn.layer import *
from bigdl.nn.criterion import *
from bigdl.optim.optimizer import *
from bigdl.util.common import *
import numpy as np
slc = SmoothL1Criterion()
input = np.array([
0.17503996845335,
0.83220188552514,
0.48450597329065,
0.64701424003579,
0.62694586534053,
0.34398410236463,
0.55356747563928,
0.20383032318205
])
input = input.reshape(2, 2, 2)
target = np.array([
0.69956525065936,
0.86074831243604,
0.54923197557218,
0.57388074393384,
0.63334444304928,
0.99680578662083,
0.49997645849362,
0.23869121982716
])
target = target.reshape(2, 2, 2)
output = slc.forward(input, target)
gradInput = slc.backward(input, target)
print(output)
print(gradInput)
SmoothL1CriterionWithWeights
Scala:
val smcod = SmoothL1CriterionWithWeights[Float](sigma = 2.4f, num = 2)
Python:
smcod = SmoothL1CriterionWithWeights(sigma, num)
A smooth version of the AbsCriterion. It uses a squared term if the absolute element-wise error falls below 1. It is less sensitive to outliers than the MSECriterion and in some cases prevents exploding gradients (e.g. see the "Fast R-CNN" paper by Ross Girshick).
d = (x - y) * w_in
                                    ⎧ 0.5 * (sigma * d_i)^2 * w_out,    if |d_i| < 1 / sigma^2
loss(x, y, w_in, w_out) = 1/n sum_i ⎨
                                    ⎩ (|d_i| - 0.5 / sigma^2) * w_out,  otherwise
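A numpy sketch of this weighted piecewise loss (illustration only), reproducing the documented output of the examples below:
import numpy as np
def smooth_l1_with_weights(x, y, w_in, w_out, sigma=2.4, num=2):
    d = (x - y) * w_in
    thresh = 1.0 / (sigma * sigma)
    per = np.where(np.abs(d) < thresh,
                   0.5 * (sigma * d) ** 2 * w_out,
                   (np.abs(d) - 0.5 / (sigma * sigma)) * w_out)
    return per.sum() / num
x = np.array([1.1, -0.8, 0.1, 0.4, 1.3, 0.2, 0.2, 0.03])
y = np.array([0.9, 1.5, -0.08, -1.68, -0.68, -1.17, -0.92, 1.58])
w_in = np.array([-0.1, 1.7, -0.8, -1.9, 1.0, 1.4, 0.8, 0.8])
w_out = np.array([-1.9, -0.5, 1.9, -1.0, -0.2, 0.1, 0.3, 1.1])
print(smooth_l1_with_weights(x, y, w_in, w_out))  # ~-2.17488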
Scala example:
import com.intel.analytics.bigdl.nn._
import com.intel.analytics.bigdl.tensor.{Storage, Tensor}
import com.intel.analytics.bigdl.tensor.TensorNumericMath.TensorNumeric.NumericFloat
import com.intel.analytics.bigdl.utils.T
val smcod = SmoothL1CriterionWithWeights[Float](2.4f, 2)
val inputArr = Array(1.1, -0.8, 0.1, 0.4, 1.3, 0.2, 0.2, 0.03)
val targetArr = Array(0.9, 1.5, -0.08, -1.68, -0.68, -1.17, -0.92, 1.58)
val inWArr = Array(-0.1, 1.7, -0.8, -1.9, 1.0, 1.4, 0.8, 0.8)
val outWArr = Array(-1.9, -0.5, 1.9, -1.0, -0.2, 0.1, 0.3, 1.1)
val input = Tensor(Storage(inputArr.map(x => x.toFloat)))
val target = T()
target.insert(Tensor(Storage(targetArr.map(x => x.toFloat))))
target.insert(Tensor(Storage(inWArr.map(x => x.toFloat))))
target.insert(Tensor(Storage(outWArr.map(x => x.toFloat))))
val output = smcod.forward(input, target)
val gradInput = smcod.backward(input, target)
> println(output)
output: Float = -2.17488
> println(gradInput)
-0.010944003
0.425
0.63037443
-0.95
-0.1
0.07
0.120000005
-0.44000003
[com.intel.analytics.bigdl.tensor.DenseTensor of size 8]
Python example:
from bigdl.nn.criterion import *
import numpy as np
smcod = SmoothL1CriterionWithWeights(2.4, 2)
input = np.array([1.1, -0.8, 0.1, 0.4, 1.3, 0.2, 0.2, 0.03]).astype("float32")
targetArr = np.array([0.9, 1.5, -0.08, -1.68, -0.68, -1.17, -0.92, 1.58]).astype("float32")
inWArr = np.array([-0.1, 1.7, -0.8, -1.9, 1.0, 1.4, 0.8, 0.8]).astype("float32")
outWArr = np.array([-1.9, -0.5, 1.9, -1.0, -0.2, 0.1, 0.3, 1.1]).astype("float32")
target = [targetArr, inWArr, outWArr]
output = smcod.forward(input, target)
gradInput = smcod.backward(input, target)
> output
-2.17488
> gradInput
[array([-0.010944 , 0.42500001, 0.63037443, -0.94999999, -0.1 ,
0.07 , 0.12 , -0.44000003], dtype=float32)]
MultiMarginCriterion
Scala:
val loss = MultiMarginCriterion(p=1,weights=null,margin=1.0,sizeAverage=true)
Python:
loss = MultiMarginCriterion(p=1,weights=None,margin=1.0,size_average=True)
MultiMarginCriterion is a loss function that optimizes a multi-class classification hinge loss (margin-based loss) between input x and output y (y is the target class index).
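The per-sample loss is the standard multi-class hinge, summing max(0, margin - x[y] + x[i])^p over the wrong classes i and dividing by the class count; a numpy sketch (an assumption based on the documented behaviour, reproducing the 0.4811434 output of the Scala example below):
import numpy as np
def multi_margin(x, y, p=1, margin=1.0):
    losses = []
    for row, cls in zip(x, y):        # cls is a 1-based class index
        c = int(cls) - 1
        hinge = np.maximum(0.0, margin - row[c] + row)
        hinge[c] = 0.0                # no penalty for the target class itself
        losses.append((hinge ** p).sum() / row.size)
    return np.mean(losses)
x = np.array([[-0.45896783, -0.80141246],
              [0.22560088, -0.13517438],
              [0.2601126, 0.35492152]])
print(multi_margin(x, np.array([2, 1, 2])))  # ~0.4811434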
Scala example:
scala>
import com.intel.analytics.bigdl.tensor.TensorNumericMath.TensorNumeric.NumericFloat
import com.intel.analytics.bigdl.nn._
import com.intel.analytics.bigdl.tensor._
import com.intel.analytics.bigdl.tensor.Storage
val input = Tensor(3,2).randn()
val target = Tensor(Storage(Array(2.0f, 1.0f, 2.0f)))
val loss = MultiMarginCriterion(1)
val output = loss.forward(input,target)
val grad = loss.backward(input,target)
scala> print(input)
-0.45896783 -0.80141246
0.22560088 -0.13517438
0.2601126 0.35492152
[com.intel.analytics.bigdl.tensor.DenseTensor$mcF$sp of size 3x2]
scala> print(target)
2.0
1.0
2.0
[com.intel.analytics.bigdl.tensor.DenseTensor$mcF$sp of size 3]
scala> print(output)
0.4811434
scala> print(grad)
0.16666667 -0.16666667
-0.16666667 0.16666667
0.16666667 -0.16666667
[com.intel.analytics.bigdl.tensor.DenseTensor of size 3x2]
Python example:
from bigdl.nn.criterion import *
import numpy as np
input = np.random.randn(3,2)
target = np.array([2,1,2])
print "input=",input
print "target=",target
loss = MultiMarginCriterion(1)
out = loss.forward(input, target)
print "output of loss is : ",out
grad_out = loss.backward(input,target)
print "grad out of loss is : ",grad_out
Gives the output,
input= [[ 0.46868305 -2.28562261]
[ 0.8076243 -0.67809689]
[-0.20342555 -0.66264743]]
target= [2 1 2]
creating: createMultiMarginCriterion
output of loss is : 0.8689213
grad out of loss is : [[ 0.16666667 -0.16666667]
[ 0. 0. ]
[ 0.16666667 -0.16666667]]
HingeEmbeddingCriterion
Scala:
val m = HingeEmbeddingCriterion(margin = 1, sizeAverage = true)
Python:
m = HingeEmbeddingCriterion(margin=1, size_average=True)
Creates a criterion that measures the loss given an input x which is a 1-dimensional vector and a label y (1 or -1).
This is usually used for measuring whether two inputs are similar or dissimilar, e.g. using the L1 pairwise distance, and is typically used for learning nonlinear embeddings or semi-supervised learning.
                 ⎧ x_i,                  if y_i ==  1
loss(x, y) = 1/n ⎨
                 ⎩ max(0, margin - x_i), if y_i == -1
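A numpy sketch of the formula above (with sizeAverage = false the sum is kept instead of the mean), reproducing the outputs of the examples below:
import numpy as np
def hinge_embedding(x, y, margin=1.0, size_average=True):
    per = np.where(y == 1, x, np.maximum(0.0, margin - x))
    return per.mean() if size_average else per.sum()
x = np.array([0.1, 2.0, 2.0, 2.0])
print(hinge_embedding(x, np.ones(4), size_average=False))   # 6.1
print(hinge_embedding(x, -np.ones(4), size_average=False))  # 0.9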
Scala example:
import com.intel.analytics.bigdl.nn._
import com.intel.analytics.bigdl.tensor.Tensor
import com.intel.analytics.bigdl.tensor.TensorNumericMath.TensorNumeric.NumericFloat
import com.intel.analytics.bigdl.utils.T
val loss = HingeEmbeddingCriterion(1, sizeAverage = false)
val input = Tensor(T(0.1f, 2.0f, 2.0f, 2.0f))
println("input: \n" + input)
println("ouput: ")
println("Target=1: " + loss.forward(input, Tensor(4, 1).fill(1f)))
println("Target=-1: " + loss.forward(input, Tensor(4, 1).fill(-1f)))
input:
0.1
2.0
2.0
2.0
[com.intel.analytics.bigdl.tensor.DenseTensor of size 4]
output:
Target=1: 6.1
Target=-1: 0.9
Python example:
import numpy as np
from bigdl.nn.criterion import *
input = np.array([0.1, 2.0, 2.0, 2.0])
target = np.full(4, 1)
print("input: " )
print(input)
print("target: ")
print(target)
print("output: ")
print(HingeEmbeddingCriterion(1.0, size_average= False).forward(input, target))
print(HingeEmbeddingCriterion(1.0, size_average= False).forward(input, np.full(4, -1)))
input:
[ 0.1 2. 2. 2. ]
target:
[1 1 1 1]
output:
creating: createHingeEmbeddingCriterion
6.1
creating: createHingeEmbeddingCriterion
0.9
MarginCriterion
Scala:
val criterion = MarginCriterion(margin=1.0, sizeAverage=true)
Python:
criterion = MarginCriterion(margin=1.0, size_average=True, bigdl_type="float")
Creates a criterion that optimizes a two-class classification hinge loss (margin-based loss) between input x (a Tensor of dimension 1) and output y.
* margin
if unspecified, defaults to 1.
* sizeAverage
whether to average the loss; defaults to true.
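The element-wise loss is max(0, margin - x[i] * y[i]); a numpy sketch (an assumption based on the documented behaviour) that reproduces the 0.84946966 result of the Scala example below:
import numpy as np
def margin_criterion(x, y, margin=1.0, size_average=True):
    per = np.maximum(0.0, margin - x * y)
    return per.mean() if size_average else per.sum()
x = np.array([[0.33753583, 0.3575501], [0.23477706, 0.7240361], [0.92835575, 0.4737949]])
y = np.array([[0.27280563, 0.7022703], [0.3348442, 0.43332106], [0.08935371, 0.17876455]])
print(margin_criterion(x, y))  # ~0.8494697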
Scala example:
val criterion = MarginCriterion(margin=1.0, sizeAverage=true)
val input = Tensor(3, 2).rand()
input: com.intel.analytics.bigdl.tensor.Tensor[Float] =
0.33753583 0.3575501
0.23477706 0.7240361
0.92835575 0.4737949
[com.intel.analytics.bigdl.tensor.DenseTensor$mcF$sp of size 3x2]
val target = Tensor(3, 2).rand()
target: com.intel.analytics.bigdl.tensor.Tensor[Float] =
0.27280563 0.7022703
0.3348442 0.43332106
0.08935371 0.17876455
[com.intel.analytics.bigdl.tensor.DenseTensor$mcF$sp of size 3x2]
criterion.forward(input, target)
res5: Float = 0.84946966
Python example:
from bigdl.nn.criterion import *
import numpy as np
criterion = MarginCriterion(margin=1.0, size_average=True, bigdl_type="float")
input = np.random.rand(3, 2)
array([[ 0.20824672, 0.67299837],
[ 0.80561452, 0.19564743],
[ 0.42501441, 0.19408184]])
target = np.random.rand(3, 2)
array([[ 0.67882632, 0.61257846],
[ 0.10111138, 0.75225082],
[ 0.60404296, 0.31373273]])
criterion.forward(input, target)
0.8166871
CosineEmbeddingCriterion
Scala:
val cosineEmbeddingCriterion = CosineEmbeddingCriterion(margin = 0.0, sizeAverage = true)
Python:
cosineEmbeddingCriterion = CosineEmbeddingCriterion(margin=0.0, size_average=True)
CosineEmbeddingCriterion creates a criterion that measures the loss given an input x = {x1, x2}, a table of two Tensors, and a Tensor label y with values 1 or -1.
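The underlying formula is the standard cosine embedding loss, 1 - cos(x1, x2) for y == 1 and max(0, cos(x1, x2) - margin) otherwise; a numpy sketch (an assumption based on the documented behaviour, reproducing the 0.6363636 output of the Python example below):
import numpy as np
def cosine_embedding(x1, x2, y, margin=0.0):
    cos = np.dot(x1, x2) / (np.linalg.norm(x1) * np.linalg.norm(x2))
    return 1.0 - cos if y == 1 else max(0.0, cos - margin)
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([5.0, 4.0, 3.0, 2.0, 1.0])
print(cosine_embedding(x1, x2, y=-1))  # ~0.6363636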
Scala example:
import com.intel.analytics.bigdl.tensor.TensorNumericMath.TensorNumeric.NumericFloat
import com.intel.analytics.bigdl.nn._
import com.intel.analytics.bigdl.tensor._
import com.intel.analytics.bigdl.utils.T
val cosineEmbeddingCriterion = CosineEmbeddingCriterion(0.0, false)
val input1 = Tensor(5).rand()
val input2 = Tensor(5).rand()
val input = T()
input(1.0) = input1
input(2.0) = input2
val target1 = Tensor(Storage(Array(-0.5f)))
val target = T()
target(1.0) = target1
> print(input)
{
2.0: 0.4110882
0.57726574
0.1949834
0.67670715
0.16984987
[com.intel.analytics.bigdl.tensor.DenseTensor$mcF$sp of size 5]
1.0: 0.16878392
0.24124223
0.8964794
0.11156334
0.5101486
[com.intel.analytics.bigdl.tensor.DenseTensor$mcF$sp of size 5]
}
> print(cosineEmbeddingCriterion.forward(input, target))
0.49919847
> print(cosineEmbeddingCriterion.backward(input, target))
{
2: -0.045381278
-0.059856333
0.72547954
-0.2268434
0.3842142
[com.intel.analytics.bigdl.tensor.DenseTensor of size 5]
1: 0.30369008
0.42463788
-0.20637506
0.5712836
-0.06355385
[com.intel.analytics.bigdl.tensor.DenseTensor of size 5]
}
Python example:
from bigdl.nn.criterion import *
import numpy as np
cosineEmbeddingCriterion = CosineEmbeddingCriterion(0.0, False)
> cosineEmbeddingCriterion.forward([np.array([1.0, 2.0, 3.0, 4.0 ,5.0]),np.array([5.0, 4.0, 3.0, 2.0, 1.0])],[np.array(-0.5)])
0.6363636
> cosineEmbeddingCriterion.backward([np.array([1.0, 2.0, 3.0, 4.0 ,5.0]),np.array([5.0, 4.0, 3.0, 2.0, 1.0])],[np.array(-0.5)])
[array([ 0.07933884, 0.04958678, 0.01983471, -0.00991735, -0.03966942], dtype=float32), array([-0.03966942, -0.00991735, 0.01983471, 0.04958678, 0.07933884], dtype=float32)]
BCECriterion
Scala:
val criterion = BCECriterion[Float]()
Python:
criterion = BCECriterion()
This loss function measures the Binary Cross Entropy between the target and the output:
loss(o, t) = - 1/n sum_i (t[i] * log(o[i]) + (1 - t[i]) * log(1 - o[i]))
or in the case of the weights argument being specified:
loss(o, t) = - 1/n sum_i weights[i] * (t[i] * log(o[i]) + (1 - t[i]) * log(1 - o[i]))
By default, the losses are averaged for each mini-batch over observations as well as over dimensions. However, if the field sizeAverage is set to false, the losses are instead summed.
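A numpy sketch of the unweighted formula (illustration only):
import numpy as np
def bce(o, t, size_average=True):
    per = -(t * np.log(o) + (1.0 - t) * np.log(1.0 - o))
    return per.mean() if size_average else per.sum()
o = np.array([0.2, 0.6, 0.9])   # predicted probabilities
t = np.array([1.0, 0.0, 1.0])   # binary targets
print(bce(o, t))  # (-log 0.2 - log 0.4 - log 0.9) / 3 ≈ 0.8770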
Scala example:
import com.intel.analytics.bigdl.nn.BCECriterion
import com.intel.analytics.bigdl.tensor.Tensor
val criterion = BCECriterion[Float]()
val input = Tensor[Float](3, 1).rand
val target = Tensor[Float](3)
target(1) = 1
target(2) = 0
target(3) = 1
val output = criterion.forward(input, target)
val gradInput = criterion.backward(input, target)
> println(target)
res25: com.intel.analytics.bigdl.tensor.Tensor[Float] =
1.0
0.0
1.0
[com.intel.analytics.bigdl.tensor.DenseTensor of size 3]
> println(output)
output: Float = 0.9009579
> println(gradInput)
gradInput: com.intel.analytics.bigdl.tensor.Tensor[Float] =
-1.5277504
1.0736246
-0.336957
[com.intel.analytics.bigdl.tensor.DenseTensor of size 3x1]
Python example:
from bigdl.nn.criterion import *
import numpy as np
criterion = BCECriterion()
input = np.random.uniform(0, 1, (3, 1)).astype("float32")
target = np.array([1, 0, 1])
output = criterion.forward(input, target)
gradInput = criterion.backward(input, target)
> output
1.9218739
> gradInput
[array([[-4.3074522 ],
[ 2.24244714],
[-1.22368968]], dtype=float32)]
DiceCoefficientCriterion
Scala:
val loss = DiceCoefficientCriterion(sizeAverage=true, epsilon=1.0f)
Python:
loss = DiceCoefficientCriterion(size_average=True,epsilon=1.0)
DiceCoefficientCriterion is the Dice-Coefficient objective function.
Both forward and backward accept two tensors: input and target. The forward result is formulated as:
1 - (2 * (input intersection target) / (input union target))
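A numpy sketch of the loss; the epsilon placement here is inferred from the Scala example below, whose 0.9958517 output it reproduces:
import numpy as np
def dice_loss(x, y, epsilon=1.0):
    # 1 - (2 * intersection + eps) / (sum of elements + eps)
    return 1.0 - (2.0 * np.sum(x * y) + epsilon) / (np.sum(x) + np.sum(y) + epsilon)
x = np.array([-0.50278, 0.51387966])
y = np.array([2.0, 1.0])
print(dice_loss(x, y))  # ~0.9958517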
Scala example:
scala>
import com.intel.analytics.bigdl.tensor.TensorNumericMath.TensorNumeric.NumericFloat
import com.intel.analytics.bigdl.nn._
import com.intel.analytics.bigdl.tensor._
import com.intel.analytics.bigdl.tensor.Storage
val input = Tensor(2).randn()
val target = Tensor(Storage(Array(2.0f, 1.0f)))
val loss = DiceCoefficientCriterion(epsilon = 1.0f)
val output = loss.forward(input,target)
val grad = loss.backward(input,target)
scala> print(input)
-0.50278
0.51387966
[com.intel.analytics.bigdl.tensor.DenseTensor$mcF$sp of size 2]
scala> print(target)
2.0
1.0
[com.intel.analytics.bigdl.tensor.DenseTensor$mcF$sp of size 2]
scala> print(output)
0.9958517
scala> print(grad)
-0.99619853 -0.49758217
[com.intel.analytics.bigdl.tensor.DenseTensor of size 1x2]
Python example:
from bigdl.nn.criterion import *
import numpy as np
input = np.random.randn(2)
target = np.array([2,1],dtype='float64')
print "input=", input
print "target=", target
loss = DiceCoefficientCriterion(size_average=True,epsilon=1.0)
out = loss.forward(input,target)
print "output of loss is :",out
grad_out = loss.backward(input,target)
print "grad out of loss is :",grad_out
produces output:
input= [ 0.4440505 2.9430301]
target= [ 2. 1.]
creating: createDiceCoefficientCriterion
output of loss is : -0.17262316
grad out of loss is : [[-0.38274616 -0.11200322]]
MSECriterion
Scala:
val criterion = MSECriterion()
Python:
criterion = MSECriterion()
The mean squared error criterion. E.g. with input a, target b, and n total elements:
loss(a, b) = 1/n * sum(|a_i - b_i|^2)
Parameters:
sizeAverage
a boolean indicating whether to divide the sum of squared error by n. Default: true
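A numpy sketch of the formula; with sizeAverage = true the gradient is 2 * (a - b) / n, which matches the examples below:
import numpy as np
def mse(a, b, size_average=True):
    per = (a - b) ** 2
    loss = per.mean() if size_average else per.sum()
    grad = 2.0 * (a - b) / (a.size if size_average else 1)
    return loss, grad
a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[2.0, 3.0], [4.0, 5.0]])
loss, grad = mse(a, b)
print(loss)  # 1.0
print(grad)  # all -0.5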
Scala example:
import com.intel.analytics.bigdl.nn._
import com.intel.analytics.bigdl.utils.T
import com.intel.analytics.bigdl.tensor.Tensor
import com.intel.analytics.bigdl.tensor.TensorNumericMath.TensorNumeric.NumericFloat
val criterion = MSECriterion()
val input = Tensor(T(
T(1.0f, 2.0f),
T(3.0f, 4.0f))
)
val target = Tensor(T(
T(2.0f, 3.0f),
T(4.0f, 5.0f))
)
val output = criterion.forward(input, target)
val gradient = criterion.backward(input, target)
-> print(output)
1.0
-> print(gradient)
-0.5 -0.5
-0.5 -0.5
[com.intel.analytics.bigdl.tensor.DenseTensor of size 2x2]
Python example:
from bigdl.nn.layer import *
from bigdl.nn.criterion import *
import numpy as np
criterion = MSECriterion()
input = np.array([
[1.0, 2.0],
[3.0, 4.0]
])
target = np.array([
[2.0, 3.0],
[4.0, 5.0]
])
output = criterion.forward(input, target)
gradient= criterion.backward(input, target)
-> print(output)
1.0
-> print(gradient)
[[-0.5 -0.5]
[-0.5 -0.5]]
SoftMarginCriterion
Scala:
val criterion = SoftMarginCriterion(sizeAverage)
Python:
criterion = SoftMarginCriterion(size_average)
Creates a criterion that optimizes a two-class classification logistic loss between input x (a Tensor of dimension 1) and output y (which is a tensor containing either 1s or -1s).
loss(x, y) = sum_i (log(1 + exp(-y[i]*x[i]))) / x.nElement()
Parameters:
* sizeAverage
A boolean indicating whether to normalize by the number of elements in the input.
Default: true
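A numpy sketch of the logistic loss above (illustration only), reproducing the 1.3767318 output of the examples below:
import numpy as np
def soft_margin(x, y, size_average=True):
    per = np.log1p(np.exp(-y * x))
    return per.mean() if size_average else per.sum()
x = np.array([[1.0, 2.0], [3.0, 4.0]])
y = np.array([[1.0, -1.0], [-1.0, 1.0]])
print(soft_margin(x, y))  # ~1.3767318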
Scala example:
import com.intel.analytics.bigdl.nn._
import com.intel.analytics.bigdl.utils.T
import com.intel.analytics.bigdl.tensor.Tensor
import com.intel.analytics.bigdl.tensor.TensorNumericMath.TensorNumeric.NumericFloat
val criterion = SoftMarginCriterion()
val input = Tensor(T(
T(1.0f, 2.0f),
T(3.0f, 4.0f))
)
val target = Tensor(T(
T(1.0f, -1.0f),
T(-1.0f, 1.0f))
)
val output = criterion.forward(input, target)
val gradient = criterion.backward(input, target)
-> print(output)
1.3767318
-> print(gradient)
-0.06723536 0.22019927
0.23814353 -0.0044965525
[com.intel.analytics.bigdl.tensor.DenseTensor of size 2x2]
Python example:
from bigdl.nn.layer import *
from bigdl.nn.criterion import *
import numpy as np
criterion = SoftMarginCriterion()
input = np.array([
[1.0, 2.0],
[3.0, 4.0]
])
target = np.array([
[1.0, -1.0],
[-1.0, 1.0]
])
output = criterion.forward(input, target)
gradient = criterion.backward(input, target)
-> print(output)
1.3767318
-> print(gradient)
[[-0.06723536 0.22019927]
[ 0.23814353 -0.00449655]]
DistKLDivCriterion
Scala:
val loss = DistKLDivCriterion[T](sizeAverage=true)
Python:
loss = DistKLDivCriterion(size_average=True)
DistKLDivCriterion is the Kullback–Leibler divergence loss.
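A numpy sketch of the loss; it assumes, as in Torch's DistKLDivCriterion, that the input holds log-probabilities and the target holds (strictly positive) probabilities, and it reproduces the Scala example below:
import numpy as np
def dist_kl_div(x, t, size_average=True):
    per = t * (np.log(t) - x)            # t * (log t - x), assumes t > 0
    n = x.size if size_average else 1
    return per.sum() / n, -t / n         # loss and gradInput
x = np.array([-0.3854126, -0.7707398])
t = np.array([2.0, 1.0])
loss, grad = dist_kl_div(x, t)
print(loss)  # ~1.4639297
print(grad)  # [-1.0, -0.5]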
Scala example:
scala>
import com.intel.analytics.bigdl.tensor.TensorNumericMath.TensorNumeric.NumericFloat
import com.intel.analytics.bigdl.nn._
import com.intel.analytics.bigdl.tensor._
import com.intel.analytics.bigdl.tensor.Storage
val input = Tensor(2).randn()
val target = Tensor(Storage(Array(2.0f, 1.0f)))
val loss = DistKLDivCriterion()
val output = loss.forward(input,target)
val grad = loss.backward(input,target)
scala> print(input)
-0.3854126
-0.7707398
[com.intel.analytics.bigdl.tensor.DenseTensor$mcF$sp of size 2]
scala> print(target)
2.0
1.0
[com.intel.analytics.bigdl.tensor.DenseTensor$mcF$sp of size 2]
scala> print(output)
1.4639297
scala> print(grad)
-1.0
-0.5
[com.intel.analytics.bigdl.tensor.DenseTensor of size 2]
Python example:
from bigdl.nn.criterion import *
import numpy as np
input = np.random.randn(2)
target = np.array([2,1])
print "input=", input
print "target=", target
loss = DistKLDivCriterion()
out = loss.forward(input,target)
print "output of loss is :",out
grad_out = loss.backward(input,target)
print "grad out of loss is :",grad_out
Gives the output
input= [-1.14333924 0.97662296]
target= [2 1]
creating: createDistKLDivCriterion
output of loss is : 1.348175
grad out of loss is : [-1. -0.5]
ClassSimplexCriterion
Scala:
val criterion = ClassSimplexCriterion(nClasses)
Python:
criterion = ClassSimplexCriterion(nClasses)
ClassSimplexCriterion implements a criterion for classification. It learns an embedding per class, where each class' embedding is a point on an (N-1)-dimensional simplex, where N is the number of classes.
Parameters:
* nClasses
An integer, the number of classes.
Scala example:
import com.intel.analytics.bigdl.nn._
import com.intel.analytics.bigdl.utils.T
import com.intel.analytics.bigdl.tensor.Tensor
import com.intel.analytics.bigdl.tensor.TensorNumericMath.TensorNumeric.NumericFloat
val criterion = ClassSimplexCriterion(5)
val input = Tensor(T(
T(1.0f, 2.0f, 3.0f, 4.0f, 5.0f),
T(4.0f, 5.0f, 6.0f, 7.0f, 8.0f)
))
val target = Tensor(2)
target(1) = 2.0f
target(2) = 1.0f
val output = criterion.forward(input, target)
val gradient = criterion.backward(input, target)
-> print(output)
23.562702
-> print(gradient)
0.25 0.20635083 0.6 0.8 1.0
0.6 1.0 1.2 1.4 1.6
[com.intel.analytics.bigdl.tensor.DenseTensor of size 2x5]
Python example:
from bigdl.nn.layer import *
from bigdl.nn.criterion import *
import numpy as np
criterion = ClassSimplexCriterion(5)
input = np.array([
[1.0, 2.0, 3.0, 4.0, 5.0],
[4.0, 5.0, 6.0, 7.0, 8.0]
])
target = np.array([2.0, 1.0])
output = criterion.forward(input, target)
gradient = criterion.backward(input, target)
-> print(output)
23.562702
-> print(gradient)
[[ 0.25 0.20635083 0.60000002 0.80000001 1. ]
[ 0.60000002 1. 1.20000005 1.39999998 1.60000002]]
L1HingeEmbeddingCriterion
Scala:
val model = L1HingeEmbeddingCriterion(margin)
Python:
model = L1HingeEmbeddingCriterion(margin)
Creates a criterion that measures the loss given an input x = {x1, x2}, a table of two Tensors, and a label y (1 or -1).
This is used for measuring whether two inputs are similar or dissimilar, using the L1 distance, and is typically used for learning nonlinear embeddings or semi-supervised learning.
             ⎧ ||x1 - x2||_1,                  if y ==  1
loss(x, y) = ⎨
             ⎩ max(0, margin - ||x1 - x2||_1), if y == -1
The margin has a default value of 1, or can be set in the constructor.
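A numpy sketch of the formula above (illustration only), reproducing the 1.1 output of the examples below:
import numpy as np
def l1_hinge_embedding(x1, x2, y, margin=1.0):
    d = np.abs(x1 - x2).sum()            # L1 distance between the two inputs
    return d if y == 1 else max(0.0, margin - d)
x1 = np.array([1.0, -0.1])
x2 = np.array([2.0, -0.2])
print(l1_hinge_embedding(x1, x2, y=1, margin=0.6))  # 1.1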
Scala example:
import com.intel.analytics.bigdl.nn._
import com.intel.analytics.bigdl.tensor.Tensor
import com.intel.analytics.bigdl.utils.T
import com.intel.analytics.bigdl.tensor.TensorNumericMath.TensorNumeric.NumericFloat
val model = L1HingeEmbeddingCriterion(0.6)
val input1 = Tensor(T(1.0f, -0.1f))
val input2 = Tensor(T(2.0f, -0.2f))
val input = T(input1, input2)
val target = Tensor(1)
target(Array(1)) = 1.0f
val output = model.forward(input, target)
scala> print(output)
1.1
Python example:
from bigdl.nn.criterion import *
import numpy as np
model = L1HingeEmbeddingCriterion(0.6)
input1 = np.array([1.0, -0.1])
input2 = np.array([2.0, -0.2])
input = [input1, input2]
target = np.array([1.0])
output = model.forward(input, target)
>>> print(output)
1.1
CrossEntropyCriterion
Scala:
val module = CrossEntropyCriterion(weights, sizeAverage)
Python:
module = CrossEntropyCriterion(weights, size_average)
This criterion combines LogSoftMax and ClassNLLCriterion in one single class.
weights
A tensor assigning weight to each of the classes
sizeAverage
whether to average the loss over the mini-batch. Default is true.
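A numpy sketch of the fused computation, log-softmax followed by negative log-likelihood (illustration only); it reproduces the 0.9483 output of the examples below:
import numpy as np
def cross_entropy(x, target):
    # x: (batch, nClasses) raw scores; target: 1-based class indices
    log_probs = x - np.log(np.exp(x).sum(axis=1, keepdims=True))
    return -np.mean([log_probs[i, t - 1] for i, t in enumerate(target)])
x = np.array([1.0262627674932, -1.2412600935171, -1.0423174168648,
              -0.90330565804228, -1.3686840144413, -1.0778380454479,
              -0.99131220658219, -1.0559142847536, -1.2692712660404]).reshape(3, 3)
print(cross_entropy(x, [1, 2, 3]))  # ~0.9483051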
Scala example:
import com.intel.analytics.bigdl.nn._
import com.intel.analytics.bigdl.tensor.Tensor
import com.intel.analytics.bigdl.tensor.Storage
val layer = CrossEntropyCriterion[Double]()
val input = Tensor[Double](Storage(Array(
1.0262627674932,
-1.2412600935171,
-1.0423174168648,
-0.90330565804228,
-1.3686840144413,
-1.0778380454479,
-0.99131220658219,
-1.0559142847536,
-1.2692712660404
))).resize(3, 3)
val target = Tensor[Double](3)
target(Array(1)) = 1
target(Array(2)) = 2
target(Array(3)) = 3
> print(layer.forward(input, target))
0.9483051199107635
Python example:
from bigdl.nn.criterion import *
import numpy as np
layer = CrossEntropyCriterion()
input = np.array([1.0262627674932,
-1.2412600935171,
-1.0423174168648,
-0.90330565804228,
-1.3686840144413,
-1.0778380454479,
-0.99131220658219,
-1.0559142847536,
-1.2692712660404
]).reshape(3,3)
target = np.array([1, 2, 3])
>layer.forward(input, target)
0.94830513
ParallelCriterion
Scala:
val pc = ParallelCriterion(repeatTarget=false)
Python:
pc = ParallelCriterion(repeat_target=False)
ParallelCriterion is a weighted sum of other criterions, each applied to a different input and target. Set repeatTarget = true to share the target among the criterions. Use the add(criterion[, weight]) method to add a criterion, where weight is a scalar (default 1).
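A numpy sketch of the weighted-sum semantics (illustration only), reproducing the 100.75 output of the examples below as 0.5 * ClassNLL(input1, target1) + 1.0 * MSE(input2, target2):
import numpy as np
input1 = np.arange(1, 21, dtype="float32").reshape(2, 10)      # treated as log-probs
input2 = np.arange(19, -1, -1, dtype="float32").reshape(2, 10)
nll = -np.mean([input1[i, t - 1] for i, t in enumerate([1, 8])])
mse = ((input2 - 1.0) ** 2).mean()
print(0.5 * nll + 1.0 * mse)  # 100.75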
Scala example:
import com.intel.analytics.bigdl.utils.T
import com.intel.analytics.bigdl.tensor.{Tensor, Storage}
import com.intel.analytics.bigdl.tensor.TensorNumericMath.TensorNumeric.NumericFloat
import com.intel.analytics.bigdl.nn.{ParallelCriterion, ClassNLLCriterion, MSECriterion}
val pc = ParallelCriterion()
val input = T(Tensor(2, 10), Tensor(2, 10))
var i = 0
input[Tensor](1).apply1(_ => {i += 1; i})
input[Tensor](2).apply1(_ => {i -= 1; i})
val target = T(Tensor(Storage(Array(1.0f, 8.0f))), Tensor(2, 10).fill(1.0f))
val nll = ClassNLLCriterion()
val mse = MSECriterion()
pc.add(nll, 0.5).add(mse)
val output = pc.forward(input, target)
val gradInput = pc.backward(input, target)
println(output)
println(gradInput)
Gives the output,
100.75
Gives the gradInput,
{
2: 1.8000001 1.7 1.6 1.5 1.4 1.3000001 1.2 1.1 1.0 0.90000004
0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0 -0.1
[com.intel.analytics.bigdl.tensor.DenseTensor of size 2x10]
1: -0.25 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 -0.25 0.0 0.0
[com.intel.analytics.bigdl.tensor.DenseTensor of size 2x10]
}
Python example:
from bigdl.nn.layer import *
from bigdl.nn.criterion import *
from bigdl.optim.optimizer import *
from bigdl.util.common import *
import numpy as np
pc = ParallelCriterion()
input1 = np.arange(1, 21, 1).astype("float32")
input2 = np.arange(0, 20, 1).astype("float32")[::-1]
input1 = input1.reshape(2, 10)
input2 = input2.reshape(2, 10)
input = [input1, input2]
target1 = np.array([1.0, 8.0]).astype("float32")
target1 = target1.reshape(2)
target2 = np.full([2, 10], 1).astype("float32")
target2 = target2.reshape(2, 10)
target = [target1, target2]
nll = ClassNLLCriterion()
mse = MSECriterion()
pc.add(nll, weight = 0.5).add(mse)
print "input = \n %s " % input
print "target = \n %s" % target
output = pc.forward(input, target)
gradInput = pc.backward(input, target)
print "output = %s " % output
print "gradInput = %s " % gradInput
Gives the output,
input =
[array([[ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.],
[ 11., 12., 13., 14., 15., 16., 17., 18., 19., 20.]], dtype=float32), array([[ 19., 18., 17., 16., 15., 14., 13., 12., 11., 10.],
[ 9., 8., 7., 6., 5., 4., 3., 2., 1., 0.]], dtype=float32)]
target =
[array([ 1., 8.], dtype=float32), array([[ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
[ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]], dtype=float32)]
output = 100.75
gradInput = [array([[-0.25, 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ],
[ 0. , 0. , 0. , 0. , 0. , 0. , 0. , -0.25, 0. , 0. ]], dtype=float32), array([[ 1.80000007, 1.70000005, 1.60000002, 1.5 , 1.39999998,
1.30000007, 1.20000005, 1.10000002, 1. , 0.90000004],
[ 0.80000001, 0.69999999, 0.60000002, 0.5 , 0.40000001,
0.30000001, 0.2 , 0.1 , 0. , -0.1 ]], dtype=float32)]
MultiLabelMarginCriterion
Scala:
val multiLabelMarginCriterion = MultiLabelMarginCriterion(sizeAverage = true)
Python:
multiLabelMarginCriterion = MultiLabelMarginCriterion(size_average=True)
MultiLabelMarginCriterion creates a criterion that optimizes a multi-class multi-classification hinge loss (margin-based loss) between input x and output y.
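The target is expected to hold 1-based class indices, zero-terminated, and every target class is pulled above every non-target class; a numpy sketch (an assumption following Torch's MultiLabelMarginCriterion, reproducing the 0.975 output of the Python example below):
import numpy as np
def multi_label_margin(x, target):
    tgt = []
    for t in target:                  # collect 1-based labels up to the 0 terminator
        if t == 0:
            break
        tgt.append(int(t) - 1)
    rest = [i for i in range(x.size) if i not in tgt]
    loss = sum(max(0.0, 1.0 - (x[j] - x[i])) for j in tgt for i in rest)
    return loss / x.size
x = np.array([0.3, 0.4, 0.2, 0.6])
print(multi_label_margin(x, np.array([3, 2, 1, 0])))  # 0.975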
Scala example:
import com.intel.analytics.bigdl.tensor.TensorNumericMath.TensorNumeric.NumericFloat
import com.intel.analytics.bigdl.nn._
import com.intel.analytics.bigdl.tensor._
val multiLabelMarginCriterion = MultiLabelMarginCriterion(false)
val input = Tensor(4).rand()
val target = Tensor(4)
target(Array(1)) = 3
target(Array(2)) = 2
target(Array(3)) = 1
target(Array(4)) = 0
> print(input)
0.40267515
0.5913795
0.84936756
0.05999674
> print(multiLabelMarginCriterion.forward(input, target))
0.33414197
> print(multiLabelMarginCriterion.backward(input, target))
-0.25
-0.25
-0.25
0.75
[com.intel.analytics.bigdl.tensor.DenseTensor of size 4]
Python example:
from bigdl.nn.criterion import *
import numpy as np
multiLabelMarginCriterion = MultiLabelMarginCriterion(False)
> multiLabelMarginCriterion.forward(np.array([0.3, 0.4, 0.2, 0.6]), np.array([3, 2, 1, 0]))
0.975
> multiLabelMarginCriterion.backward(np.array([0.3, 0.4, 0.2, 0.6]), np.array([3, 2, 1, 0]))
[array([-0.25, -0.25, -0.25, 0.75], dtype=float32)]
MultiLabelSoftMarginCriterion
Scala:
val criterion = MultiLabelSoftMarginCriterion(weights = null, sizeAverage = true)
Python:
criterion = MultiLabelSoftMarginCriterion(weights=None, size_average=True)
MultiLabelSoftMarginCriterion is a multiLabel multiclass criterion based on sigmoid:
l(x, y) = - 1/n sum_i (y[i] * log(p[i]) + (1 - y[i]) * log(1 - p[i]))
where p[i] = exp(x[i]) / (1 + exp(x[i]))
If with weights,
l(x, y) = - 1/n sum_i weights[i] * (y[i] * log(p[i]) + (1 - y[i]) * log(1 - p[i]))
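A numpy sketch of the sigmoid-based formula (illustration only), reproducing the 0.6081934 output of the examples below:
import numpy as np
def multi_label_soft_margin(x, y):
    p = 1.0 / (1.0 + np.exp(-x))               # p[i] = sigmoid(x[i])
    return -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
x = np.array([0.4, 0.5, 0.6])
y = np.array([0.0, 1.0, 1.0])
print(multi_label_soft_margin(x, y))  # ~0.6081934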
Scala example:
import com.intel.analytics.bigdl.tensor.Tensor
import com.intel.analytics.bigdl.nn._
import com.intel.analytics.bigdl.tensor.TensorNumericMath.TensorNumeric.NumericFloat
val criterion = MultiLabelSoftMarginCriterion()
val input = Tensor(3)
input(Array(1)) = 0.4f
input(Array(2)) = 0.5f
input(Array(3)) = 0.6f
val target = Tensor(3)
target(Array(1)) = 0
target(Array(2)) = 1
target(Array(3)) = 1
> criterion.forward(input, target)
res0: Float = 0.6081934
Python example:
from bigdl.nn.criterion import *
import numpy as np
criterion = MultiLabelSoftMarginCriterion()
input = np.array([0.4, 0.5, 0.6])
target = np.array([0, 1, 1])
> criterion.forward(input, target)
0.6081934
AbsCriterion
Scala:
val criterion = AbsCriterion(sizeAverage)
Python:
criterion = AbsCriterion(size_average)
Measures the mean absolute value of the element-wise difference between input and target.
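A numpy sketch of the formula, loss(a, b) = 1/n * sum_i |a_i - b_i| when sizeAverage = true (illustration only):
import numpy as np
def abs_criterion(a, b, size_average=True):
    per = np.abs(a - b)
    return per.mean() if size_average else per.sum()
print(abs_criterion(np.array([1.0, 2.0, 3.0]), np.array([4.0, 5.0, 6.0])))  # 3.0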
Scala example:
import com.intel.analytics.bigdl.nn._
import com.intel.analytics.bigdl.tensor.Tensor
import com.intel.analytics.bigdl.utils.T
import com.intel.analytics.bigdl.tensor.TensorNumericMath.TensorNumeric.NumericFloat
val criterion = AbsCriterion()
val input = Tensor(T(1.0f, 2.0f, 3.0f))
val target = Tensor(T(4.0f, 5.0f, 6.0f))
val output = criterion.forward(input, target)
scala> print(output)
3.0
Python example:
from bigdl.nn.criterion import *
import numpy as np
criterion = AbsCriterion()
input = np.array([1.0, 2.0, 3.0])
target = np.array([4.0, 5.0, 6.0])
output=criterion.forward(input, target)
>>> print(output)
3.0
MultiCriterion
Scala:
val criterion = MultiCriterion()
Python:
criterion = MultiCriterion()
MultiCriterion is a weighted sum of other criterions, each applied to the same input and target.
Scala example:
import com.intel.analytics.bigdl.tensor.Tensor
import com.intel.analytics.bigdl.nn._
import com.intel.analytics.bigdl.tensor.TensorNumericMath.TensorNumeric.NumericFloat
val criterion = MultiCriterion()
val nll = ClassNLLCriterion()
val mse = MSECriterion()
criterion.add(nll, 0.5)
criterion.add(mse)
val input = Tensor(5).randn()
val target = Tensor(5)
target(Array(1)) = 1
target(Array(2)) = 2
target(Array(3)) = 3
target(Array(4)) = 2
target(Array(5)) = 1
val output = criterion.forward(input, target)
> input
1.0641425
-0.33507252
1.2345984
0.08065767
0.531199
[com.intel.analytics.bigdl.tensor.DenseTensor$mcF$sp of size 5]
> output
res7: Float = 1.9633228
Python example:
from bigdl.nn.criterion import *
import numpy as np
criterion = MultiCriterion()
nll = ClassNLLCriterion()
mse = MSECriterion()
criterion.add(nll, 0.5)
criterion.add(mse)
input = np.array([0.9682213801388531,
0.35258855644097503,
0.04584479998452568,
-0.21781499692588918,
-1.02721844006879])
target = np.array([1, 2, 3, 2, 1])
output = criterion.forward(input, target)
> output
3.6099546