Recurrent Layers
SimpleRNN
A fully-connected recurrent neural network cell. The output is fed back into the input.
The input of this layer should be 3D, i.e. (batch, time steps, input dim).
Scala:
SimpleRNN(outputDim, activation = "tanh", returnSequences = false, goBackwards = false, wRegularizer = null, uRegularizer = null, bRegularizer = null, inputShape = null)
Python:
SimpleRNN(output_dim, activation="tanh", return_sequences=False, go_backwards=False, W_regularizer=None, U_regularizer=None, b_regularizer=None, input_shape=None, name=None)
Parameters:
outputDim
: Hidden unit size. Dimension of internal projections and final output.
activation
: String representation of the activation function to use. See here for available activation strings. Default is 'tanh'.
returnSequences
: Whether to return the full sequence or only the last output in the output sequence. Default is false.
goBackwards
: Whether the input sequence will be processed backwards. Default is false.
wRegularizer
: An instance of Regularizer (e.g. L1 or L2 regularization), applied to the input weight matrices. Default is null.
uRegularizer
: An instance of Regularizer, applied to the recurrent weight matrices. Default is null.
bRegularizer
: An instance of Regularizer, applied to the bias. Default is null.
inputShape
: Only need to specify this argument when you use this layer as the first layer of a model. For the Scala API, it should be a Shape object. For the Python API, it should be a shape tuple. Batch dimension should be excluded.
Scala example:
import com.intel.analytics.bigdl.nn.keras.{Sequential, SimpleRNN}
import com.intel.analytics.bigdl.utils.Shape
import com.intel.analytics.bigdl.tensor.Tensor
val model = Sequential[Float]()
model.add(SimpleRNN(8, activation = "relu", inputShape = Shape(4, 5)))
val input = Tensor[Float](2, 4, 5).randn()
val output = model.forward(input)
Input is:
input: com.intel.analytics.bigdl.tensor.Tensor[Float] =
(1,.,.) =
0.71328646 0.24269831 -0.75013286 -1.6663225 0.35494477
0.073439054 -1.1181073 -0.6577777 1.3154761 0.15396282
0.41183218 -1.2667576 -0.11167632 0.946616 0.06427766
0.013886308 -0.20620999 1.1173447 1.9083043 1.7680032
(2,.,.) =
-2.3510098 -0.8492037 0.042268332 -0.43801674 -0.010638754
1.298793 -0.24814601 0.31325665 -0.19119295 -2.072075
-0.11629801 0.27296612 0.94443846 0.37293285 -0.82289046
0.6044998 0.93386084 -1.3502276 -1.7753356 1.6173482
[com.intel.analytics.bigdl.tensor.DenseTensor of size 2x4x5]
Output is:
output: com.intel.analytics.bigdl.nn.abstractnn.Activity =
0.0 0.020557694 0.0 0.39700085 0.622244 0.0 0.36524248 0.88961613
0.0 1.4797685 0.0 0.0 0.0 0.0 0.0 0.0
[com.intel.analytics.bigdl.tensor.DenseTensor of size 2x8]
Python example:
import numpy as np
from bigdl.nn.keras.topology import Sequential
from bigdl.nn.keras.layer import SimpleRNN
model = Sequential()
model.add(SimpleRNN(8, activation = "relu", input_shape = (4, 5)))
input = np.random.random([2, 4, 5])
output = model.forward(input)
Input is:
[[[0.43400622 0.65452575 0.94952774 0.96210478 0.05286231]
[0.2162183 0.33225502 0.09725628 0.80813221 0.29556109]
[0.19720487 0.35077585 0.80904872 0.80576513 0.82035253]
[0.36175687 0.63291153 0.08437936 0.71581099 0.790709 ]]
[[0.35387003 0.36532078 0.9834315 0.07562338 0.05600369]
[0.65927201 0.14652252 0.10848068 0.88225065 0.88871385]
[0.23627135 0.72620104 0.60391828 0.51571874 0.73550574]
[0.80773506 0.35121494 0.66889362 0.530684 0.52066982]]]
Output is:
[[0.77534926 0.23742369 0.14946866 0.0 0.16289112 0.0 0.71689016 0.24594748]
[0.8987881 0.06123672 0.3312829 0.29757586 0.0 0.0 1.0179179 0.23447856]]
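Setting returnSequences (return_sequences in Python) to true keeps the time dimension, so the layer emits one output vector per time step instead of only the last one. Below is a minimal Python sketch of this variant, using the same API as the example above; the shapes in the comments are assumptions that follow the parameter descriptions.
import numpy as np
from bigdl.nn.keras.topology import Sequential
from bigdl.nn.keras.layer import SimpleRNN

model = Sequential()
# return_sequences=True keeps the time dimension:
# output shape becomes (batch, time steps, output_dim) = (2, 4, 8)
# instead of (2, 8) as in the example above.
model.add(SimpleRNN(8, activation="relu", return_sequences=True, input_shape=(4, 5)))
input = np.random.random([2, 4, 5])
output = model.forward(input)  # expected shape: (2, 4, 8)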
LSTM
Long Short-Term Memory unit architecture.
The input of this layer should be 3D, i.e. (batch, time steps, input dim).
Scala:
LSTM(outputDim, activation = "tanh", innerActivation = "hard_sigmoid", returnSequences = false, goBackwards = false, wRegularizer = null, uRegularizer = null, bRegularizer = null, inputShape = null)
Python:
LSTM(output_dim, activation="tanh", inner_activation="hard_sigmoid", return_sequences=False, go_backwards=False, W_regularizer=None, U_regularizer=None, b_regularizer=None, input_shape=None, name=None)
Parameters:
outputDim
: Hidden unit size. Dimension of internal projections and final output.
activation
: String representation of the activation function to use. See here for available activation strings. Default is 'tanh'.
innerActivation
: String representation of the activation function for inner cells. See here for available activation strings. Default is 'hard_sigmoid'.
returnSequences
: Whether to return the full sequence or only the last output in the output sequence. Default is false.
goBackwards
: Whether the input sequence will be processed backwards. Default is false.
wRegularizer
: An instance of Regularizer (e.g. L1 or L2 regularization), applied to the input weight matrices. Default is null.
uRegularizer
: An instance of Regularizer, applied to the recurrent weight matrices. Default is null.
bRegularizer
: An instance of Regularizer, applied to the bias. Default is null.
inputShape
: Only need to specify this argument when you use this layer as the first layer of a model. For the Scala API, it should be a Shape object. For the Python API, it should be a shape tuple. Batch dimension should be excluded.
Scala example:
import com.intel.analytics.bigdl.nn.keras.{Sequential, LSTM}
import com.intel.analytics.bigdl.utils.Shape
import com.intel.analytics.bigdl.tensor.Tensor
val model = Sequential[Float]()
model.add(LSTM(8, inputShape = Shape(2, 3)))
val input = Tensor[Float](2, 2, 3).randn()
val output = model.forward(input)
Input is:
input: com.intel.analytics.bigdl.tensor.Tensor[Float] =
(1,.,.) =
1.3485646 0.38385049 0.676986
0.13189854 0.30926105 0.4539456
(2,.,.) =
-1.7166822 -0.71257055 -0.477679
-0.36572325 -0.5534503 -0.018431915
[com.intel.analytics.bigdl.tensor.DenseTensor of size 2x2x3]
Output is:
output: com.intel.analytics.bigdl.nn.abstractnn.Activity =
-0.20168768 -0.20359062 -0.11801678 -0.08987579 0.20480658 -0.05170132 -0.048530716 0.08447949
-0.07134238 -0.11233686 0.073534355 0.047955263 0.13415548 0.12862797 -0.07839044 0.28296617
[com.intel.analytics.bigdl.tensor.DenseTensor of size 2x8]
Python example:
import numpy as np
from bigdl.nn.keras.topology import Sequential
from bigdl.nn.keras.layer import LSTM
model = Sequential()
model.add(LSTM(8, input_shape = (2, 3)))
input = np.random.random([2, 2, 3])
output = model.forward(input)
Input is:
[[[0.84004043 0.2081865 0.76093342]
[0.06878797 0.13804673 0.23251666]]
[[0.24651173 0.5650254 0.41424478]
[0.49338729 0.40505622 0.01497762]]]
Output is:
[[ 0.01089199 0.02563154 -0.04335827 0.03037791 0.11265078 -0.17756112
0.14166507 0.01017009]
[ 0.0144811 0.03360332 0.00676281 -0.01473055 0.09639315 -0.16620669
0.07391933 0.01746811]]
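Because an LSTM consumes a 3D sequence, stacking recurrent layers requires the lower layer to return its full sequence. Below is a minimal Python sketch using the same bigdl.nn.keras API as above; the shapes in the comments are assumptions that follow the parameter descriptions.
import numpy as np
from bigdl.nn.keras.topology import Sequential
from bigdl.nn.keras.layer import LSTM

model = Sequential()
# The first LSTM returns the full sequence, shape (batch, 2, 8),
# so the second LSTM still receives a 3D input.
model.add(LSTM(8, return_sequences=True, input_shape=(2, 3)))
# The second LSTM returns only its last output, shape (batch, 4).
model.add(LSTM(4))
input = np.random.random([2, 2, 3])
output = model.forward(input)  # expected shape: (2, 4)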
GRU
Gated Recurrent Unit architecture.
The input of this layer should be 3D, i.e. (batch, time steps, input dim).
Scala:
GRU(outputDim, activation = "tanh", innerActivation = "hard_sigmoid", returnSequences = false, goBackwards = false, wRegularizer = null, uRegularizer = null, bRegularizer = null, inputShape = null)
Python:
GRU(output_dim, activation="tanh", inner_activation="hard_sigmoid", return_sequences=False, go_backwards=False, W_regularizer=None, U_regularizer=None, b_regularizer=None, input_shape=None, name=None)
Parameters:
outputDim
: Hidden unit size. Dimension of internal projections and final output.
activation
: String representation of the activation function to use. See here for available activation strings. Default is 'tanh'.
innerActivation
: String representation of the activation function for inner cells. See here for available activation strings. Default is 'hard_sigmoid'.
returnSequences
: Whether to return the full sequence or only the last output in the output sequence. Default is false.
goBackwards
: Whether the input sequence will be processed backwards. Default is false.
wRegularizer
: An instance of Regularizer (e.g. L1 or L2 regularization), applied to the input weight matrices. Default is null.
uRegularizer
: An instance of Regularizer, applied to the recurrent weight matrices. Default is null.
bRegularizer
: An instance of Regularizer, applied to the bias. Default is null.
inputShape
: Only need to specify this argument when you use this layer as the first layer of a model. For the Scala API, it should be a Shape object. For the Python API, it should be a shape tuple. Batch dimension should be excluded.
Scala example:
import com.intel.analytics.bigdl.nn.keras.{Sequential, GRU}
import com.intel.analytics.bigdl.utils.Shape
import com.intel.analytics.bigdl.tensor.Tensor
val model = Sequential[Float]()
model.add(GRU(8, inputShape = Shape(2, 3)))
val input = Tensor[Float](2, 2, 3).randn()
val output = model.forward(input)
Input is:
input: com.intel.analytics.bigdl.tensor.Tensor[Float] =
(1,.,.) =
-0.010477358 -1.1201298 -0.86472356
0.12688802 -0.6696582 0.08027417
(2,.,.) =
0.1724209 -0.52319324 -0.8808063
0.17918338 -0.552886 -0.11891741
[com.intel.analytics.bigdl.tensor.DenseTensor of size 2x2x3]
Output is:
output: com.intel.analytics.bigdl.nn.abstractnn.Activity =
-0.12018716 -0.31560755 0.2867627 0.6728765 0.13287778 0.2112865 0.13381396 -0.4267934
-0.18521798 -0.30512968 0.14875418 0.63962734 0.1841841 0.25272882 0.016909363 -0.38463163
[com.intel.analytics.bigdl.tensor.DenseTensor of size 2x8]
Python example:
import numpy as np
from bigdl.nn.keras.topology import Sequential
from bigdl.nn.keras.layer import GRU
model = Sequential()
model.add(GRU(8, input_shape = (2, 3)))
input = np.random.random([2, 2, 3])
output = model.forward(input)
Input is:
[[[0.25026651 0.35433442 0.01417391]
[0.77236921 0.97315472 0.66090386]]
[[0.76037554 0.41029034 0.68725938]
[0.17888889 0.67670088 0.70580547]]]
Output is:
[[-0.03584666 0.07984452 -0.06159414 -0.13331707 0.34015405 -0.07107028 0.12444386 -0.06606203]
[ 0.02881907 0.04856917 -0.15306929 -0.24991018 0.23814955 0.0303434 0.06634206 -0.15335503]]
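go_backwards only reverses the order in which the time steps are consumed; the output shape is unchanged. Below is a minimal Python sketch under the same assumptions as the example above.
import numpy as np
from bigdl.nn.keras.topology import Sequential
from bigdl.nn.keras.layer import GRU

model = Sequential()
# The sequence is processed from the last time step to the first;
# with return_sequences left at False the output is still (batch, 8).
model.add(GRU(8, go_backwards=True, input_shape=(2, 3)))
input = np.random.random([2, 2, 3])
output = model.forward(input)  # expected shape: (2, 8)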
Highway
Densely connected highway network.
Highway layers are a natural extension of LSTMs to feedforward networks.
The input of this layer should be 2D, i.e. (batch, input dim).
Scala:
Highway(activation = null, wRegularizer = null, bRegularizer = null, bias = true, inputShape = null)
Python:
Highway(activation=None, W_regularizer=None, b_regularizer=None, bias=True, input_shape=None, name=None)
Parameters:
activation
: String representation of the activation function to use. See here for available activation strings. Default is null.
wRegularizer
: An instance of Regularizer (e.g. L1 or L2 regularization), applied to the input weight matrices. Default is null.
bRegularizer
: An instance of Regularizer, applied to the bias. Default is null.
bias
: Whether to include a bias (i.e. make the layer affine rather than linear). Default is true.
inputShape
: Only need to specify this argument when you use this layer as the first layer of a model. For the Scala API, it should be a Shape object. For the Python API, it should be a shape tuple. Batch dimension should be excluded.
Scala example:
import com.intel.analytics.bigdl.nn.keras.{Sequential, Highway}
import com.intel.analytics.bigdl.utils.Shape
import com.intel.analytics.bigdl.tensor.Tensor
val model = Sequential[Float]()
model.add(Highway(inputShape = Shape(3)))
val input = Tensor[Float](2, 3).randn()
val output = model.forward(input)
Input is:
input: com.intel.analytics.bigdl.tensor.Tensor[Float] =
-0.26041138 0.4286919 1.723103
1.4516269 0.5557163 -0.1149741
[com.intel.analytics.bigdl.tensor.DenseTensor of size 2x3]
Output is:
output: com.intel.analytics.bigdl.nn.abstractnn.Activity =
-0.006746907 -0.109112576 1.3375516
0.6065166 0.41575465 -0.06849813
[com.intel.analytics.bigdl.tensor.DenseTensor of size 2x3]
Python example:
import numpy as np
from bigdl.nn.keras.topology import Sequential
from bigdl.nn.keras.layer import Highway
model = Sequential()
model.add(Highway(input_shape = (3, )))
input = np.random.random([2, 3])
output = model.forward(input)
Input is:
[[0.5762107 0.45679288 0.00370956]
[0.24133312 0.38104653 0.05249192]]
Output is:
[[0.5762107 0.4567929 0.00370956]
[0.24133313 0.38104653 0.05249191]]
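A Highway layer keeps the feature dimension unchanged, so input and output are both (batch, input dim). The sketch below simply passes an explicit activation; it follows the same API as the example above, and the commented shape is an assumption based on that behaviour.
import numpy as np
from bigdl.nn.keras.topology import Sequential
from bigdl.nn.keras.layer import Highway

model = Sequential()
# The gate mixes the transformed and untransformed input,
# so the output shape matches the input shape: (2, 3).
model.add(Highway(activation="relu", input_shape=(3, )))
input = np.random.random([2, 3])
output = model.forward(input)  # expected shape: (2, 3)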
ConvLSTM2D
Convolutional LSTM.
The only data format currently supported for this layer is 'CHANNEL_FIRST' (dimOrdering='th').
The only border mode currently supported for this layer is 'same'.
This layer uses a square convolution kernel with equal strides ('subsample').
The input of this layer should be 5D, i.e. (batch, time steps, channels, rows, cols).
Scala:
ConvLSTM2D(nbFilter, nbKernel, activation = "tanh", innerActivation = "hard_sigmoid", dimOrdering = "th", subsample = 1, wRegularizer = null, uRegularizer = null, bRegularizer = null, returnSequences = false, goBackwards = false, inputShape = null)
Python:
ConvLSTM2D(nb_filter, nb_row, nb_col, activation="tanh", inner_activation="hard_sigmoid", dim_ordering="th", border_mode="same", subsample=(1, 1), W_regularizer=None, U_regularizer=None, b_regularizer=None, return_sequences=False, go_backwards=False, input_shape=None, name=None)
Parameters:
nbFilter
: Number of convolution filters to use.
nbKernel
: Number of rows/columns in the convolution kernel. The kernel is square; in Python, nb_row must equal nb_col.
activation
: String representation of the activation function to use. See here for available activation strings. Default is 'tanh'.
innerActivation
: String representation of the activation function to use for inner cells. See here for available activation strings. Default is 'hard_sigmoid'.
dimOrdering
: Format of input data. Only 'th' (Channel First) is supported for now.
subsample
: Factor by which to subsample the output. Also called strides elsewhere. Default is 1.
wRegularizer
: An instance of Regularizer (e.g. L1 or L2 regularization), applied to the input weight matrices. Default is null.
uRegularizer
: An instance of Regularizer (e.g. L1 or L2 regularization), applied to the recurrent weight matrices. Default is null.
bRegularizer
: An instance of Regularizer, applied to the bias. Default is null.
returnSequences
: Whether to return the full sequence or only the last output in the output sequence. Default is false.
goBackwards
: Whether the input sequence will be processed backwards. Default is false.
inputShape
: Only need to specify this argument when you use this layer as the first layer of a model. For the Scala API, it should be a Shape object. For the Python API, it should be a shape tuple. Batch dimension should be excluded.
Scala example:
import com.intel.analytics.bigdl.nn.keras.{Sequential, ConvLSTM2D}
import com.intel.analytics.bigdl.utils.Shape
import com.intel.analytics.bigdl.tensor.Tensor
val model = Sequential[Float]()
model.add(ConvLSTM2D(2, 2, inputShape = Shape(1, 2, 2, 2)))
val input = Tensor[Float](1, 1, 2, 2, 2).randn()
val output = model.forward(input)
Input is:
input: com.intel.analytics.bigdl.tensor.Tensor[Float] =
(1,1,1,.,.) =
-0.3935159 -2.0734277
0.16473202 -1.0574125
(1,1,2,.,.) =
1.2325795 0.510846
-0.4246685 -0.109434046
[com.intel.analytics.bigdl.tensor.DenseTensor of size 1x1x2x2x2]
Output is:
output: com.intel.analytics.bigdl.nn.abstractnn.Activity =
(1,1,.,.) =
-0.12613402 0.035963967
0.046498444 0.03568305
(1,2,.,.) =
-0.1547083 -0.046905644
-0.115438126 -0.08817647
[com.intel.analytics.bigdl.tensor.DenseTensor of size 1x2x2x2]
Python example:
import numpy as np
from bigdl.nn.keras.topology import Sequential
from bigdl.nn.keras.layer import ConvLSTM2D
model = Sequential()
model.add(ConvLSTM2D(2, 2, 2, input_shape=(1, 2, 2, 2)))
input = np.random.random([1, 1, 2, 2, 2])
output = model.forward(input)
Input is:
[[[[[0.53293431 0.02606896]
[0.50916001 0.6927234 ]]
[[0.44282168 0.05963464]
[0.22863441 0.45312165]]]]]
Output is:
[[[[ 0.09322705 0.09817358]
[ 0.12197719 0.11264911]]
[[ -0.03922357 -0.11715978]
[ -0.01915754 -0.03141996]]]]
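As with the other recurrent layers, setting return_sequences=True makes ConvLSTM2D return the convolved feature maps for every time step, giving a 5D output. Below is a minimal Python sketch under the same assumptions as the example above; the shapes in the comments follow the parameter descriptions rather than verified output.
import numpy as np
from bigdl.nn.keras.topology import Sequential
from bigdl.nn.keras.layer import ConvLSTM2D

model = Sequential()
# With return_sequences=True the output keeps the time dimension:
# (batch, time steps, nb_filter, rows, cols) = (1, 1, 2, 2, 2).
model.add(ConvLSTM2D(2, 2, 2, return_sequences=True, input_shape=(1, 2, 2, 2)))
input = np.random.random([1, 1, 2, 2, 2])
output = model.forward(input)  # expected shape: (1, 1, 2, 2, 2)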