Train, evaluate or predict a model

This page shows how to train a model, evaluate it, and use it for prediction with the Keras-Style API.

Refer to the User Guide page to see how to define a model in Python or Scala.

Refer to the Layers section for all the available layers.

After defining a model with the Keras-Style API, you can call the following methods on the model:

Compile

Configure the learning process. Must be called before fit or evaluate.

Scala:

compile(optimizer, loss, metrics = null)

Python:

compile(optimizer, loss, metrics=None)

Parameters:

optimizer: Optimization method to use. Either the string representation of an optimization method (see here) or an instance of OptimMethod.
loss: Criterion to use. Either the string representation of a criterion (see here) or an instance of Loss.
metrics: One or more validation methods to use. Defaults to null (Scala) or None (Python), meaning no validation is configured. Either string representations, such as Array("accuracy") in Scala or ["accuracy"] in Python, or instances of ValidationMethod.
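To make the "accuracy" metric more concrete, the following is a minimal sketch in plain Python, assuming the metric reports top-1 classification accuracy (the fraction of samples whose highest-scoring class matches the label). The function name top1_accuracy and the toy data are illustrative only and are not part of the Keras-Style API; the library's own ValidationMethod may differ in details such as label indexing.

```python
def top1_accuracy(scores, labels):
    """Fraction of samples whose highest-scoring class equals the label.

    scores: list of per-class score lists, one list per sample.
    labels: list of integer class indices (0-based in this sketch).
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("need at least one sample")
    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score in this row (arg-max).
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(scores)


# Toy data: four samples, two classes. The third sample is misclassified.
scores = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
labels = [1, 0, 0, 0]
print(top1_accuracy(scores, labels))  # 0.75
```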

Fit

Train a model for a fixed number of epochs on a dataset. The model must be compiled first.

Scala:

fit(x, nbEpoch = 10, validationData = null)

Python:

fit(x, y=None, batch_size=32, nb_epoch=10, validation_data=None, distributed=True)

Parameters:

x: Training dataset.
batchSize (batch_size in Python): Number of samples per gradient update.
nbEpoch (nb_epoch in Python): Number of epochs to train.
validationData (validation_data in Python): Dataset for validation. Defaults to null (None in Python), meaning validation is not configured.
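As a rough illustration of how the batch size and the number of epochs interact, the sketch below estimates the total number of gradient updates in a training run. It assumes that each epoch passes over the dataset once and that a final partial batch still counts as one update; the library may handle the last batch differently. The helper count_gradient_updates is hypothetical and is not part of the API. Its defaults mirror the Python fit signature.

```python
import math


def count_gradient_updates(num_samples, batch_size=32, nb_epoch=10):
    """Estimate the number of gradient updates in a training run.

    Assumption of this sketch: one pass over the data per epoch, and a
    trailing partial batch is still used for an update.
    """
    if num_samples <= 0 or batch_size <= 0 or nb_epoch <= 0:
        raise ValueError("all arguments must be positive")
    updates_per_epoch = math.ceil(num_samples / batch_size)
    return updates_per_epoch * nb_epoch


# 1000 samples with the default batch size of 32 give 32 updates per epoch.
print(count_gradient_updates(1000))  # 320
```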

Remark

For Scala, x can be either an RDD of Sample (with batchSize specified) or an instance of DataSet.
For Python, pass either x (a Numpy array) as features together with y (a Numpy array) as labels, or only x (an RDD of Sample) without y.
The distributed parameter selects whether to train the model in distributed mode or local mode in Python. The default is True. In local mode, x and y must both be Numpy arrays.
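The input rules for the Python fit call can be summarized as a small validation routine. This is only a sketch of the rules stated in the remark, not the library's implementation: check_fit_inputs is a hypothetical helper, and because the sketch avoids a Spark dependency, any non-Numpy x is simply assumed to be an RDD of Sample.

```python
import numpy as np


def check_fit_inputs(x, y=None, distributed=True):
    """Return a short description of the input mode, or raise ValueError."""
    x_is_array = isinstance(x, np.ndarray)
    y_is_array = isinstance(y, np.ndarray)

    if not distributed:
        # Local mode: both features and labels must be Numpy arrays.
        if not (x_is_array and y_is_array):
            raise ValueError("local mode requires x and y to both be Numpy arrays")
        return "local: numpy features and labels"

    if x_is_array:
        # Distributed mode with Numpy features also needs Numpy labels.
        if not y_is_array:
            raise ValueError("Numpy features x require Numpy labels y")
        return "distributed: numpy features and labels"

    # Assumption: a non-Numpy x is an RDD of Sample, which already carries labels.
    if y is not None:
        raise ValueError("when x is an RDD of Sample, y must not be given")
    return "distributed: x treated as RDD of Sample"


features = np.zeros((8, 4))
labels = np.zeros((8, 1))
print(check_fit_inputs(features, labels, distributed=False))
```

Encoding the rules this way shows that the only combination accepted in local mode is a pair of Numpy arrays, while distributed mode accepts either form.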

Evaluate

Evaluate a model on a given dataset using the metrics specified when the model was compiled.

Scala:

evaluate(x)

Python:

evaluate(x, y=None, batch_size=32)

Parameters:

x: Evaluation dataset.
batchSize (batch_size in Python): Number of samples per batch.

Remark

For Scala, x can be either an RDD of Sample (with batchSize specified) or an instance of DataSet.
For Python, pass either x (a Numpy array) as features together with y (a Numpy array) as labels, or only x (an RDD of Sample) without y. Currently, Python supports evaluation in distributed mode only.

Predict

Use a model to make predictions.

Scala:

predict(x)

Python:

predict(x, distributed=True)

Parameters:

x: Data to run prediction on.

Remark

For Scala, x can be either an RDD of Sample (with batchSize specified) or an instance of LocalDataSet.
For Python, x can be either a Numpy array representing features or an RDD of Sample.
The distributed parameter selects whether to run prediction in distributed mode or local mode in Python. The default is True. In local mode, x must be a Numpy array.