Getting Started
Before using BigDL
Before using BigDL, you need to install Apache Spark and obtain the BigDL libraries. Then, in your program, you need to ensure that the SparkContext is created successfully and initialize the BigDL engine before calling BigDL APIs. Navigate to Scala User Guide/Install or Python User Guide/Install for details about how to install BigDL, and to Scala User Guide/Run or Python User Guide/Run for how to run programs.
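A minimal Python sketch of this setup might look like the following; it uses the `create_spark_conf` and `init_engine` helpers from `bigdl.util.common`, and the application name is hypothetical:

```python
from pyspark import SparkContext
from bigdl.util.common import create_spark_conf, init_engine

# create a SparkContext with BigDL's required Spark settings applied
sc = SparkContext(appName="bigdl_getting_started",  # hypothetical app name
                  conf=create_spark_conf())

# initialize the BigDL engine before calling any other BigDL API
init_engine()
```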
Prepare your Data
Your data needs to be transformed into an RDD of Sample in order to be fed into BigDL for training, evaluation and prediction (also refer to the Optimization and Optimizer API guides).
Tensor and Table are the essential data structures that compose the basic dataflow inside the neural network (e.g. input/output, gradients, weights, etc.). You will need to understand them to get a better idea of layer behaviors.
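As a rough sketch (the 28x28 feature shape and the 1-based class label are hypothetical), a Sample can be built from NumPy ndarrays and parallelized into an RDD:

```python
import numpy as np
from bigdl.util.common import Sample

# build one Sample from a feature ndarray and a label ndarray
# (labels for classification criteria such as ClassNLLCriterion are 1-based)
sample = Sample.from_ndarray(np.random.rand(28, 28), np.array([1.0]))

# BigDL consumes an RDD of Sample for training/evaluation/prediction
train_rdd = sc.parallelize([sample] * 100)
```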
Use BigDL for Prediction only
If you have an existing model and want to use BigDL only for prediction, you first need to load the model, and then perform prediction or evaluation.
BigDL supports loading models trained and saved in BigDL, as well as trained Tensorflow, Caffe or Keras models.
- To load a BigDL model, use the `Module.load` interface (Scala) or `Model.load` (Python). Refer to Model Save for details.
- To load a Tensorflow model, refer to Tensorflow Support for details.
- To load a Caffe model, refer to Caffe Support for details.
- To load a Keras model, refer to Keras Support for details.
Refer to Model Predict for details about how to use a model for prediction.
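For instance, a minimal Python sketch might look like this (the model path is hypothetical, and `test_rdd` is assumed to be an RDD of Sample prepared as above):

```python
from bigdl.nn.layer import Model

# load a model previously saved by BigDL (hypothetical path)
model = Model.load("/tmp/my_model.bigdl")

# distributed prediction on an RDD of Sample; returns an RDD of ndarrays
predictions = model.predict(test_rdd)
print(predictions.take(3))
```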
If you are using the trained model as a component inside a Spark ML pipeline, refer to the Using BigDL in Spark ML Pipeline page for usage.
Train a Model from Scratch
The procedure of training a model from scratch usually involves the following steps:
- define your model (by connecting layers/activations into a network)
- decide your loss function (which function to optimize)
- optimization (choose a proper algorithm and hyperparameters, and train)
- evaluation (evaluate your model)
Before training models, please make sure BigDL is installed, the BigDL engine is initialized properly, and your data is in the proper format. Refer to Before using BigDL and Prepare your Data for details.
The recommended way to create your first model is to modify an existing one. BigDL provides plenty of models for reference. See Scala Models/Examples and Python Models/Examples and Tutorials.
To define a model, you can use either the Sequential API or the Functional API. The Functional API is more flexible than the Sequential API. Refer to Sequential API and Functional API for how to define models of different shapes. Navigate to API Guide/Layers on the sidebar to find the documentation of available layers and activations.
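For example, a small multi-layer perceptron in the Sequential API might look like this sketch (the layer sizes are hypothetical):

```python
from bigdl.nn.layer import Sequential, Linear, ReLU, LogSoftMax

# a small MLP: 784 inputs -> 100 hidden units -> 10 classes
model = Sequential()
model.add(Linear(784, 100))
model.add(ReLU())
model.add(Linear(100, 10))
model.add(LogSoftMax())
```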
After creating the model, you will have to decide which loss function to use in training. Find the details of losses defined in BigDL in Losses.
Now create an `Optimizer` and set the loss function, input dataset and other hyperparameters into the `Optimizer`. Then call `Optimizer.optimize` to train. Refer to Optimization and the Optimizer API guide for details.
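A minimal sketch in Python, assuming the `model` and `train_rdd` defined in the earlier sketches and a negative log-likelihood loss:

```python
from bigdl.nn.criterion import ClassNLLCriterion
from bigdl.optim.optimizer import Optimizer, SGD, MaxEpoch

optimizer = Optimizer(
    model=model,                       # the network defined above
    training_rdd=train_rdd,            # RDD of Sample
    criterion=ClassNLLCriterion(),     # loss function to optimize
    optim_method=SGD(learningrate=0.01),
    end_trigger=MaxEpoch(5),           # stop after 5 epochs
    batch_size=128)

trained_model = optimizer.optimize()   # run distributed training
```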
Model evaluation can be performed periodically during training. Refer to Validate your Model in Training for details. For a list of defined metrics, refer to Metrics.
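For example, assuming a hypothetical held-out `val_rdd` of Sample, validation can be scheduled on the `Optimizer` before calling `optimize`:

```python
from bigdl.optim.optimizer import EveryEpoch, Top1Accuracy

# validate on val_rdd at the end of every epoch, reporting top-1 accuracy
optimizer.set_validation(batch_size=128,
                         val_rdd=val_rdd,
                         trigger=EveryEpoch(),
                         val_method=[Top1Accuracy()])
```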
When `Optimizer.optimize` finishes, it will return a trained model. You can then use the trained model for prediction or evaluation. Refer to Model Prediction and Model Evaluation for detailed usage.
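A sketch of standalone evaluation, again assuming a hypothetical `test_rdd` of Sample (the exact `evaluate` signature may vary slightly across BigDL versions):

```python
from bigdl.optim.optimizer import Top1Accuracy

# distributed evaluation: returns one result per metric
results = trained_model.evaluate(test_rdd, 128, [Top1Accuracy()])
for r in results:
    print(r)
```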
If you prefer to train a model inside a Spark ML pipeline, please refer to the Using BigDL in Spark ML Pipeline page for usage.
Save a Model
When training is finished, you may need to save the final model for later use.
BigDL allows you to save your BigDL model to the local filesystem, HDFS, or Amazon S3 (refer to Model Save).
You may also save the model to Tensorflow or Caffe format (refer to Caffe Support, and Tensorflow Support respectively).
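For instance (the path is hypothetical; the second argument controls overwriting):

```python
# save to the local filesystem; HDFS ("hdfs://...") and S3 ("s3://...")
# paths also work
trained_model.save("/tmp/my_model.bigdl", True)  # True = overwrite if it exists
```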
Stop and Resume a Training
Training a deep learning model sometimes takes a very long time. It may be stopped or interrupted, and we need the training to resume from where we left off.
To enable this, you have to configure the `Optimizer` to periodically take snapshots of the model (trained weights, biases, etc.) and the optim-method (configurations and states of the optimization) and dump them into files. Refer to Checkpointing for details.
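A sketch of enabling checkpointing on an existing `Optimizer` (the directory is hypothetical):

```python
from bigdl.optim.optimizer import EveryEpoch

# snapshot the model and optim-method into /tmp/bigdl_checkpoints every epoch
optimizer.set_checkpoint(EveryEpoch(), "/tmp/bigdl_checkpoints")
```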
To resume a training after it stops, refer to Resume Training.
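One possible way to resume is sketched below, under the assumption that snapshots were written with the checkpointing setup above; the snapshot file names and iteration number are hypothetical:

```python
from bigdl.nn.layer import Model
from bigdl.nn.criterion import ClassNLLCriterion
from bigdl.optim.optimizer import Optimizer, OptimMethod, MaxEpoch

# load the model and optim-method snapshots (hypothetical paths)
model = Model.load("/tmp/bigdl_checkpoints/model.1000")
optim = OptimMethod.load("/tmp/bigdl_checkpoints/optimMethod.1000")

# recreate the Optimizer with the restored state and continue training
optimizer = Optimizer(model=model,
                      training_rdd=train_rdd,
                      criterion=ClassNLLCriterion(),
                      optim_method=optim,
                      end_trigger=MaxEpoch(10),
                      batch_size=128)
optimizer.optimize()
```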
Use Pre-trained Models/Layers
Pre-training is a useful strategy when training deep learning models. You may use pre-trained features (e.g. embeddings) in your model, or fine-tune the model for a different dataset or target.
To use a learnt model as a whole, you can use `Module.load` to load the entire model, then create an `Optimizer` with the loaded model set into it. Refer to the Optimizer API and Module API for details.
Instead of using an entire model, you can also use pre-trained weights/biases in certain layers. After a layer is created, call `setWeightsBias` (in Scala) or `set_weights` (in Python) on the layer to initialize the weights with the pre-trained values. Then continue to train your model as usual.
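A sketch of initializing a layer with pre-trained values (the ndarrays here are random stand-ins for real pre-trained weights):

```python
import numpy as np
from bigdl.nn.layer import Linear

layer = Linear(3, 2)

# stand-ins for pre-trained values; shapes are (out_dim, in_dim) and (out_dim,)
pretrained_weight = np.random.rand(2, 3)
pretrained_bias = np.random.rand(2)

# initialize the layer's parameters before training
layer.set_weights([pretrained_weight, pretrained_bias])
```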
Monitor your training
BigDL provides a convenient way to monitor/visualize your training progress. It writes out the statistics collected during training/validation, and these can be visualized in real time using TensorBoard. The statistics can also be retrieved later as readable data structures and visualized in other tools (e.g. a Jupyter notebook). For details, refer to Visualization.
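A sketch of wiring summaries into an `Optimizer` (the log directory and app name are hypothetical):

```python
from bigdl.optim.optimizer import TrainSummary, ValidationSummary

log_dir = "/tmp/bigdl_logs"    # hypothetical; point tensorboard --logdir here
app_name = "my_training_run"   # hypothetical

optimizer.set_train_summary(TrainSummary(log_dir=log_dir, app_name=app_name))
optimizer.set_val_summary(ValidationSummary(log_dir=log_dir, app_name=app_name))

# after (or during) training, read the statistics back, e.g. the loss curve
train_summary = TrainSummary(log_dir=log_dir, app_name=app_name)
loss_records = train_summary.read_scalar("Loss")
```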
Tuning
There are several strategies that may be useful when tuning an optimization.
- Change the learning rate schedule in SGD, as sketched after this list. Refer to the SGD docs for details.
- If overfitting is observed, try regularization (also sketched below). Refer to Regularizers.
- Try changing the initialization methods. Refer to Initializers.
- Try Adam or Adagrad first. If they can't achieve a good score, use SGD and find a proper learning rate schedule - it usually takes time, though. RMSprop is recommended for RNN models. Refer to Optimization Algorithms for a list of supported optimization methods.
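A rough sketch of the first two strategies (the schedule parameters and regularization factor are hypothetical; note that the schedule keyword is spelled `leaningrate_schedule` in BigDL's Python API):

```python
from bigdl.nn.layer import Linear
from bigdl.optim.optimizer import SGD, Poly, L2Regularizer

# SGD with a polynomial learning rate decay over 1000 iterations
optim = SGD(learningrate=0.01, leaningrate_schedule=Poly(0.5, 1000))

# L2 regularization on a layer's weights and bias to counter overfitting
layer = Linear(100, 10,
               wRegularizer=L2Regularizer(1e-4),
               bRegularizer=L2Regularizer(1e-4))
```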