The optimizer caches some metadata on each executor.
Tensor element type
Create a checkpoint.
cache trigger
cache path
whether to overwrite an existing checkpoint
wall clock time
cached models
state table
all reduce parameters
all optim methods
training model
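The checkpoint parameters above (trigger, path, overwrite flag) can be illustrated with a minimal sketch. It uses only the JDK and does not reproduce the optimizer's own code: `EveryNIterations`, `CheckpointSketch`, and the `model` / `model.<iteration>` file names are hypothetical stand-ins chosen for the example.

```scala
import java.nio.file.{Files, Path}

// Hypothetical stand-in for the "cache trigger": fires every n iterations.
final case class EveryNIterations(n: Int) {
  def apply(iteration: Int): Boolean = iteration > 0 && iteration % n == 0
}

object CheckpointSketch {
  // Writes `payload` under `dir` when the trigger fires.
  // Returns the written path, or None when the trigger did not fire.
  def checkpoint(
      trigger: EveryNIterations,
      dir: Path,
      overwrite: Boolean,
      iteration: Int,
      payload: Array[Byte]): Option[Path] = {
    if (!trigger(iteration)) {
      None
    } else {
      Files.createDirectories(dir)
      // Assumption: with overwrite, one file is reused; otherwise each
      // checkpoint gets an iteration suffix so earlier ones are kept.
      val name = if (overwrite) "model" else s"model.$iteration"
      val target = dir.resolve(name)
      // Files.write creates the file or truncates an existing one.
      Files.write(target, payload)
      Some(target)
    }
  }
}
```

With `overwrite = true` only the latest checkpoint survives, which saves disk space; with `overwrite = false` every checkpoint is kept under its own name.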
Fetch the current model parameters to the driver and copy them into trainingModel.
cached models
AllReduceParameter
the model being trained by the optimizer
trained model
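The fetch-and-copy step described above can be sketched as follows, assuming each executor holds one contiguous slice of a flat parameter vector. `GatherSketch`, `copyToModel`, and the `(offset, values)` slice representation are hypothetical; in a real job the slices would arrive via a collect on the distributed dataset rather than as a local `Seq`.

```scala
object GatherSketch {
  // Each slice is (offset into the flat parameter vector, values),
  // as if it had been collected from one executor.
  def copyToModel(slices: Seq[(Int, Array[Float])], modelWeights: Array[Float]): Unit = {
    val total = slices.map(_._2.length).sum
    require(total == modelWeights.length,
      s"slices cover $total elements, model expects ${modelWeights.length}")
    // Slices may arrive in any order; the offset decides where each one lands.
    slices.foreach { case (offset, values) =>
      System.arraycopy(values, 0, modelWeights, offset, values.length)
    }
  }
}
```

Copying by offset rather than by arrival order keeps the result independent of how the slices are scheduled or returned.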
Save train summaries.
train logger
cached models
driver state
AllReduceParameter
Validate current model and save the result.
validation trigger
validation dataset
validation methods
cores per node
cached models
state table
validation logger
log header string
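As a rough illustration of the validation step, the sketch below evaluates one accuracy-style method per partition when a trigger fires and merges the partial results. `AccuracyResult`, `ValidationSketch`, and the `(prediction, label)` pair representation are hypothetical and stand in for the real validation methods and dataset.

```scala
// Hypothetical mergeable result for one validation method.
final case class AccuracyResult(correct: Int, count: Int) {
  def +(other: AccuracyResult): AccuracyResult =
    AccuracyResult(correct + other.correct, count + other.count)
  def accuracy: Double = if (count == 0) 0.0 else correct.toDouble / count
}

object ValidationSketch {
  // `partitions` holds (prediction, label) pairs, one inner Seq per partition.
  // Returns None when the validation trigger does not fire.
  def validate(
      trigger: Int => Boolean,
      iteration: Int,
      partitions: Seq[Seq[(Int, Int)]]): Option[AccuracyResult] = {
    if (!trigger(iteration)) {
      None
    } else {
      // Score each partition independently, then merge the partial results.
      val perPartition = partitions.map { batch =>
        AccuracyResult(batch.count { case (pred, label) => pred == label }, batch.size)
      }
      Some(perPartition.foldLeft(AccuracyResult(0, 0))(_ + _))
    }
  }
}
```

Because the partial results merge with an associative `+`, they can be combined in any order, which suits a distributed reduce.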