bigdl.dlframes package

Submodules

bigdl.dlframes.dl_classifier module

class bigdl.dlframes.dl_classifier.DLClassifier(model, criterion, feature_size, bigdl_type='float')[source]

Bases: bigdl.dlframes.dl_classifier.DLEstimator
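
DLClassifier is a specialization of DLEstimator for classification: the label column holds a scalar class label, and only the feature size needs to be declared. A minimal fit/transform sketch, assuming a toy two-feature, two-class DataFrame (the model shape, data, and local-mode setup are illustrative, not part of the API):

    from bigdl.util.common import create_spark_conf, getOrCreateSparkContext, init_engine
    from bigdl.nn.layer import Sequential, Linear, LogSoftMax
    from bigdl.nn.criterion import ClassNLLCriterion
    from bigdl.dlframes.dl_classifier import DLClassifier
    from pyspark.sql import SQLContext
    from pyspark.sql.types import StructType, StructField, ArrayType, DoubleType

    sc = getOrCreateSparkContext(create_spark_conf().setMaster("local[1]"))
    init_engine()
    sqlContext = SQLContext(sc)

    # Two input features, two output classes.
    model = Sequential().add(Linear(2, 2)).add(LogSoftMax())
    classifier = DLClassifier(model, ClassNLLCriterion(), [2]) \
        .setBatchSize(4).setMaxEpoch(10)

    schema = StructType([StructField("features", ArrayType(DoubleType()), False),
                         StructField("label", DoubleType(), False)])
    df = sqlContext.createDataFrame([([0.0, 1.0], 1.0), ([1.0, 0.0], 2.0)], schema)

    dlModel = classifier.fit(df)        # returns a DLClassifierModel
    dlModel.transform(df).show(False)   # appends a "prediction" column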

class bigdl.dlframes.dl_classifier.DLClassifierModel(model, featureSize, jvalue=None, bigdl_type='float')[source]

Bases: bigdl.dlframes.dl_classifier.DLModel

classmethod of(jvalue, feature_size=None, bigdl_type='float')[source]

class bigdl.dlframes.dl_classifier.DLEstimator(model, criterion, feature_size, label_size, jvalue=None, bigdl_type='float')[source]

Bases: pyspark.ml.base.Estimator, pyspark.ml.param.shared.HasFeaturesCol, pyspark.ml.param.shared.HasLabelCol, pyspark.ml.param.shared.HasPredictionCol, bigdl.dlframes.dl_classifier.HasBatchSize, bigdl.dlframes.dl_classifier.HasMaxEpoch, bigdl.dlframes.dl_classifier.HasLearningRate, bigdl.util.common.JavaValue
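
DLEstimator is the generic trainer: both the feature and label columns are arrays whose shapes are declared up front through feature_size and label_size, so it also covers regression and multi-output cases. A hedged sketch with a 2-in/2-out model (MSECriterion and the sizes are illustrative):

    from bigdl.nn.layer import Sequential, Linear
    from bigdl.nn.criterion import MSECriterion
    from bigdl.dlframes.dl_classifier import DLEstimator

    model = Sequential().add(Linear(2, 2))
    estimator = DLEstimator(model, MSECriterion(), [2], [2]) \
        .setBatchSize(4).setMaxEpoch(10).setLearningRate(0.01)

    # df needs array-typed "features" and "label" columns of the declared sizes.
    dlModel = estimator.fit(df)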

class bigdl.dlframes.dl_classifier.DLModel(model, featureSize, jvalue=None, bigdl_type='float')[source]

Bases: pyspark.ml.base.Model, pyspark.ml.param.shared.HasFeaturesCol, pyspark.ml.param.shared.HasPredictionCol, bigdl.dlframes.dl_classifier.HasBatchSize, bigdl.dlframes.dl_classifier.HasFeatureSize, bigdl.util.common.JavaValue

classmethod of(jvalue, feature_size=None, bigdl_type='float')[source]
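
DLModel wraps a BigDL model for inference only; no criterion or label column is involved, so a trained network can be applied directly to a DataFrame. A minimal sketch, reusing the model and df shapes from the examples above:

    from bigdl.dlframes.dl_classifier import DLModel

    dlModel = DLModel(model, [2]).setBatchSize(4)
    predictions = dlModel.transform(df)   # appends the "prediction" column
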
class bigdl.dlframes.dl_classifier.HasBatchSize[source]

Bases: pyspark.ml.param.Params

Mixin for param batchSize: batch size.

batchSize = Param(parent='undefined', name='batchSize', doc='batchSize')

param for batch size.

getBatchSize()[source]

Gets the value of batchSize or its default value.

setBatchSize(val)[source]

Sets the value of batchSize.

class bigdl.dlframes.dl_classifier.HasFeatureSize[source]

Bases: pyspark.ml.param.Params

featureSize = Param(parent='undefined', name='featureSize', doc='size of the feature')

getFeatureSize()[source]

setFeatureSize(val)[source]

class bigdl.dlframes.dl_classifier.HasLearningRate[source]

Bases: pyspark.ml.param.Params

getLearningRate()[source]

Gets the value of learningRate or its default value.

learningRate = Param(parent='undefined', name='learningRate', doc='learning rate')

setLearningRate(val)[source]

class bigdl.dlframes.dl_classifier.HasMaxEpoch[source]

Bases: pyspark.ml.param.Params

getMaxEpoch()[source]

Gets the value of maxEpoch or its default value.

maxEpoch = Param(parent='undefined', name='maxEpoch', doc='number of max Epoch')

setMaxEpoch(val)[source]
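
These Has* mixins all follow the standard Spark ML Params pattern: setters return self so that calls chain, and getters return the current (or default) value. For example, with the estimator from the sketches above:

    estimator.setBatchSize(8).setMaxEpoch(20).setLearningRate(0.05)
    estimator.getBatchSize()      # 8
    estimator.getMaxEpoch()       # 20
    estimator.getLearningRate()   # 0.05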

bigdl.dlframes.dl_image_reader module

class bigdl.dlframes.dl_image_reader.DLImageReader[source]

Primary DataFrame-based image loading interface, defining an API to read images from files into a DataFrame.

static readImages(path, sc=None, minPartitions=1, bigdl_type='float')[source]

Reads a directory of images into a DataFrame from a local or remote source.

:param path: Directory of the input data files. The path can be comma-separated paths given as a list of inputs; wildcard paths are supported, similarly to sc.binaryFiles(path).
:param minPartitions: A suggested minimal number of partitions for splitting the input data.
:return: DataFrame with a single column “image”; each record in the column represents one image: Row(uri, height, width, channels, CvType, bytes)
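
A minimal read sketch; the directory path and SparkContext are placeholders:

    from bigdl.dlframes.dl_image_reader import DLImageReader

    imageDF = DLImageReader.readImages("hdfs://.../images", sc)
    imageDF.printSchema()   # single "image" column with the Row layout above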

bigdl.dlframes.dl_image_transformer module

class bigdl.dlframes.dl_image_transformer.DLImageTransformer(transformer, jvalue=None, bigdl_type='float')[source]

Bases: pyspark.ml.wrapper.JavaTransformer, pyspark.ml.param.shared.HasInputCol, pyspark.ml.param.shared.HasOutputCol, bigdl.util.common.JavaValue

Provides a DataFrame-based API for image pre-processing and feature transformation. DLImageTransformer follows the Spark Transformer API pattern and can be used as one stage in a Spark ML pipeline.

The input column can be either DLImageSchema.byteSchema or DLImageSchema.floatSchema. If using DLImageReader, the default format is DLImageSchema.byteSchema. The output column is always DLImageSchema.floatSchema.

transform(dataset)[source]

Apply the transformer to the images in “inputCol” and store the transformed result in “outputCol”.
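
A typical preprocessing sketch, assuming the vision transformers from bigdl.transform.vision.image (the resize/crop sizes and channel means are illustrative):

    from bigdl.dlframes.dl_image_transformer import DLImageTransformer
    from bigdl.transform.vision.image import Pipeline, Resize, CenterCrop, \
        ChannelNormalize, MatToTensor

    transformer = DLImageTransformer(
        Pipeline([Resize(256, 256), CenterCrop(224, 224),
                  ChannelNormalize(123.0, 117.0, 104.0), MatToTensor()])
    ).setInputCol("image").setOutputCol("features")

    featureDF = transformer.transform(imageDF)   # imageDF from DLImageReader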

Module contents