bigdl.util package

Submodules

bigdl.util.common module

class bigdl.util.common.EvaluatedResult(result, total_num, method)[source]

A testing result used to benchmark the model quality.

class bigdl.util.common.GatewayWrapper(bigdl_type, port=25333)[source]

Bases: bigdl.util.common.SingletonMixin

class bigdl.util.common.JActivity(value)[source]

Bases: object

class bigdl.util.common.JTensor(storage, shape, bigdl_type='float', indices=None)[source]

Bases: object

A wrapper that makes it easier to pass a Tensor to Scala or to receive one from it.

>>> import numpy as np
>>> from bigdl.util.common import JTensor
>>> np.random.seed(123)
>>>
classmethod from_ndarray(a_ndarray, bigdl_type='float')[source]

Convert an ndarray to a DenseTensor for use on the Java side.

>>> import numpy as np
>>> from bigdl.util.common import JTensor
>>> from bigdl.util.common import callBigDlFunc
>>> np.random.seed(123)
>>> data = np.random.uniform(0, 1, (2, 3)).astype("float32")
>>> result = JTensor.from_ndarray(data)
>>> expected_storage = np.array([[0.69646919, 0.28613934, 0.22685145], [0.55131477, 0.71946895, 0.42310646]])
>>> expected_shape = np.array([2, 3])
>>> np.testing.assert_allclose(result.storage, expected_storage, rtol=1e-6, atol=1e-6)
>>> np.testing.assert_allclose(result.shape, expected_shape)
>>> data_back = result.to_ndarray()
>>> (data == data_back).all()
True
>>> tensor1 = callBigDlFunc("float", "testTensor", JTensor.from_ndarray(data))  # noqa
>>> array_from_tensor = tensor1.to_ndarray()
>>> (array_from_tensor == data).all()
True
classmethod sparse(a_ndarray, i_ndarray, shape, bigdl_type='float')[source]

Convert three ndarrays to a SparseTensor for use on the Java side. For example:

a_ndarray = [1, 3, 2, 4]
i_ndarray = [[0, 0, 1, 2], [0, 3, 2, 1]]
shape = [3, 4]

represents the dense tensor

[[1, 0, 0, 3],
 [0, 0, 2, 0],
 [0, 4, 0, 0]]

Parameters:
  • a_ndarray – non-zero elements of this SparseTensor
  • i_ndarray – zero-based indices of the non-zero elements. Its shape should be (shape.size, a_ndarray.size), and the indices of the i-th non-zero element are i_ndarray[:, i], which should be zero-based and ascending
  • shape – shape of the equivalent DenseTensor

>>> import numpy as np
>>> from bigdl.util.common import JTensor
>>> from bigdl.util.common import callBigDlFunc
>>> np.random.seed(123)
>>> data = np.arange(1, 7).astype("float32")
>>> indices = np.arange(1, 7)
>>> shape = np.array([10])
>>> result = JTensor.sparse(data, indices, shape)
>>> expected_storage = np.array([1., 2., 3., 4., 5., 6.])
>>> expected_shape = np.array([10])
>>> expected_indices = np.array([1, 2, 3, 4, 5, 6])
>>> np.testing.assert_allclose(result.storage, expected_storage)
>>> np.testing.assert_allclose(result.shape, expected_shape)
>>> np.testing.assert_allclose(result.indices, expected_indices)
>>> tensor1 = callBigDlFunc("float", "testTensor", result)  # noqa
>>> array_from_tensor = tensor1.to_ndarray()
>>> expected_ndarray = np.array([0, 1, 2, 3, 4, 5, 6, 0, 0, 0])
>>> (array_from_tensor == expected_ndarray).all()
True
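
The index layout used by sparse() can be checked without a JVM. The following is a minimal pure-Python sketch, not BigDL's implementation: densify is a hypothetical helper, limited to the 2-D case, that expands the values, indices and shape from the example in the description into the dense tensor they represent.

```python
def densify(values, indices, shape):
    """Expand (values, indices, shape) into a nested list.

    Hypothetical helper for illustration only; handles 2-D shapes.
    indices[d][k] is the zero-based position of the k-th value along
    dimension d.
    """
    rows, cols = shape
    dense = [[0] * cols for _ in range(rows)]
    for k, value in enumerate(values):
        r, c = indices[0][k], indices[1][k]
        dense[r][c] = value
    return dense


a_ndarray = [1, 3, 2, 4]
i_ndarray = [[0, 0, 1, 2], [0, 3, 2, 1]]
shape = [3, 4]

dense = densify(a_ndarray, i_ndarray, shape)
for row in dense:
    print(row)
```

Each column of i_ndarray holds the coordinates of one non-zero element, which is why its shape is (number of dimensions, number of non-zero elements).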
to_ndarray()[source]

Convert a JTensor to an ndarray. Because a SparseTensor may produce a very large ndarray, this function is not supported for SparseTensor.

Returns: an ndarray

class bigdl.util.common.JavaCreator(bigdl_type, gateway)[source]

Bases: bigdl.util.common.SingletonMixin

classmethod add_creator_class(jinvoker)[source]
classmethod get_creator_class()[source]
classmethod set_creator_class(cclass)[source]
class bigdl.util.common.JavaValue(jvalue, bigdl_type, *args)[source]

Bases: object

jvm_class_constructor()[source]
class bigdl.util.common.RNG(bigdl_type='float')[source]

Generate tensor data with a seed.

set_seed(seed)[source]
uniform(a, b, size)[source]
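
The point of seeding is reproducibility: two generators given the same seed should produce the same data. The sketch below illustrates that idea with Python's standard library only. SeededUniform is a hypothetical stand-in, not the RNG class above (which is assumed to delegate to the JVM), and size is taken here to be a plain element count, which is also an assumption.

```python
import random


class SeededUniform:
    """Hypothetical stand-in showing seed-controlled uniform sampling."""

    def __init__(self):
        self._rng = random.Random()

    def set_seed(self, seed):
        # Reset the generator state so later draws are reproducible.
        self._rng.seed(seed)

    def uniform(self, a, b, size):
        # Return `size` floats drawn uniformly between a and b.
        return [self._rng.uniform(a, b) for _ in range(size)]


first = SeededUniform()
first.set_seed(100)
second = SeededUniform()
second.set_seed(100)

draw_a = first.uniform(0.0, 1.0, 5)
draw_b = second.uniform(0.0, 1.0, 5)
print(draw_a == draw_b)
```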
class bigdl.util.common.Sample(features, labels, bigdl_type='float')[source]

Bases: object

classmethod from_jtensor(features, labels, bigdl_type='float')[source]

Convert a sequence of JTensor to a Sample for use on the Java side.

Parameters:
  • features – a JTensor or a list of JTensor
  • labels – a JTensor, a list of JTensor, or a scalar
  • bigdl_type – “double” or “float”

>>> import numpy as np
>>> data = np.random.uniform(0, 1, (6)).astype("float32")
>>> indices = np.arange(1, 7)
>>> shape = np.array([10])
>>> feature0 = JTensor.sparse(data, indices, shape)
>>> feature1 = JTensor.from_ndarray(np.random.uniform(0, 1, (2, 3)).astype("float32"))
>>> sample = Sample.from_jtensor([feature0, feature1], 1)
classmethod from_ndarray(features, labels, bigdl_type='float')[source]

Convert ndarrays of features and labels to a Sample for use on the Java side.

Parameters:
  • features – an ndarray or a list of ndarrays
  • labels – an ndarray, a list of ndarrays, or a scalar
  • bigdl_type – “double” or “float”

>>> import numpy as np
>>> from bigdl.util.common import callBigDlFunc
>>> from numpy.testing import assert_allclose
>>> np.random.seed(123)
>>> sample = Sample.from_ndarray(np.random.random((2,3)), np.random.random((2,3)))
>>> sample_back = callBigDlFunc("float", "testSample", sample)
>>> assert_allclose(sample.features[0].to_ndarray(), sample_back.features[0].to_ndarray())
>>> assert_allclose(sample.label.to_ndarray(), sample_back.label.to_ndarray())
>>> expected_feature_storage = np.array(([[0.69646919, 0.28613934, 0.22685145], [0.55131477, 0.71946895, 0.42310646]]))
>>> expected_feature_shape = np.array([2, 3])
>>> expected_label_storage = np.array(([[0.98076421, 0.68482971, 0.48093191], [0.39211753, 0.343178, 0.72904968]]))
>>> expected_label_shape = np.array([2, 3])
>>> assert_allclose(sample.features[0].storage, expected_feature_storage, rtol=1e-6, atol=1e-6)
>>> assert_allclose(sample.features[0].shape, expected_feature_shape)
>>> assert_allclose(sample.labels[0].storage, expected_label_storage, rtol=1e-6, atol=1e-6)
>>> assert_allclose(sample.labels[0].shape, expected_label_shape)
class bigdl.util.common.SingletonMixin[source]

Bases: object

classmethod instance(bigdl_type, *args)[source]
bigdl.util.common.callBigDlFunc(bigdl_type, name, *args)[source]

Call API in PythonBigDL

bigdl.util.common.callJavaFunc(func, *args)[source]

Call Java Function

bigdl.util.common.create_spark_conf()[source]
bigdl.util.common.create_tmp_path()[source]
bigdl.util.common.extend_spark_driver_cp(sparkConf, path)[source]
bigdl.util.common.get_activation_by_name(activation_name, activation_id=None)[source]

Convert to a bigdl activation layer given the name of the activation as a string

bigdl.util.common.get_bigdl_conf()[source]
bigdl.util.common.get_bigdl_engine_type(bigdl_type='float')[source]
bigdl.util.common.get_dtype(bigdl_type)[source]
bigdl.util.common.get_local_file(a_path)[source]
bigdl.util.common.get_node_and_core_number(bigdl_type='float')[source]
bigdl.util.common.get_optimizer_version(bigdl_type='float')[source]
bigdl.util.common.get_spark_context(conf=None)[source]

Get the currently active Spark context, creating one if there is no active instance.

Parameters:
  • conf – Spark conf into which the BigDL configs are combined
Returns: SparkContext

bigdl.util.common.get_spark_sql_context(sc)[source]
bigdl.util.common.init_engine(bigdl_type='float')[source]
bigdl.util.common.init_executor_gateway(sc, bigdl_type='float')[source]
bigdl.util.common.is_distributed(path)[source]
bigdl.util.common.redire_spark_logs(bigdl_type='float', log_path='/opt/work/jenkins/workspace/BigDL-Release-Doc/BigDL/pyspark/docs/bigdl.log')[source]

Redirect Spark logs to the specified path.

Parameters:
  • bigdl_type – “double” or “float”
  • log_path – the file path to redirect to; the default is a file named bigdl.log under the current workspace

bigdl.util.common.set_optimizer_version(optimizerVersion, bigdl_type='float')[source]
bigdl.util.common.show_bigdl_info_logs(bigdl_type='float')[source]

Set BigDL log level to INFO.

Parameters:
  • bigdl_type – “double” or “float”

bigdl.util.common.text_from_path(path)[source]
bigdl.util.common.to_list(a)[source]
bigdl.util.common.to_sample_rdd(x, y, numSlices=None)[source]

Convert x and y into RDD[Sample].

Parameters:
  • x – ndarray whose first dimension is the batch
  • y – ndarray whose first dimension is the batch
  • numSlices
Returns: RDD[Sample]
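
The requirement that both arguments use the first dimension as the batch means that record i of x lines up with record i of y. The sketch below shows that pairing with plain Python lists. It is a hypothetical illustration, not BigDL's implementation: pair_along_batch is an invented helper, and the assumption that each (x[i], y[i]) pair becomes one Sample is inferred from the description rather than confirmed by it.

```python
def pair_along_batch(x, y):
    """Hypothetical helper: pair the i-th record of x with the i-th record of y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same batch size")
    return list(zip(x, y))


x = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # 3 records, 2 features each
y = [1, 0, 1]                             # 3 labels, one per record

pairs = pair_along_batch(x, y)
print(len(pairs))
```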

bigdl.util.engine module

bigdl.util.engine.check_spark_source_conflict(spark_home, pyspark_path)[source]
bigdl.util.engine.compare_version(version1, version2)[source]

Compare version strings.

Parameters:
  • version1
  • version2
Returns: 1 if version1 is after version2; -1 if version1 is before version2; 0 if the two versions are the same.
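
The return convention can be illustrated with a short standalone sketch. compare_version_sketch is a hypothetical function written for this example, not the library's code; it assumes purely numeric, dot-separated versions and treats missing trailing components as 0.

```python
def compare_version_sketch(version1, version2):
    """Return 1, -1 or 0, following the documented return convention.

    Illustrative only: assumes numeric, dot-separated version strings.
    """
    parts1 = [int(p) for p in version1.split(".")]
    parts2 = [int(p) for p in version2.split(".")]
    # Pad the shorter version with zeros so "2.2" and "2.2.0" compare equal.
    width = max(len(parts1), len(parts2))
    parts1 += [0] * (width - len(parts1))
    parts2 += [0] * (width - len(parts2))
    if parts1 > parts2:
        return 1
    if parts1 < parts2:
        return -1
    return 0


results = [
    compare_version_sketch("2.2.0", "2.1.1"),
    compare_version_sketch("1.6", "2.0"),
    compare_version_sketch("2.2", "2.2.0"),
]
print(results)
```

Comparing lists of integers rather than raw strings avoids ordering "2.10" before "2.9".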

bigdl.util.engine.exist_pyspark()[source]
bigdl.util.engine.get_bigdl_classpath()[source]

Get and return the jar path for bigdl if exists.

bigdl.util.engine.is_spark_below_2_2()[source]

Check if spark version is below 2.2

bigdl.util.engine.prepare_env()[source]

bigdl.util.tf_utils module

bigdl.util.tf_utils.convert(input_ops, output_ops, byte_order, bigdl_type)[source]

Convert a tensorflow model to a bigdl model.

Parameters:
  • input_ops – list of operations used for input; these should be placeholders
  • output_ops – list of operations used for output
Returns: bigdl model

bigdl.util.tf_utils.dump_model(path, graph=None, sess=None, ckpt_file=None, bigdl_type='float')[source]

Dump a tensorflow model to files. The graph will be dumped to path/model.pb, and the checkpoint will be dumped to path/model.bin

Parameters:
  • path – dump folder path
  • sess – if the user passes in a session, we assume that the variables of the graph in the session have been initialized
  • graph – tensorflow graph. Defaults to the default graph of the session
  • bigdl_type – model variable numeric type
Returns: nothing

bigdl.util.tf_utils.export_checkpoint(checkpoint_path)[source]

Export variable tensors from the checkpoint files.

Parameters: checkpoint_path – tensorflow checkpoint path
Returns: dictionary of tensors. The key is the variable name and the value is the numpy
bigdl.util.tf_utils.get_path(output_name, sess=None)[source]
bigdl.util.tf_utils.merge_checkpoint(input_graph, checkpoint, output_node_names, output_graph, sess)[source]

Get the variable values from the checkpoint file and merge them into the GraphDef file.

Parameters:
  • input_graph – the GraphDef file, which doesn’t contain variable values
  • checkpoint – the checkpoint file
  • output_node_names – a list of strings, the output names
  • output_graph – string giving the location and name of the output graph

bigdl.util.tf_utils.save_variable_bigdl(tensors, target_path, bigdl_type='float')[source]

Save a variable dictionary to a Java object file, so it can be read by BigDL

Parameters:
  • tensors – tensor dictionary
  • target_path – where the Java object file is stored
  • bigdl_type – model variable numeric type
Returns:

nothing

Module contents