bigdl.dataset package¶
Submodules¶
bigdl.dataset.base module¶
bigdl.dataset.dataset module¶
bigdl.dataset.mnist module¶
- bigdl.dataset.mnist.extract_images(f)[source]¶
  Extract the images into a 4D uint8 numpy array [index, y, x, depth].
  Param: f: A file object that can be passed into a gzip reader.
  Returns: data: A 4D uint8 numpy array [index, y, x, depth].
  Raise: ValueError: If the bytestream does not start with 2051.
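MNIST image files use the IDX format: a big-endian 32-bit magic number (2051 for images), the image count, row count, and column count, followed by raw uint8 pixels. A minimal sketch of this parsing logic, using only the standard library and numpy (the function name `extract_images_sketch` is illustrative, not BigDL's actual implementation):

```python
import gzip
import io
import struct

import numpy as np


def _read32(bytestream):
    # MNIST headers are big-endian unsigned 32-bit integers
    return struct.unpack(">I", bytestream.read(4))[0]


def extract_images_sketch(f):
    # Parse a gzipped MNIST image file into a 4D uint8 array
    # [index, y, x, depth]; raise ValueError if the magic
    # number is not 2051, mirroring the documented behavior.
    with gzip.GzipFile(fileobj=f) as bytestream:
        magic = _read32(bytestream)
        if magic != 2051:
            raise ValueError("Invalid magic number %d in MNIST image file" % magic)
        num_images = _read32(bytestream)
        rows = _read32(bytestream)
        cols = _read32(bytestream)
        buf = bytestream.read(rows * cols * num_images)
        data = np.frombuffer(buf, dtype=np.uint8)
        return data.reshape(num_images, rows, cols, 1)


# Build a tiny in-memory "MNIST" file: 2 images of 2x2 pixels
raw = struct.pack(">IIII", 2051, 2, 2, 2) + bytes(range(8))
buf = io.BytesIO()
with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
    gz.write(raw)
buf.seek(0)
images = extract_images_sketch(buf)
print(images.shape)  # (2, 2, 2, 1)
```

Note that the depth axis is always 1 for MNIST, since the images are grayscale.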
- bigdl.dataset.mnist.read_data_sets(train_dir, data_type='train')[source]¶
  Parse or download mnist data if train_dir is empty.
  Param: train_dir: The directory storing the mnist data.
  Param: data_type: Whether to read the training set or the testing set. It can be either "train" or "test".
  Returns: (ndarray, ndarray) representing (features, labels). features is a 4D uint8 numpy array [index, y, x, depth] with each pixel valued from 0 to 255; labels is a 1D uint8 numpy array with each label valued from 0 to 9.
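The labels half of the returned pair comes from the companion MNIST label file, whose IDX layout has magic number 2049 followed by the item count and raw uint8 labels. A sketch of that parsing step (the name `extract_labels_sketch` is illustrative, not BigDL's actual helper):

```python
import gzip
import io
import struct

import numpy as np


def extract_labels_sketch(f):
    # Parse a gzipped MNIST label file (magic 2049) into a
    # 1D uint8 array of labels valued from 0 to 9.
    with gzip.GzipFile(fileobj=f) as bytestream:
        magic = struct.unpack(">I", bytestream.read(4))[0]
        if magic != 2049:
            raise ValueError("Invalid magic number %d in MNIST label file" % magic)
        num_items = struct.unpack(">I", bytestream.read(4))[0]
        buf = bytestream.read(num_items)
        return np.frombuffer(buf, dtype=np.uint8)


# Build a tiny in-memory label file with 4 labels
raw = struct.pack(">II", 2049, 4) + bytes([7, 2, 1, 0])
buf = io.BytesIO()
with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
    gz.write(raw)
buf.seek(0)
labels = extract_labels_sketch(buf)
print(labels.tolist())  # [7, 2, 1, 0]
```

In actual use, read_data_sets would pair these labels index-for-index with the image array, so features[i] is the digit whose class is labels[i].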
bigdl.dataset.movielens module¶
bigdl.dataset.news20 module¶
- bigdl.dataset.news20.get_glove_w2v(source_dir='./data/news20/', dim=100)[source]¶
  Parse or download the pre-trained GloVe word vectors if source_dir is empty.
  Parameters:
  - source_dir – The directory storing the pre-trained word vectors
  - dim – The dimension of a vector
  Returns: A dict mapping from word to vector
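Pre-trained GloVe files are plain text, one entry per line: a word followed by its space-separated vector components. A minimal sketch of turning such lines into the word-to-vector dict this function returns (the name `parse_glove_sketch` is illustrative, not BigDL's actual implementation):

```python
import io

import numpy as np


def parse_glove_sketch(lines, dim=100):
    # Each line is "word v1 v2 ... v_dim"; build a dict mapping
    # word -> float32 vector of length dim, skipping lines whose
    # component count does not match dim.
    w2v = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        if len(values) != dim:
            continue  # skip malformed or mismatched lines
        w2v[word] = np.asarray(values, dtype=np.float32)
    return w2v


# Tiny in-memory stand-in for a GloVe file with dim=3
sample = io.StringIO("the 0.1 0.2 0.3\ncat 0.4 0.5 0.6\n")
w2v = parse_glove_sketch(sample, dim=3)
print(sorted(w2v))       # ['cat', 'the']
print(w2v["cat"].shape)  # (3,)
```

The dim parameter selects which of the pre-trained vector sizes (e.g. 50, 100, 200, 300 for the common GloVe releases) to load; the returned dict can then be used to look up an embedding for each token in the News20 corpus.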