bigdl.dataset package
Submodules
bigdl.dataset.base module
bigdl.dataset.dataset module
bigdl.dataset.mnist module
bigdl.dataset.mnist.extract_images(f)
- Extract the images into a 4D uint8 numpy array [index, y, x, depth].
- Param:
  - f: A file object that can be passed into a gzip reader.
- Returns:
  - data: A 4D uint8 numpy array [index, y, x, depth].
- Raise:
  - ValueError: If the bytestream does not start with 2051.
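To make the contract above concrete, here is a minimal sketch of how a gzipped MNIST image file could be decoded. It is an illustration, not BigDL's actual implementation: `extract_images_sketch` and `make_fake_idx` are hypothetical names, and the sketch assumes the standard MNIST IDX layout (four big-endian 32-bit integers for magic number, image count, rows, and columns, followed by raw pixel bytes).

```python
import gzip
import io
import struct

import numpy as np


def extract_images_sketch(f):
    """Hypothetical stand-in for extract_images; not BigDL's own code."""
    with gzip.GzipFile(fileobj=f) as bytestream:
        # Header: magic number, image count, rows, columns (big-endian uint32).
        magic, num_images, rows, cols = struct.unpack(">IIII", bytestream.read(16))
        if magic != 2051:
            raise ValueError("Invalid magic number %d in MNIST image file" % magic)
        buf = bytestream.read(num_images * rows * cols)
        data = np.frombuffer(buf, dtype=np.uint8)
        # MNIST images are grayscale, so depth is 1.
        return data.reshape(num_images, rows, cols, 1)


def make_fake_idx(num_images=2, rows=3, cols=3, magic=2051):
    """Build a tiny in-memory gzipped file so the sketch runs without a download."""
    header = struct.pack(">IIII", magic, num_images, rows, cols)
    pixels = bytes(range(num_images * rows * cols))
    return io.BytesIO(gzip.compress(header + pixels))


images = extract_images_sketch(make_fake_idx())
print(images.shape, images.dtype)
```

Passing a stream whose header carries a different magic number (for example 2049, which marks MNIST label files) makes the sketch raise ValueError, mirroring the documented behavior.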
bigdl.dataset.mnist.read_data_sets(train_dir, data_type='train')
- Parse the MNIST data, downloading it first if train_dir is empty.
- Param:
  - train_dir: The directory storing the MNIST data.
  - data_type: Whether to read the training set or the testing set. It can be either “train” or “test”.
- Returns:
  - (ndarray, ndarray) representing (features, labels). features is a 4D uint8 numpy array [index, y, x, depth] holding pixel values from 0 to 255. labels is a 1D uint8 numpy array holding label values from 0 to 9.
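The sketch below shows one way to sanity-check a (features, labels) pair against the documented return contract. `check_mnist_pair` is a hypothetical helper, not part of BigDL, and the arrays are random stand-ins with the documented layout so that the example runs without BigDL or a download. The commented-out call shows where real data would come from; the directory `/tmp/mnist` there is an assumed path.

```python
import numpy as np


def check_mnist_pair(features, labels):
    """Hypothetical helper: verify the documented shapes and dtypes, then summarize."""
    if features.ndim != 4 or features.dtype != np.uint8:
        raise ValueError("features must be a 4D uint8 array [index, y, x, depth]")
    if labels.ndim != 1 or labels.dtype != np.uint8:
        raise ValueError("labels must be a 1D uint8 array")
    if features.shape[0] != labels.shape[0]:
        raise ValueError("features and labels must have the same length")
    return {
        "count": int(features.shape[0]),
        "image_shape": tuple(features.shape[1:]),
        "pixel_range": (int(features.min()), int(features.max())),
        "classes": sorted(int(c) for c in np.unique(labels)),
    }


# With BigDL installed, the pair would come from the documented call, e.g.:
#   from bigdl.dataset import mnist
#   features, labels = mnist.read_data_sets("/tmp/mnist", "train")  # assumed path
# Stand-in data with the documented layout (28x28 grayscale, labels 0 to 9):
rng = np.random.default_rng(0)
features = rng.integers(0, 256, size=(8, 28, 28, 1), dtype=np.uint8)
labels = rng.integers(0, 10, size=8, dtype=np.uint8)

summary = check_mnist_pair(features, labels)
print(summary["count"], summary["image_shape"])
```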
bigdl.dataset.movielens module
bigdl.dataset.news20 module
bigdl.dataset.news20.get_glove_w2v(source_dir='./data/news20/', dim=100)
- Parse the pre-trained GloVe word vectors, downloading them first if source_dir is empty.
- Parameters:
  - source_dir – The directory storing the pre-trained word vectors
  - dim – The dimension of a vector
- Returns:
  - A dict mapping from word to vector
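As an illustration of the returned structure, the following sketch builds a word-to-vector dict from text in the usual GloVe format, where each line holds a word followed by its space-separated vector components. `parse_glove_lines` is a hypothetical helper, not BigDL's implementation, and the two sample rows use made-up values with dim=3 to keep the example small.

```python
import io


def parse_glove_lines(lines, dim):
    """Hypothetical parser: map each word to a list of `dim` floats."""
    w2v = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip blank or malformed rows
        w2v[parts[0]] = [float(x) for x in parts[1:]]
    return w2v


# Tiny in-memory sample standing in for a pre-trained GloVe text file.
sample = io.StringIO("the 0.1 0.2 0.3\ncat -0.5 0.0 0.25\n")
w2v = parse_glove_lines(sample, dim=3)
print(w2v["cat"])
```

The sketch skips rows whose length does not match `dim` rather than raising. That is a design choice made here for brevity and is not something the documentation above specifies.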