com.intel.analytics.bigdl.dataset.text

Dictionary

class Dictionary extends Serializable

Helper class that builds a dictionary either from tokenized text or from a previously saved dictionary.

Linear Supertypes
Serializable, AnyRef, Any

Instance Constructors

  1. new Dictionary(directory: String)

  2. new Dictionary(sentences: Stream[Array[String]], vocabSize: Int)

  3. new Dictionary(words: Array[String], vocabSize: Int)

  4. new Dictionary(sentences: Iterator[Array[String]], vocabSize: Int)

  5. new Dictionary(dataset: RDD[Array[String]], vocabSize: Int)

  6. new Dictionary()
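
As a rough illustration of the constructors above, the following sketch builds a dictionary from an array of tokens with the `(words: Array[String], vocabSize: Int)` constructor and queries it through the documented accessors. It assumes the BigDL jar is on the classpath. The object name `DictionaryExample` and the toy token list are made up for this example, and treating `vocabSize` as a cap on the number of selected words is an assumption drawn from the member descriptions rather than something the constructor list states.

```scala
import com.intel.analytics.bigdl.dataset.text.Dictionary

// Hypothetical example object; only Dictionary and its members come from the page above.
object DictionaryExample {
  def main(args: Array[String]): Unit = {
    // Toy tokenized input; any Array[String] of tokens would do.
    val tokens = Array("the", "cat", "sat", "on", "the", "mat", "the", "cat")

    // Assumption: vocabSize = 3 limits how many words are selected.
    val dict = new Dictionary(tokens, 3)

    println(s"vocabulary size: ${dict.getVocabSize()}")
    println(s"discarded words: ${dict.getDiscardSize()}")
    println(s"selected words:  ${dict.vocabulary().mkString(", ")}")

    // Look up a known word and map its index back to a word.
    val idx = dict.getIndex("the")
    println(s"index of 'the': $idx, word at that index: ${dict.getWord(idx)}")

    // A word that never appeared in the input falls back to the default index.
    println(s"index of 'zebra': ${dict.getIndex("zebra")}")
  }
}
```

The other constructors take the same `vocabSize` argument but read the tokens from a `Stream`, an `Iterator`, or an `RDD` of sentences, while the `directory` constructor loads a dictionary that was written earlier with `save`.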

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  7. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  8. def discardVocab(): Array[String]

    Return the array of all discarded words.

  9. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  10. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  11. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  12. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  13. def getDiscardSize(): Int

    Selects the words with the top-k frequencies and discards the remaining words. Returns the number of discarded words.

  14. def getIndex(word: String): Int

    Returns the encoding number (index) of a word. If the word does not exist in the dictionary, the dictionary length is returned as the default index.

    word

  15. def getVocabSize(): Int

    The length of the vocabulary

  16. def getWord(index: Int): String

    Returns the word for the given index. If the index is out of bounds, a random word from the discarded word list is returned. If the discarded word list is empty, a random word from the existing dictionary is returned.

    index

  17. def getWord(index: Double): String

  18. def getWord(index: Float): String

  19. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  20. def index2Word(): Map[Int, String]

  21. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  22. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  23. final def notify(): Unit

    Definition Classes
    AnyRef
  24. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  25. def print(): Unit

    Prints the word-to-index dictionary.

  26. def printDiscard(): Unit

    Prints the dictionary of discarded words.

  27. def save(saveFolder: String): Unit

    Saves the dictionary and the discarded words to the saveFolder directory.

    saveFolder

  28. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  29. def toString(): String

    Definition Classes
    AnyRef → Any
  30. def vocabulary(): Array[String]

    Return the array of all selected words.

  31. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  32. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  33. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  34. def word2Index(): Map[String, Int]

    Returns the map from each word to its index in the dictionary.
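
The selection and fallback rules described for `getDiscardSize`, `getIndex` and `getWord` can be modelled in a few lines of plain Scala. The class below is a standalone sketch, not BigDL's implementation: `MiniDictionary`, its `rng` parameter, the zero-based indices and the alphabetical tie-break between equally frequent words are all assumptions made so that the example is deterministic and needs no dependencies.

```scala
import scala.util.Random

// Hypothetical standalone model of the documented rules; NOT the BigDL class.
final class MiniDictionary(words: Seq[String], vocabSize: Int, rng: Random = new Random(0)) {
  require(words.nonEmpty && vocabSize > 0, "need at least one word and a positive vocabSize")

  // Rank words by descending frequency; ties are broken alphabetically (an assumption).
  private val ranked: Vector[String] =
    words.groupBy(identity).toVector
      .map { case (w, occurrences) => (w, occurrences.size) }
      .sortBy { case (w, count) => (-count, w) }
      .map(_._1)

  // The top-k words are selected, the remaining words are discarded.
  val vocabulary: Vector[String] = ranked.take(vocabSize)
  val discardVocab: Vector[String] = ranked.drop(vocabSize)
  val word2Index: Map[String, Int] = vocabulary.zipWithIndex.toMap

  def getVocabSize: Int = vocabulary.length
  def getDiscardSize: Int = discardVocab.length

  // Unknown words map to the vocabulary length, one past the last valid index.
  def getIndex(word: String): Int = word2Index.getOrElse(word, vocabulary.length)

  // Out-of-range indices yield a random discarded word, or a random
  // vocabulary word when nothing was discarded.
  def getWord(index: Int): String =
    if (index >= 0 && index < vocabulary.length) vocabulary(index)
    else if (discardVocab.nonEmpty) discardVocab(rng.nextInt(discardVocab.length))
    else vocabulary(rng.nextInt(vocabulary.length))
}
```

With the tokens `the, cat, sat, the, cat, the` and a `vocabSize` of 2, this model selects `the` and `cat`, discards `sat`, and maps any unseen word to index 2. Because the default index equals the vocabulary length, a consumer can reserve one extra slot for unknown words without a separate sentinel value.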
