Shape the token sequence to the specified length by padding or truncating it.
The desired sequence length.
Whether to truncate from the start ("pre") or the end ("post") of the sequence.
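The shaping step can be sketched as follows. This is a minimal illustration, not the library's own implementation: the function name `shape_tokens`, the `pad_token` parameter, its default value, and the choice to pad at the end are all assumptions made for the example.

```python
from typing import List


def shape_tokens(tokens: List[str], seq_len: int, trunc: str = "pre",
                 pad_token: str = "<pad>") -> List[str]:
    """Return exactly seq_len tokens, truncating or padding as needed."""
    if seq_len < 0:
        raise ValueError("seq_len must be non-negative")
    if trunc not in ("pre", "post"):
        raise ValueError("trunc must be 'pre' or 'post'")
    if len(tokens) > seq_len:
        if trunc == "pre":
            # Drop tokens from the start, keeping the last seq_len tokens.
            return tokens[len(tokens) - seq_len:]
        # Drop tokens from the end, keeping the first seq_len tokens.
        return tokens[:seq_len]
    # Assumption: short sequences are padded at the end with pad_token.
    return tokens + [pad_token] * (seq_len - len(tokens))
```

For example, `shape_tokens(["a", "b", "c", "d"], 2, "pre")` keeps the last two tokens, while `"post"` keeps the first two.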
Transform sample text into tokens, ignoring unknown tokens.
The set of words to include; tokens outside this set are dropped.
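A rough sketch of this filtering step is shown below. The name `to_known_tokens`, the whitespace split, and the use of a plain `set` for the included words are assumptions for illustration; the original may represent the vocabulary differently.

```python
from typing import List, Set


def to_known_tokens(text: str, included_words: Set[str]) -> List[str]:
    """Split text into tokens and keep only those in included_words."""
    # Assumption: tokens are separated by whitespace.
    candidates = text.split()
    # Unknown tokens are silently skipped; order is preserved.
    return [tok for tok in candidates if tok in included_words]
```

With `included_words = {"the", "cat"}`, the text `"the cat sat"` yields only `"the"` and `"cat"`.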
Simple tokenizer that splits text into separate tokens.
The text to split.
Whether to convert the text to lower case.
An array of separate tokens.
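One possible shape for such a tokenizer is sketched here. The function name `simple_tokenize` and the splitting rule (any run of non-alphanumeric ASCII characters acts as a separator) are assumptions; the original tokenizer may split on a different pattern.

```python
import re
from typing import List


def simple_tokenize(text: str, lower: bool = True) -> List[str]:
    """Split text into tokens, optionally lower-casing it first."""
    if lower:
        text = text.lower()
    # Assumption: any run of non-alphanumeric characters is a separator.
    parts = re.split(r"[^a-zA-Z0-9]+", text)
    # re.split can yield empty strings at the edges; drop them.
    return [tok for tok in parts if tok]
```

Calling `simple_tokenize("Hello, World!")` gives lower-cased tokens, and passing `lower=False` preserves the original case.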
Transform each word into its pre-trained vector.
The size of the pre-trained vector.
The pre-trained word2Vec embeddings.
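The lookup could look roughly like the sketch below. The name `vectorize`, the dictionary representation of the word2Vec embeddings, and the zero-vector fallback for words without an embedding are assumptions; the source does not say how missing words are handled.

```python
from typing import Dict, List, Sequence


def vectorize(tokens: Sequence[str], embedding_size: int,
              word2vec: Dict[str, Sequence[float]]) -> List[List[float]]:
    """Map each token to its pre-trained vector of length embedding_size."""
    vectors: List[List[float]] = []
    for tok in tokens:
        vec = word2vec.get(tok)
        if vec is None:
            # Assumption: words without an embedding map to a zero vector.
            vectors.append([0.0] * embedding_size)
        elif len(vec) != embedding_size:
            raise ValueError(f"vector for {tok!r} has length {len(vec)}, "
                             f"expected {embedding_size}")
        else:
            vectors.append(list(vec))
    return vectors
```

Checking the vector length against `embedding_size` up front surfaces a mismatched embedding table early, instead of producing ragged rows downstream.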