Only the top "num_words" most frequent words are taken into account, and only words known by the tokenizer are used.
texts_to_sequences_generator(tokenizer, texts)

tokenizer: Tokenizer
texts: Vector/list of texts (strings).
Returns a generator which yields individual sequences, one per input text.
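
A hedged usage sketch follows. It assumes the keras R package with a working TensorFlow installation that still provides the text tokenizer. The sample texts and the num_words value are made up for illustration. reticulate::iterate() is used here as one way to drain the returned Python generator; it is an assumption about how to consume the result, not part of this function's documented interface.

```r
library(keras)

# Hypothetical sample texts (not from the documentation).
texts <- c("the cat sat on the mat", "the dog chased the cat")

# Build and fit a tokenizer; num_words = 5 is an arbitrary illustrative cap.
tokenizer <- text_tokenizer(num_words = 5)
fit_text_tokenizer(tokenizer, texts)

# Create the generator; sequences are produced lazily, one per text.
gen <- texts_to_sequences_generator(tokenizer, texts)

# Drain the generator into an R list (assumes reticulate is available,
# which keras depends on). simplify = FALSE keeps one element per text.
seqs <- reticulate::iterate(gen, simplify = FALSE)
length(seqs)
```

When the texts do not fit in memory at once, reticulate::iter_next(gen) can pull one sequence at a time instead of collecting them all.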
Other text tokenization: fit_text_tokenizer,
  sequences_to_matrix,
  text_tokenizer,
  texts_to_matrix,
  texts_to_sequences