path to vocabulary file. File is assumed to be a text
file, with one token per line, with the line number corresponding to the
index of that token in the vocabulary.
Value
In the BERT Python code, the vocab is returned as an OrderedDict
from the collections package. Here we return the vocab as a named integer
vector. Names are tokens in vocabulary, values are integer indices.