read_dtm_Blei_et_al(file, vocab = NULL)
read_dtm_MC(file, scalingtype = NULL)
NULL (default), in which case the scaling will be inferred from the names of the files with non-zero entries found (see Details).
read_dtm_Blei_et_al reads the (List of Lists type sparse matrix) format employed by the Latent Dirichlet Allocation and Correlated Topic Model C codes by Blei et al. (http://www.cs.princeton.edu/~blei).
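To make the List of Lists layout concrete, the following base R sketch parses two made-up documents. It assumes the layout described in the README of the LDA C code: one line per document, starting with the number of distinct terms, followed by term:count pairs with 0-based term indices. The sample lines and all variable names are hypothetical, and this is not the code used by read_dtm_Blei_et_al.

```r
# Hypothetical sample: two documents in the assumed LDA-C layout.
lines <- c("2 0:1 2:3", "1 1:2")

# Split each line into fields and drop the leading count of distinct terms.
fields <- lapply(strsplit(lines, " ", fixed = TRUE), function(x) x[-1])

# Row index: the document number, repeated once per term:count pair.
i <- rep(seq_along(fields), lengths(fields))

# Split the "term:count" pairs into a two-row integer matrix
# (row 1 holds term indices, row 2 holds counts).
pairs <- matrix(
  as.integer(unlist(strsplit(unlist(fields), ":", fixed = TRUE))),
  nrow = 2
)
j <- pairs[1, ] + 1L  # 0-based in the file, 1-based in R
v <- pairs[2, ]

# Dense view of the (i, j, v) triplets, for illustration only.
m <- matrix(0L, nrow = length(lines), ncol = max(j))
m[cbind(i, j)] <- v
```

The triplets (i, j, v) are the same kind of information a sparse document-term matrix stores; for real files, read_dtm_Blei_et_al should be used instead of hand-written parsing.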
MC is a toolkit for creating vector models from text documents (see http://www.cs.utexas.edu/users/dml/software/mc/). It employs a variant of Compressed Column Storage (CCS) sparse matrix format, writing data into several files with suitable names: e.g., a file with _dim appended to the base file name stores the matrix dimensions. The non-zero entries are stored in a file the name of which indicates the scaling type used: e.g., _tfx_nz indicates scaling by term frequency (t), inverse document frequency (f) and no normalization (x). See README in the MC sources for more information.
read_dtm_MC reads such sparse matrix information with argument file giving the path with the base file name.
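The naming convention can be illustrated with a small base R sketch that recovers the scaling type from a set of file names. The base name "corpus" and the file list are hypothetical, only the _dim and _tfx_nz suffixes mentioned above are used, and this is not necessarily how read_dtm_MC performs the inference.

```r
# Hypothetical base file name and directory listing.
base <- "corpus"
files <- c("corpus_dim", "corpus_tfx_nz")

# Files holding the non-zero entries end in "_nz".
nz_files <- grep(paste0("^", base, "_.*_nz$"), files, value = TRUE)

# Strip the base name and the "_nz" suffix to obtain the scaling code.
scalingtype <- sub(paste0("^", base, "_(.*)_nz$"), "\\1", nz_files)

# Split the code into its one-letter components, e.g. "t", "f", "x".
parts <- strsplit(scalingtype, "", fixed = TRUE)[[1]]
```

With the listing above, scalingtype is "tfx", matching the example in the text. If several _nz files were present, nz_files would have more than one element and the scaling type would have to be chosen explicitly.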
read_stm_MC in package slam.