A document-term matrix, either a DocumentTermMatrix from the tm package or a TsparseMatrix from the Matrix package (e.g. as created with spMatrix).
meta
A data.frame where rows are documents and columns are document meta information.
Should contain 2 columns: the document name/id and date.
The name/id column should match the rownames (i.e. document names) of the DTM, and its label is specified in the `id.var` argument.
The date column should be interpretable with as.POSIXct, and its label is specified in the `date.var` argument.
id.var
The label for the document name/id column in the `meta` data.frame. Default is "document_id".
date.var
The label for the document date column in the `meta` data.frame. Default is "date".
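As an illustration of the inputs described above, the following is a minimal sketch of a toy document-term matrix and a matching `meta` data.frame. The document names, terms, counts and dates are invented for the example; only the column labels "document_id" and "date" come from the defaults documented here.

```r
library(Matrix)

# Toy DTM in triplet form: 3 documents (rows) x 2 terms (columns).
# All names and counts are made up for illustration.
dtm <- spMatrix(nrow = 3, ncol = 2,
                i = c(1L, 2L, 3L, 3L),
                j = c(1L, 1L, 1L, 2L),
                x = c(2, 1, 1, 4))
dimnames(dtm) <- list(c("doc1", "doc2", "doc3"), c("economy", "election"))

# Meta information: one row per document, using the default column labels.
meta <- data.frame(
  document_id = c("doc1", "doc2", "doc3"),
  date = c("2014-01-01", "2014-01-01", "2014-01-02"),
  stringsAsFactors = FALSE
)

# Sanity checks before passing both objects on:
# every DTM row name should appear in the id column ...
ids_match <- all(rownames(dtm) %in% meta$document_id)
# ... and the date column should convert with as.POSIXct
# (as.POSIXct raises an error for strings it cannot interpret).
dates <- as.POSIXct(meta$date)
```

If the id or date column carries a different label, that label would be passed through `id.var` or `date.var` respectively.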
Value
A data.frame with statistics for each term.
freq: The number of times a term occurred
doc.freq: The number of documents in which a term occurred
days.n: The number of days on which a term occurred
days.pct: The percentage of days on which a term occurred
days.entropy: The entropy of the distribution of term frequency across days
days.entropy.norm: The normalized days.entropy, where 1 indicates that the term is spread evenly across days (a discrete uniform distribution)
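To make the two entropy statistics more concrete, here is a rough sketch of how they could be computed for a single term from its frequency per day. This is not necessarily how the function computes them: the helper names `day_entropy` and `day_entropy_norm` are hypothetical, and the choice of Shannon entropy with a base-2 logarithm, normalized by the logarithm of the number of days, is an assumption.

```r
# Hypothetical helper: Shannon entropy (base 2, an assumption) of a term's
# frequency distribution over days.
day_entropy <- function(freq_per_day) {
  p <- freq_per_day / sum(freq_per_day)  # share of occurrences on each day
  p <- p[p > 0]                          # treat 0 * log(0) as 0
  -sum(p * log2(p))
}

# Hypothetical helper: divide by the maximum possible entropy, log2(number of
# days), so that an even spread over all days gives 1.
# Undefined (NaN) when there is only one day.
day_entropy_norm <- function(freq_per_day) {
  day_entropy(freq_per_day) / log2(length(freq_per_day))
}

uniform <- c(5, 5, 5, 5)   # term used equally often on each of 4 days
bursty  <- c(20, 0, 0, 0)  # term used on a single day only

norm_uniform <- day_entropy_norm(uniform)  # 1: discrete uniform distribution
norm_bursty  <- day_entropy_norm(bursty)   # 0: all occurrences on one day
```

Under these assumptions, values close to 1 point to terms used steadily over time, while values close to 0 point to terms concentrated on a few days.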