Often one has an R factor in which one or more levels are rare in the
data. This could cause problems, say in performing cross-validation; a
level in the test set might be "new," not having appeared in the
training set. To address this, factorToTopLevels removes rare
levels from a factor; dataToTopLevels applies the same operation to
every factor in a data frame.
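
The idea can be illustrated in base R. The following is only a minimal sketch, not the package's implementation: keepFrequentLevels and its minCount argument are hypothetical names introduced here, and treating observations in rare levels as NA is an assumption about what "remove" means.

```r
# Hypothetical helper (not part of the package): keep only the levels
# that occur at least minCount times. Observations belonging to rarer
# levels become NA, and those levels are dropped from the factor.
keepFrequentLevels <- function(f, minCount = 2) {
  counts <- table(f)
  keep <- names(counts)[counts >= minCount]
  factor(as.character(f), levels = keep)
}

f <- factor(c("a", "a", "a", "b", "b", "c"))
g <- keepFrequentLevels(f, minCount = 2)
levels(g)       # "a" "b"
sum(is.na(g))   # 1, the single observation of the rare level "c"
```
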
Relatedly, the function levelCounts simply applies table() to each
column of the data, returning the results as an R list. (If there
are more than 10 levels, it returns NA.)
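
A rough base-R equivalent of this per-column tabulation might look as follows. The helper name countLevels and its maxLevels argument are hypothetical, and applying the 10-level cutoff separately to each column is an assumption rather than something stated above.

```r
# Hypothetical sketch (not the package's code): tabulate each column,
# returning NA for any column with more than maxLevels distinct values.
countLevels <- function(d, maxLevels = 10) {
  lapply(d, function(col) {
    if (length(unique(col)) > maxLevels) NA else table(col)
  })
}

d <- data.frame(u = factor(rep(c("x", "y", "x"), 4)), id = 1:12)
counts <- countLevels(d)
counts$u    # counts for levels x and y
counts$id   # NA, since id has 12 distinct values
```
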
The function cartesianFactor generates a "superfactor" from
individual ones; e.g. if factors f1 and f2 have n1 and n2 levels, the
output is a new factor with n1 * n2 levels.
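
The n1 * n2 level count can be checked with base R's interaction(), used here only as a stand-in; the arguments and level naming of cartesianFactor itself are not described in this section.

```r
# Base-R stand-in for building a combined factor from two factors.
f1 <- factor(c("a", "b", "a"))    # n1 = 2 levels
f2 <- factor(c("x", "y", "z"))    # n2 = 3 levels

sf <- interaction(f1, f2)         # all level combinations are retained
nlevels(sf)                       # 6, i.e. n1 * n2
```

Note that the combined factor includes every combination of levels, even those that never occur in the data, so it can itself contain rare or empty levels.
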
The function qeRareLevels checks each column of a data frame to
see whether it is an R factor with rare levels.
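
A check of this kind could be sketched in base R as below. The helper hasRareLevels and its minCount threshold are hypothetical; how qeRareLevels defines "rare" and what it returns are not specified here.

```r
# Hypothetical sketch (not the package's code): flag each column that
# is a factor with at least one level occurring fewer than minCount times.
hasRareLevels <- function(d, minCount = 2) {
  sapply(d, function(col) is.factor(col) && any(table(col) < minCount))
}

d <- data.frame(
  u = factor(c("a", "a", "b")),   # level "b" occurs only once
  v = factor(c("p", "p", "p")),   # no rare levels
  w = c(1.5, 2, 3)                # not a factor
)
res <- hasRareLevels(d)
res   # u is TRUE; v and w are FALSE
```
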