build.corpus

A list of strings or a corpus from the tm package.

corpus

A vector of +1/-1 or TRUE/FALSE indicating which documents are considered relevant
and which are baseline.  A +1/-1 vector may also contain 0, which means the
corresponding document is dropped.

labeling

List of words that should be dropped from consideration.

banned

Level of output.  0 is no printed output.

verbosity

token.type
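
The labeling conventions listed above can be illustrated in base R. This is
only a sketch: the helper normalize_labels is hypothetical and not part of the
package, and it assumes that TRUE corresponds to +1 (relevant) and FALSE to -1
(baseline).

```r
# Hypothetical helper (not part of textreg): map a labeling vector to +1/-1
# and report which documents a 0 label would drop.
normalize_labels <- function(labeling) {
  if (is.logical(labeling)) {
    # Assumption: TRUE marks relevant (+1), FALSE marks baseline (-1).
    labeling <- ifelse(labeling, 1, -1)
  }
  # Only +1, -1 and 0 are meaningful labels.
  stopifnot(all(labeling %in% c(-1, 0, 1)))
  keep <- labeling != 0
  list(labels = labeling[keep], keep = keep)
}

res <- normalize_labels(c(1, -1, 0, 1))
res$labels  # labels of the documents that are kept
res$keep    # FALSE marks the dropped document
```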

Pre-building a corpus allows multiple textreg calls without repeating the
initial data processing (e.g., if you want to explore different ban lists or
regularization parameters).
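
A pre-built corpus might be reused along the following lines. This is a sketch
under stated assumptions: the toy documents, labels, and ban list are made up,
the textreg package must be installed for the guarded part to run, and the
argument name C for the regularization parameter is an assumption that should
be checked against the textreg help page.

```r
# Toy data, invented for illustration only.
docs <- c("the cat sat on the mat",
          "the dog sat on the log",
          "a cat chased the dog",
          "a bird sang in the tree")
labels <- c(1, 1, -1, -1)   # +1 = relevant, -1 = baseline
ban.list <- c("the", "a")   # words dropped from consideration

if (requireNamespace("textreg", quietly = TRUE)) {
  # Do the initial data processing once.
  built <- textreg::build.corpus(corpus = docs, labeling = labels,
                                 banned = ban.list, verbosity = 0)

  # Reuse the built corpus across calls.
  # Assumption: C is the name of the regularization parameter.
  fit.loose <- textreg::textreg(built, C = 1, verbosity = 0)
  fit.tight <- textreg::textreg(built, C = 2, verbosity = 0)
}
```

The built corpus is created once and passed to each call, so only the
regression itself is repeated when the regularization value changes.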

Function for sparse regression on raw text, regressing a labeling
vector onto a feature space consisting of all possible phrases.

build.corpus: Build a corpus that can be used in the textreg call.

Description

Usage

Arguments

Value

Details

Examples