A newer version (0.1.5) of this package is available.

textreg (version 0.1.1)

n-gram Text Regression, aka Concise Comparative Summarization

Description

Function for sparse regression on raw text, regressing a labeling vector onto a feature space consisting of all possible phrases.
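To make "a feature space consisting of all possible phrases" concrete, here is a minimal base-R sketch that enumerates every contiguous word phrase in two toy documents and builds a binary document-by-phrase matrix. It does not use textreg itself: `all_phrases`, the `max_len` cap, and the toy sentences are assumptions made for illustration, and the package's own search over phrases may work quite differently.

```r
# Hypothetical helper (not part of textreg): list every contiguous
# word phrase of up to `max_len` words found in one document.
all_phrases <- function(text, max_len = 3) {
  words <- strsplit(tolower(text), "[^a-z]+")[[1]]
  words <- words[nzchar(words)]
  phrases <- character(0)
  for (n in seq_len(min(max_len, length(words)))) {
    starts <- seq_len(length(words) - n + 1)
    grams <- vapply(
      starts,
      function(i) paste(words[i:(i + n - 1)], collapse = " "),
      character(1)
    )
    phrases <- c(phrases, grams)
  }
  unique(phrases)
}

# Toy documents, invented for this sketch.
docs <- c("the worker fell from the ladder", "the worker was burned")

# The candidate feature space: every phrase seen in any document.
features <- sort(unique(unlist(lapply(docs, all_phrases))))

# Binary appearance matrix: one row per document, one column per phrase.
# Padding with spaces makes the fixed-string match respect word boundaries.
padded <- paste0(" ", docs, " ")
X <- sapply(features, function(p) {
  as.integer(grepl(paste0(" ", p, " "), padded, fixed = TRUE))
})

print(dim(X))
print(X[, c("the worker", "fell from the")])
```

Even two short sentences produce a couple of dozen candidate phrases, which is why a sparse regression that selects only a handful of them is useful for summarizing how labeled documents differ.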

Install

install.packages('textreg')

Monthly Downloads

479

Version

0.1.1

License

GPL (>= 2)

Maintainer

Luke Miratrix

Last Published

February 2nd, 2015

Functions in textreg (0.1.1)

grab.fragments

Grab all fragments in a corpus with given phrase.
make.similarity.matrix

Calculate similarity matrix for set of phrases.
cpp_textreg

Driver function for the C++ function.
make.appearance.matrix

Make phrase appearance matrix from textreg result.
make.count.table

Count number of times documents have a given phrase.
path.matrix.chart

Plot optimization path of textreg.
is.textreg.result

Is object a textreg.result object?
find.CV.C

K-fold cross-validation to determine optimal tuning parameter.
save.corpus.to.files

Save corpus to text (and RData) file.
phrase.count

Count phrase appearance.
clean.text

Clean text and get it ready for textreg.
sample.fragments

Sample fragments of text to contextualize a phrase.
textreg-package

Sparse regression package for text that allows for multiple word phrases.
make.phrase.correlation.chart

Generate visualization of phrase overlap.
make.phrase.matrix

Make a table of where phrases appear in a corpus.
make.list.table

Collate multiple regression runs.
stem.corpus

Stem corpus with annotation.
dirtyBathtub

Sample of raw-text OSHA accident summaries.
predict.textreg.result

Predict labeling with the selected phrases.
print.textreg.result

Pretty print results of textreg regression.
reformat.textreg.model

Clean up output from textreg.
is.fragment.sample

Is object a fragment.sample object?
tm_gregexpr

Call gregexpr on the content of a tm Corpus.
list.table.chart

Graphic showing multiple word lists side-by-side.
make.CV.chart

Plot K-fold cross-validation curves.
cluster.phrases

Cluster phrases based on similarity of appearance.
phrase.matrix

Make matrix of where phrases appear in corpus.
make_search_phrases

Convert phrases to appropriate search string.
print.fragment.sample

Pretty print results of phrase sampling object.
find.threshold.C

Conduct permutation test on labeling to get null distribution of regularization parameter.
plot.textreg.result

Plot the sequence of features as they are introduced with the textreg gradient descent program.
make.path.matrix

Generate matrix describing gradient descent path of textreg.
testCorpora

Some small, fake test corpora.
textreg

Sparse regression of labeling vector onto all phrases in a corpus.
calc.loss

Calculate total loss of model (Squared hinge loss).
bathtub

Sample of cleaned OSHA accident summaries.
convert.tm.to.character

Convert tm corpus to vector of strings.
cpp_build.corpus

Driver function for the C++ function.
print.textreg.corpus

Pretty print textreg corpus object.
is.textreg.corpus

Is object a textreg.corpus object?
build.corpus

Build a corpus that can be used in the textreg call.
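
The calc.loss entry above names the squared hinge loss. As a rough illustration of that quantity only, the base-R sketch below computes it for labels coded as -1/+1 and a vector of model scores. The function name `squared_hinge_loss`, the label coding, and the toy numbers are assumptions for this example; it is not the package's calc.loss and it leaves out any regularization penalty the package may add to the total.

```r
# Hypothetical helper (not textreg's calc.loss): total squared hinge loss
# for labels y in {-1, +1} and real-valued prediction scores.
squared_hinge_loss <- function(y, score) {
  stopifnot(length(y) == length(score), all(y %in% c(-1, 1)))
  margin <- 1 - y * score   # positive when a point is inside the margin or misclassified
  sum(pmax(0, margin)^2)    # points beyond the margin contribute zero
}

# Toy labels and scores, invented for this sketch.
y     <- c(1, -1, 1, -1)
score <- c(2.0, -0.5, 0.5, 1.0)

print(squared_hinge_loss(y, score))
```

Here the first document sits beyond the margin and adds nothing, the second and third each add 0.25, and the misclassified fourth adds 4, for a total of 4.5.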