textreg (version 0.1.3)

clean.text: Clean text and get it ready for textreg.

Description

Converts multiline documents to a single line, strips extra whitespace and punctuation, replaces each digit with an 'X', and converts non-alphabetic characters to spaces.
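The steps described above can be approximated on a plain character vector with base R. This is only a rough sketch to illustrate the kind of transformation involved: `clean_string_sketch` is a hypothetical helper, not part of textreg, and the actual `clean.text` operates on a tm Corpus and may differ in order and detail.

```r
# Hypothetical helper (NOT textreg's implementation): mimics the documented
# cleaning steps on a character vector, one element per document.
clean_string_sketch <- function(x) {
  x <- gsub("[\r\n]+", " ", x)     # multiline -> single line
  x <- gsub("[0-9]", "X", x)       # each digit becomes an 'X'
  x <- gsub("[^A-Za-z]", " ", x)   # punctuation and other non-alpha -> space
  x <- gsub(" +", " ", x)          # collapse runs of spaces
  trimws(x)                        # drop leading/trailing whitespace
}

clean_string_sketch("   big    space\n     dawg-ness?")
```

Digits are replaced before the non-alphabetic pass so that the resulting 'X' placeholders survive it.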

Usage

clean.text(bigcorp)

Arguments

bigcorp
A tm Corpus object.

Examples

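library( tm )
library( textreg )
# Two deliberately messy documents: stray punctuation, digits, extra spaces, a newline
txt <- c( "thhis s! and bonkus  4:33pm and Jan 3, 2015. ",
          "   big    space\n     dawg-ness?")
# Wrap the strings in a tm Corpus, then clean it
a <- clean.text( Corpus( VectorSource( txt ) ) )
# Inspect the first cleaned document
a[[1]]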