gofastr (version 0.3.0)

select_documents: Select Documents from a TermDocumentMatrix/DocumentTermMatrix

Description

Select documents from a TermDocumentMatrix or DocumentTermMatrix whose names match a regular expression.

Usage

select_documents(x, pattern, invert = FALSE, ...)

Arguments

x

A TermDocumentMatrix or DocumentTermMatrix.

pattern

A regex pattern used to select documents.

invert

logical. If TRUE, the selection is inverted so that documents matching the pattern are excluded rather than kept.

...

Other arguments passed to grepl (perl = TRUE is hard coded).
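The interaction between pattern, invert, and the arguments forwarded to grepl can be sketched in base R. This is only an illustration, not the package's source: it assumes matching is done on document names, uses a plain matrix in place of a DocumentTermMatrix, and the helper select_rows and the document names are hypothetical.

```r
# Toy matrix standing in for a DocumentTermMatrix:
# rows are documents, columns are terms (names are made up).
m <- matrix(
  1:8, nrow = 4,
  dimnames = list(
    c("time1_OBAMA", "time1_ROMNEY", "time2_OBAMA", "time2_ROMNEY"),
    c("tax", "jobs")
  )
)

# Hypothetical helper that mimics the documented behaviour:
# match document names with grepl (perl = TRUE fixed), optionally negate.
select_rows <- function(m, pattern, invert = FALSE, ...) {
  hit <- grepl(pattern, rownames(m), perl = TRUE, ...)
  if (invert) hit <- !hit
  m[hit, , drop = FALSE]
}

kept    <- select_rows(m, "romney", ignore.case = TRUE)
dropped <- select_rows(m, "romney", ignore.case = TRUE, invert = TRUE)

rownames(kept)     # documents whose names match
rownames(dropped)  # documents whose names do not match
```

Because perl = TRUE is fixed, extra arguments such as ignore.case are the main use of the forwarded arguments in this sketch.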

Value

Returns a TermDocumentMatrix or DocumentTermMatrix.

Examples

# NOT RUN {
(x <- with(presidential_debates_2012, q_dtm(dialogue, paste(time, person, sep = "_"))))
select_documents(x, 'romney', ignore.case = TRUE)
select_documents(x, '^(?!.*romney).*$', ignore.case = TRUE)      # regex way to invert
select_documents(x, 'romney', ignore.case = TRUE, invert = TRUE) # easier way to invert
(y <- with(presidential_debates_2012, q_tdm(dialogue, paste(time, person, sep = "_"))))
select_documents(y, '[2-3]')
# }
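The two ways of inverting shown in the examples, a negative lookahead and invert = TRUE, should pick the same documents. A minimal base R check of that equivalence on a plain character vector follows; the document names are made up for illustration, and grepl is called directly rather than through select_documents.

```r
# Hypothetical document names in the time_person style used above
docs <- c("time1_OBAMA", "time1_ROMNEY", "time2_LEHRER", "time2_ROMNEY")

# Regex way: negative lookahead matches only names that lack "romney"
lookahead <- grepl("^(?!.*romney).*$", docs, perl = TRUE, ignore.case = TRUE)

# Simpler way: match "romney", then negate the logical vector
negated <- !grepl("romney", docs, perl = TRUE, ignore.case = TRUE)

same <- identical(lookahead, negated)
docs[negated]
```

Negating the match keeps the pattern readable, which is the reason to prefer invert = TRUE over a lookahead.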
