phm (version 2.1.2)

phraseDoc: phraseDoc Creation

Description

Create an object of class phraseDoc. The object holds all principal phrases that occur at least a minimum number of times in a collection of texts, together with the texts each phrase occurs in and its positions within those texts.

Usage

phraseDoc(
  text,
  ids = NULL,
  mn = 2,
  mx = 8,
  ssw = stopStartWords(),
  sew = stopEndWords(),
  sp = stopPhrases(),
  min.freq = 2,
  max.phrases = 1500,
  shiny = FALSE,
  silent = TRUE
)

Value

Object of class phraseDoc

Arguments

text

A character vector in which each element is the text of one document, or a corpus.

ids

A character vector with one identifier for each text.

mn

Minimum number of words in a phrase.

mx

Maximum number of words in a phrase.

ssw

A set of words no phrase should start with.

sew

A set of words no phrase should end with.

sp

A set of phrases to be excluded.

min.freq

Minimum number of times a phrase must occur to be included.

max.phrases

Maximum number of phrases to be included.

shiny

TRUE if called from a shiny program. This allows progress to be recorded on a progress meter. The function uses about 100 progress steps, so the call should be made inside a withProgress function with the argument max set to at least 100.

silent

TRUE if you do not want progress messages.
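
The shiny argument is easier to follow with a concrete call. The following is a minimal sketch, assuming the shiny and phm packages are installed; the input and output names (run, result), the docs vector, and the progress message are hypothetical and chosen only for illustration. It wraps phraseDoc in withProgress with max = 100, as the shiny argument description recommends. Setting value = 0 is an assumption made so that the meter starts empty.

```r
library(shiny)
library(phm)

# Hypothetical sample documents, used only for illustration
docs <- c("This is a test text",
          "This is a test text 2",
          "This is another test text",
          "This is another test text 2")

ui <- fluidPage(
  actionButton("run", "Extract phrases"),
  verbatimTextOutput("result")
)

server <- function(input, output, session) {
  # Build the phraseDoc only when the button is pressed
  pd <- eventReactive(input$run, {
    # max = 100 matches the roughly 100 progress steps the function uses
    withProgress(message = "Extracting phrases", value = 0, max = 100, {
      phraseDoc(docs, shiny = TRUE)
    })
  })
  # Show the class of the returned object
  output$result <- renderPrint(class(pd()))
}

app <- shinyApp(ui, server)

# Launch the app only in an interactive session
if (interactive()) runApp(app)
```

Outside a shiny session there is no progress meter to update, so shiny is best left at its default of FALSE there.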

Examples

# Six short example documents
tst <- c("This is a test text",
         "This is a test text 2",
         "This is another test text",
         "This is another test text 2",
         "This girl will test text that man",
         "This boy will test text that man")
# Create the phraseDoc object with the default settings
phraseDoc(tst)
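
The example above relies on the defaults. As a further sketch, the call below sets several of the documented arguments explicitly. The identifiers in doc_ids and the chosen values for mx and silent are illustrative assumptions, not recommendations from the package.

```r
library(phm)

docs <- c("This is a test text",
          "This is a test text 2",
          "This is another test text",
          "This is another test text 2",
          "This girl will test text that man",
          "This boy will test text that man")

# Hypothetical identifiers, one per document
doc_ids <- paste0("doc", seq_along(docs))

pd <- phraseDoc(
  docs,
  ids = doc_ids,     # identifier for each text
  mn = 2,            # phrases have at least 2 words
  mx = 4,            # and at most 4 words
  min.freq = 2,      # keep phrases occurring at least twice
  silent = FALSE     # show progress messages
)

# The result is an object of class phraseDoc
class(pd)
```

Assigning the result to a variable keeps the object available for later inspection instead of only printing it.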
