polmineR (version 0.7.4)

as.speeches: Split partition into speeches

Description

A method designed for the PolMine corpora of plenary protocols. It splits a partition into speeches.

Usage

# S4 method for partition
as.speeches(.Object, sAttributeDates, sAttributeNames,
  gap = 500, mc = FALSE, verbose = TRUE, progress = TRUE)

Arguments

.Object

a partition object

sAttributeDates

the s-attribute that provides the dates of sessions

sAttributeNames

the s-attribute that provides the names of speakers

gap

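number of tokens between strucs used to identify speeches: strucs of the same speaker that are separated by more than this number of tokens are treated as separate speeches (defaults to 500)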

mc

logical, whether to use multicore processing; defaults to FALSE

verbose

logical, whether to print messages; defaults to TRUE

progress

logical, whether to show progress; defaults to TRUE
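
The role of the gap argument can be illustrated with a minimal base R sketch. It is not the polmineR implementation: the helper group_regions() and the corpus positions are hypothetical, and the sketch assumes that consecutive regions (strucs) of one speaker belong to the same speech unless the distance between them exceeds gap.

```r
# Hypothetical helper, not part of polmineR: assign a speech id to each
# region of a single speaker. A new speech starts whenever the distance to
# the previous region exceeds `gap` (assumed interpretation of the argument).
group_regions <- function(cpos_left, cpos_right, gap = 500) {
  stopifnot(length(cpos_left) == length(cpos_right))
  n <- length(cpos_left)
  if (n == 0) return(integer(0))
  # difference between the start of each region and the end of the previous one
  distance <- cpos_left[-1] - cpos_right[-n]
  # the first region always opens a speech
  starts_new <- c(TRUE, distance > gap)
  # running count of speech starts yields one id per region
  cumsum(starts_new)
}

# Made-up corpus positions for four regions of one speaker
left  <- c(0, 120, 900, 1000)
right <- c(100, 300, 950, 1200)
speech_id <- group_regions(left, right, gap = 500)
print(speech_id)  # 1 1 2 2
```

Here the second region follows the first closely and is merged with it, while the third region starts 600 positions after the second one ends and therefore opens a second speech. A smaller gap would split the material into more, shorter speeches.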

Value

a partitionBundle object

Examples

# NOT RUN {
  use("polmineR.sampleCorpus")
  bt <- partition("PLPRBTTXT", text_year = "2009")
  speeches <- as.speeches(bt, sAttributeDates = "text_date", sAttributeNames = "text_name")
  
  # step-by-step, not the fastest way
  speeches <- enrich(speeches, pAttribute = "word")
  tdm <- as.TermDocumentMatrix(speeches, col = "count")
  
  # fast option (counts performed when assembling the sparse matrix)
  # tdm <- as.TermDocumentMatrix(speeches, pAttribute = "word")
  # termsToDropList <- noise(tdm)
  # whatToDrop <- c("stopwords", "specialChars", "numbers", "minNchar")
  # termsToDrop <- unlist(lapply(whatToDrop, function(x) termsToDropList[[x]]))
  # tdm <- trim(tdm, termsToDrop = termsToDrop)
# }