A subcorpus_bundle object combines a set of subcorpus objects in a list in the slot objects. The class inherits from the partition_bundle and the bundle class. Typically, a subcorpus_bundle is generated by applying the split-method on a corpus or subcorpus.
# S4 method for subcorpus_bundle
show(object)
# S4 method for subcorpus_bundle
merge(x, name = "", verbose = FALSE)
# S4 method for subcorpus
merge(x, y, ...)
# S4 method for subcorpus
split(
  x,
  s_attribute,
  values = NULL,
  prefix = "",
  mc = getOption("polmineR.mc"),
  verbose = TRUE,
  progress = FALSE,
  type = get_type(x)
)
# S4 method for corpus
split(
  x,
  s_attribute,
  values = NULL,
  prefix = "",
  mc = getOption("polmineR.mc"),
  verbose = TRUE,
  progress = FALSE,
  type = get_type(x),
  xml = "flat"
)
# S4 method for subcorpus_bundle
split(
  x,
  s_attribute,
  prefix = "",
  progress = TRUE,
  mc = getOption("polmineR.mc")
)
object: An object of class subcorpus_bundle.
x: A corpus, subcorpus, or subcorpus_bundle object.
name: The name of the new subcorpus object.
verbose: Logical, whether to provide progress information.
y: A subcorpus to be merged with x.
...: Further subcorpus objects to be merged with x and y.
s_attribute: The s-attribute to vary.
values: The values of the s-attribute to split by.
prefix: A character vector that will be attached as a prefix to partition names.
mc: Logical, whether to use multicore parallelization.
progress: Logical, whether to show a progress bar.
type: The type of partition to generate.
xml: A logical value.
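As a rough illustration of how the arguments interact, the sketch below restricts a split to selected values of an s-attribute and then merges the resulting bundle back into a single subcorpus. It assumes the GERMAPARLMINI demo corpus is available after calling use("polmineR"), and that values limits the split to the listed dates; the object names and the name "two_days" are made up for this example.

```r
library(polmineR)
use("polmineR") # assumption: makes the demo corpus GERMAPARLMINI available

# Split by date, keeping only two selected dates (assumed effect of `values`)
two_days <- split(
  corpus("GERMAPARLMINI"),
  s_attribute = "date",
  values = c("2009-10-27", "2009-10-28"),
  prefix = "day"
)
names(two_days)

# Merge the subcorpus objects in the bundle into one subcorpus
merged <- merge(two_days, name = "two_days")
s_attributes(merged, "date")
```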
Applying the split-method to a subcorpus_bundle object iterates through the subcorpus objects in the bundle and applies split to each of them, splitting it up by the s-attribute provided by the argument s_attribute. The return value is a subcorpus_bundle, the names of which will be the names of the incoming partition_bundle concatenated with the s-attribute values used for splitting. The argument prefix can be used to achieve a more descriptive name.
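A minimal sketch of this two-step split, assuming the GERMAPARLMINI demo corpus is available after calling use("polmineR"). The object names and the prefix "speech" are chosen for illustration only; inspecting names() shows how the names of the incoming bundle and the s-attribute values are combined.

```r
library(polmineR)
use("polmineR") # assumption: makes the demo corpus GERMAPARLMINI available

# First split: one subcorpus per date
by_date <- split(corpus("GERMAPARLMINI"), s_attribute = "date")

# Second split: each date subcorpus is split up by speaker
by_date_speaker <- split(by_date, s_attribute = "speaker", prefix = "speech")

# Compare the size of the two bundles and inspect the generated names
length(by_date)
length(by_date_speaker)
head(names(by_date_speaker))
```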
library(polmineR)
use("polmineR")
# Split a corpus into one subcorpus per document id
corpus("REUTERS") %>% split(s_attribute = "id") %>% summary()
# Merge multiple subcorpus objects
a <- corpus("GERMAPARLMINI") %>% subset(date == "2009-10-27")
b <- corpus("GERMAPARLMINI") %>% subset(date == "2009-10-28")
c <- corpus("GERMAPARLMINI") %>% subset(date == "2009-11-10")
y <- merge(a, b, c)
s_attributes(y, "date")
# Split a subcorpus by speaker
sc <- subset("GERMAPARLMINI", date == "2009-11-11")
b <- split(sc, s_attribute = "speaker")
# Corresponding workflow with partition and partition_bundle
p <- partition("GERMAPARLMINI", date = "2009-11-11")
y <- partition_bundle(p, s_attribute = "speaker")
# Split a whole corpus by date
gparl <- corpus("GERMAPARLMINI")
b <- split(gparl, s_attribute = "date")
# Split up the objects in a subcorpus_bundle by applying split a second time
use("polmineR")
y <- corpus("GERMAPARLMINI") %>%
  split(s_attribute = "date") %>%
  split(s_attribute = "speaker")
summary(y)