polmineR (version 0.7.9)

count: Get counts.

Description

Count all tokens, the number of occurrences of a query (CQP syntax may be used), or the individual matches for a query.

Usage

count(.Object, ...)

# S4 method for partition
count(.Object, query = NULL, cqp = is.cqp, breakdown = FALSE, decode = TRUE,
  p_attribute = getOption("polmineR.p_attribute"), mc = getOption("polmineR.cores"),
  verbose = TRUE, progress = FALSE, ...)

# S4 method for partition_bundle
count(.Object, query = NULL, cqp = FALSE, p_attribute = NULL, freq = FALSE,
  total = TRUE, mc = FALSE, progress = TRUE, verbose = FALSE, ...)

# S4 method for character
count(.Object, query = NULL, cqp = is.cqp,
  p_attribute = getOption("polmineR.p_attribute"), breakdown = FALSE,
  sort = FALSE, decode = TRUE, verbose = TRUE, ...)

# S4 method for vector
count(.Object, corpus, p_attribute, ...)

# S4 method for Corpus
count(.Object, query = NULL, p_attribute)

Arguments

.Object

A partition or partition_bundle, or a length-one character vector providing the name of a corpus.

...

Further arguments.

query

A character vector (one or multiple terms); CQP syntax can be used.

cqp

Either logical (TRUE if query is a CQP query), or a function that checks whether query is a CQP query (defaults to the is.cqp auxiliary function).

breakdown

Logical, whether to report number of occurrences for different matches for a query.

decode

Logical, whether to turn token ids into decoded strings (only if query is NULL).

p_attribute

The p-attribute(s) to use.

mc

Logical, whether to use multicore. The default is FALSE in the partition_bundle method and getOption("polmineR.cores") in the partition method.

verbose

Logical, whether to be verbose.

progress

Logical, whether to show progress bar.

freq

Logical; if FALSE, only counts are reported; if TRUE, (relative) frequencies are added to the table.

total

Logical, defaults to TRUE. If TRUE, the total of the counts is appended to the returned data.table as a column named 'TOTAL'.

sort

Logical, whether to sort table with counts (in stat slot).

corpus

The name of a CWB corpus.

Value

A data.table if argument query is used; a count object if query is NULL and .Object is a character vector (referring to a corpus) or a partition; a count_bundle object if .Object is a partition_bundle.

Details

If .Object is a partition_bundle, the data.table returned will have the queries in the columns, and as many rows as there are partitions in the partition_bundle.
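The layout described above can be sketched in base R without polmineR or a CWB corpus. This is only an illustration of the shape of the result, not the package's implementation: the toy `partitions` list, its names, and the tokens are hypothetical, and treating 'TOTAL' as the row sum over the query columns is an assumption.

```r
# Toy stand-in for a partition_bundle: a named list of token vectors.
# Hypothetical data; real partitions come from partition_bundle().
partitions <- list(
  day_1 = c("Arbeit", "Zukunft", "Arbeit", "Migration"),
  day_2 = c("Zukunft", "Zukunft", "Freizeit"),
  day_3 = c("Arbeit", "Migration", "Migration", "Migration")
)
queries <- c("Arbeit", "Migration", "Zukunft")

# Count each query in each partition, then transpose so that
# rows are partitions and columns are queries.
counts <- t(sapply(partitions, function(tokens) {
  sapply(queries, function(q) sum(tokens == q))
}))

result <- data.frame(
  partition = rownames(counts), counts,
  row.names = NULL, stringsAsFactors = FALSE
)

# Assumption: 'TOTAL' is taken here as the sum over the query columns.
result$TOTAL <- rowSums(counts)
print(result)
```

The table has three rows (one per toy partition) and one column per query, plus the partition name and the 'TOTAL' column.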

If .Object is a length-one character vector and query is NULL, the count is performed for the whole corpus.

If breakdown is TRUE and one query is supplied, the function returns a frequency breakdown of the results of the query. If several queries are supplied, frequencies for the individual queries are retrieved.
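What a frequency breakdown means can be shown with a small base R sketch that does not depend on polmineR. The token vector is hypothetical toy data, and the regular expression only mimics a CQP query such as '"Integration.*"'; the column names and the `share` column are choices made for this illustration and say nothing about the columns polmineR returns.

```r
# Toy token stream standing in for a corpus (hypothetical data).
tokens <- c("Integration", "Integrationspolitik", "Arbeit", "Integration",
            "Integrationskurse", "Integration", "Zukunft")

# Tokens matching the pattern, analogous to the CQP query '"Integration.*"'.
hits <- tokens[grepl("^Integration.*$", tokens)]

# A plain count reports only the total number of matches.
total_hits <- length(hits)

# A breakdown reports occurrences per distinct matched form,
# most frequent first.
breakdown <- sort(table(hits), decreasing = TRUE)
breakdown_df <- data.frame(
  match = names(breakdown),
  count = as.integer(breakdown),
  share = round(as.integer(breakdown) / total_hits, 3),
  stringsAsFactors = FALSE
)
print(breakdown_df)
```

The plain count collapses all matched forms into one number, while the breakdown keeps one row per distinct form, which is what makes it useful for inspecting what a wildcard query actually catches.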

References

Baker, Paul (2006): Using Corpora in Discourse Analysis. London: Continuum, pp. 47-69 (ch. 3).

See Also

For a metadata-based breakdown of counts (i.e. tabulation by s-attributes), see dispersion.

Examples

# NOT RUN {
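library(polmineR)
use("polmineR") # make the corpora included in the polmineR package available

# a partition covering all dates
debates <- partition("GERMAPARLMINI", date = ".*", regex = TRUE)
count(debates, query = "Arbeit") # get frequencies for one token
count(debates, query = c("Arbeit", "Freizeit", "Zukunft")) # get frequencies for multiple tokens

# count directly on a corpus, identified by its name
count("GERMAPARLMINI", query = c("Migration", "Integration"), p_attribute = "word")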

debates <- partition_bundle(
  "GERMAPARLMINI", s_attribute = "date", values = NULL,
  regex = TRUE, mc = FALSE, verbose = FALSE
)
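y <- count(debates, query = "Arbeit", p_attribute = "word") # one query
y <- count(debates, query = c("Arbeit", "Migration", "Zukunft"), p_attribute = "word") # several queries

# CQP query with a frequency breakdown of the matches, on the whole corpus
count("GERMAPARLMINI", query = '"Integration.*"', breakdown = TRUE)

# the same breakdown, restricted to a single day
P <- partition("GERMAPARLMINI", date = "2009-11-11")
count(P, query = '"Integration.*"', breakdown = TRUE)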
# }