manifestoR (version 1.2.4)

mp_corpus: Get documents from the Manifesto Corpus Database

Description

Documents are downloaded from the Manifesto Project Corpus Database. If CMP coding annotations are available, they are attached to the documents; otherwise, raw texts are provided. The documents are cached in working memory to ensure internal consistency, enable offline use and reduce online traffic.

Usage

mp_corpus(ids, apikey = NULL, cache = TRUE, codefilter = NULL,
  codefilter_layer = "cmp_code")

Arguments

ids

Information on which documents to get. This can be either a list of parties (as ids) and election dates, as given to mp_metadata, or a ManifestoMetadata object (data.frame) as returned by mp_metadata. Alternatively, ids can be a logical expression specifying a subset of the Manifesto Project's main dataset. It is evaluated within the data.frame returned by mp_maindataset, so all of that dataset's variables, and functions of them, can be used in the expression.

apikey

API key to use. Defaults to NULL, in which case the API key set via mp_setapikey is used.

cache

Boolean flag indicating whether to use locally cached data if available.

codefilter

A vector of CMP codes to filter the documents: only quasi-sentences with the codes specified in codefilter are returned. If NULL, no filtering is applied.

codefilter_layer

Layer to which the codefilter should apply; defaults to "cmp_code".
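
To make the codefilter argument concrete, the following toy sketch mimics the filtering on a hand-made data.frame in base R. The rows, texts and the stand-in column layout are invented for illustration; this is not manifestoR's implementation, only the behaviour described above (keep quasi-sentences whose code is listed, keep everything if codefilter is NULL).

```r
# Toy stand-in for an annotated document: one row per quasi-sentence.
# Texts and codes are invented for illustration only.
doc <- data.frame(
  text = c("We will expand childcare.",
           "Taxes must fall.",
           "More nurses for hospitals."),
  cmp_code = c(504, 402, 504),
  stringsAsFactors = FALSE
)

codefilter <- c(504)

# Keep only quasi-sentences whose code is listed in codefilter;
# a NULL codefilter means that no filtering is applied.
filtered <- if (is.null(codefilter)) {
  doc
} else {
  doc[doc$cmp_code %in% codefilter, , drop = FALSE]
}

print(filtered$text)

# With a configured API key, the corresponding request might look like:
# mp_corpus(party == 61620 & rile > 10, codefilter = c(504))
```

The commented-out call at the end uses only the arguments listed under Usage; the selection expression is taken from the Examples section.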

Value

An object of Corpus's subclass ManifestoCorpus holding those of the requested documents that are available.

Details

See mp_save_cache for ensuring reproducibility by saving cache and version identifier to the hard drive. See mp_update_cache for updating the locally saved content with the most recent version from the Manifesto Project Database API.
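
A possible caching workflow is sketched below. It assumes the manifestoR package is installed, network access is available, and a valid API key is stored in an environment variable; the variable name MANIFESTO_API_KEY and the file name are hypothetical choices for this example. The use of mp_load_cache and of the key argument of mp_setapikey reflects my understanding of the package and should be checked against the installed version. Without the package or a key, the block does nothing.

```r
# Sketch only: needs the manifestoR package, network access and a valid API key.
# MANIFESTO_API_KEY is a hypothetical environment variable name for this example.
cache_file <- "mp_cache.RData"
api_key <- Sys.getenv("MANIFESTO_API_KEY")

if (requireNamespace("manifestoR", quietly = TRUE) && nzchar(api_key)) {
  library(manifestoR)
  mp_setapikey(key = api_key)

  # Download (and thereby cache) the documents of interest.
  corpus <- mp_corpus(party == 61620 & rile > 10)

  # Write the cache, together with its version identifier, to disk.
  mp_save_cache(file = cache_file)

  # In a later session, restore this saved state before making new requests.
  mp_load_cache(file = cache_file)

  # Only when newer data is explicitly wanted: refresh the cached content.
  # mp_update_cache()
}
```

Keeping mp_update_cache commented out is deliberate here: updating replaces cached content with the most recent version, which is usually not wanted when the goal is to reproduce an earlier analysis.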

Examples

# NOT RUN {
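# Assumes an API key has already been set via mp_setapikey.

# Select documents with a logical expression evaluated on the main dataset
corpus <- mp_corpus(party == 61620 & rile > 10)

# Select documents by explicit party ids and election dates
wanted <- data.frame(party=c(41320, 41320), date=c(200909, 201309))
mp_corpus(wanted)

# Pass a subset of the main dataset directly
mp_corpus(subset(mp_maindataset(), countryname == "France"))

# Request documents of which possibly only some are available;
# the returned corpus holds the available ones
partially_available <- data.frame(party=c(41320, 41320), date=c(200909, 200509))
mp_corpus(partially_available)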
# }
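
The first example above passes a logical expression as ids. The toy sketch below illustrates, in base R and on an invented data.frame, what it means for such an expression to be evaluated within a data.frame. The helper select_rows and all values are hypothetical, and this is not how manifestoR is implemented internally; it only shows why column names such as party and rile can be used directly.

```r
# Made-up stand-in for a few columns of the main dataset (values invented).
main <- data.frame(
  party = c(61620, 61620, 41320),
  date  = c(200411, 200811, 200909),
  rile  = c(25.3, 4.1, -12.0)
)

# Hypothetical helper: evaluate an unquoted logical expression inside the
# data.frame, so that column names (and functions of them) resolve to its
# variables, then return the party/date pairs of the matching rows.
select_rows <- function(data, expr) {
  keep <- eval(substitute(expr), data, parent.frame())
  data[keep, c("party", "date"), drop = FALSE]
}

ids <- select_rows(main, party == 61620 & rile > 10)
print(ids)

# Functions of the variables work as well, e.g. a comparison with the mean.
above_mean <- select_rows(main, rile > mean(rile))
print(above_mean)
```

The resulting party/date pairs have the same shape as the wanted data.frame in the second example, which is the other form that ids accepts.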
