multicastR (version 1.2.0)

multicast: Access Multi-CAST annotation data

Description

multicast downloads the Multi-CAST annotation data from the servers of the University of Bamberg and returns them as a data.table. Because the Multi-CAST collection can be extended with additional data sets and annotation schemes, multicast takes an optional argument for selecting earlier versions of the annotation data, which supports scientific accountability and reproducibility.

Usage

multicast(vkey, legacy.colnames = FALSE)

Arguments

vkey

A numeric or character vector of length 1 specifying the requested version of the annotation values. Must be one of the four-digit version keys in the first column of mc_index, or empty. If empty or no value is supplied, multicast automatically retrieves the most recent version of the annotations. See the examples below for an illustration.

legacy.colnames

If TRUE, renames the text and gword columns to what they were called prior to version 1.1.0 of the package (i.e. file, word). This option will be removed in the future.
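
The constraints on vkey described above can be made concrete with a small check. The sketch below is not part of multicastR: is_valid_vkey is a hypothetical helper, and known_keys is a stand-in for the first column of mc_index filled with illustrative values (only 1905 appears in this page's examples).

```r
# Hypothetical helper, not part of multicastR: checks that a version key is
# a length-1 numeric or character value that matches one of the known keys.
is_valid_vkey <- function(vkey, known_keys) {
  if (length(vkey) != 1L) return(FALSE)
  if (!is.numeric(vkey) && !is.character(vkey)) return(FALSE)
  key <- as.character(vkey)
  grepl("^[0-9]{4}$", key) && key %in% as.character(known_keys)
}

# Stand-in for the first column of mc_index; values are illustrative only.
known_keys <- c("1905", "2001")

is_valid_vkey(1905, known_keys)           # numeric key
is_valid_vkey("1905", known_keys)         # the same key as a character string
is_valid_vkey(c(1905, 2001), known_keys)  # length 2, so rejected
```

In practice the set of valid keys would be read from mc_index rather than typed by hand.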

Value

A data.table with eleven columns:

[, 1] corpus

The name of the corpus.

[, 2] text

The title of the text. If legacy.colnames is TRUE, this column is named file instead.

[, 3] uid

The utterance identifier. Uniquely identifies an utterance within a text.

[, 4] gword

Grammatical words. The tokenized utterances in the object language. If legacy.colnames is TRUE, this column is named word instead.

[, 5] gloss

Morphological glosses following the Leipzig Glossing Rules.

[, 6] graid

Annotations using the GRAID scheme (Haig & Schnell 2014).

[, 7] gform

The form symbol of a GRAID gloss.

[, 8] ganim

The person-animacy symbol of a GRAID gloss.

[, 9] gfunc

The function symbol of a GRAID gloss.

[, 10] refind

Referent tracking using the RefIND scheme (Schiborr et al. 2018).

[, 11] reflex

The information status of newly introduced referents, using a simplified version of the RefLex scheme (Riester & Baumann 2017).
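
To illustrate the column layout, the following sketch builds a tiny stand-in table with the same eleven column names and performs two simple operations on it. It is an assumption-laden example: every value is invented, and a base R data.frame is used instead of a data.table so that it runs without additional packages.

```r
# Tiny stand-in for the table returned by multicast(); all values are
# invented for illustration and do not come from the Multi-CAST data.
mc <- data.frame(
  corpus = "example",
  text   = "story01",
  uid    = c("0001", "0001", "0002"),
  gword  = c("w1", "w2", "w3"),
  gloss  = c("g1", "g2", "g3"),
  graid  = c("np.h:s", "v:pred", "pro.h:s"),
  gform  = c("np", "v", "pro"),
  ganim  = c("h", "", "h"),
  gfunc  = c("s", "pred", "s"),
  refind = c("0001", "", "0001"),
  reflex = c("new", "", ""),
  stringsAsFactors = FALSE
)

# Count tokens per GRAID function symbol.
func_counts <- table(mc$gfunc)

# Mimic the effect of legacy.colnames = TRUE on a copy of the table:
# text becomes file, gword becomes word.
legacy <- mc
names(legacy)[names(legacy) == "text"]  <- "file"
names(legacy)[names(legacy) == "gword"] <- "word"
```

On the actual data.table returned by multicast, data.table::setnames would be the more idiomatic way to rename columns.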

Licensing

The Multi-CAST annotation data accessed by the multicast method is published under a Creative Commons Attribution 4.0 International (CC-BY 4.0) licence (https://creativecommons.org/licenses/by-sa/4.0/). Please refer to the collection documentation for information on how to give proper credit to its contributors.

Citing Multi-CAST

Data from the Multi-CAST collection should be cited as:

If you need to cite this package on its own, please refer to citation("multicastR").

References

See Also

mc_index, mc_referents

Examples

# NOT RUN {
  # retrieve and print the most recent version of the
  # Multi-CAST annotations
  multicast()

  # retrieve and print the version of the annotation data
  # published in May 2019
  multicast(1905)   # or: multicast("1905")
# }