multicastR (version 1.3.0)

mc_eaf_to_tsv: Convert EAF files to TSV (WIP)

Description

mc_eaf_to_tsv converts EAF files produced by the linguistic annotation software ELAN into one or more tab-separated values (TSV) tables. The EAF files must have the correct tier structure with the correct tier names, or conversion fails. See the Multi-CAST documentation for details. Files are added to the TSV table in the alphabetical order of their file names.

Usage

mc_eaf_to_tsv(readfrom = getwd(), recursive = FALSE, split = FALSE,
  write = FALSE, writeto = getwd(), filename = "")

Arguments

readfrom

Directory from which to read EAF files. Defaults to the current working directory.

recursive

Logical. If TRUE, the function recurses into subdirectories.
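Before running a conversion, it can help to preview which files a given readfrom/recursive combination is likely to pick up, and in what order. The following is a minimal sketch using base R only; preview_eaf_files is a hypothetical helper, not part of multicastR, and the case-insensitive ".eaf" pattern is an assumption about how files are matched.

```r
# Hypothetical helper (not part of multicastR): list candidate EAF files.
# list.files() returns its results sorted alphabetically, which approximates
# the order in which files are described as being added to the table.
preview_eaf_files <- function(readfrom = getwd(), recursive = FALSE) {
  list.files(
    path = readfrom,
    pattern = "\\.eaf$",   # assumption: files are recognised by extension
    recursive = recursive,
    ignore.case = TRUE
  )
}

# Usage: top level only, then including subdirectories
preview_eaf_files()
preview_eaf_files(recursive = TRUE)
```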

split

Logical. If FALSE, all EAF files that are read are bound into a single data table. If TRUE, a list of data tables is returned instead, with one list item per text (which may be split across multiple EAF files). If write is TRUE, written output is either a single TSV file (for split == FALSE) or one TSV file per text read (for split == TRUE). In the latter case TSV files combining all texts from each corpus are also produced.

write

Logical. If TRUE, also creates output in TSV format.

writeto

Directory to which output is written. Defaults to the current working directory (getwd()). Ignored if write is FALSE.

filename

A length 1 character vector containing the name of the written output. If empty, defaults to "multicast_YYMM", where 'YY' are the last two digits of the current year and 'MM' the current month. Ignored if write is FALSE and/or if split is TRUE, as in the latter case file names are instead generated from text metadata.
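The default name pattern described above can be reproduced with base R date formatting. This is only a sketch of the naming scheme as documented, not the package's actual code; default_output_name is a hypothetical helper introduced for illustration.

```r
# Hypothetical helper (not part of multicastR): build a name of the form
# "multicast_YYMM" from a date, e.g. to predict the default output name.
default_output_name <- function(date = Sys.Date()) {
  # "%y" is the two-digit year, "%m" the two-digit month
  paste0("multicast_", format(date, "%y%m"))
}

# Usage: name for the current month, and for a fixed date
default_output_name()
default_output_name(as.Date("2024-03-15"))
```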

Value

Either a data.table (if split is FALSE) or a list of data.tables (if split is TRUE), in the format returned by multicast(), containing the annotation values of the EAF files read.
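Because the return type depends on split, downstream code may want to handle both shapes. The sketch below shows one way to do so with base R; count_annotation_rows is a hypothetical helper, not part of multicastR, and it makes no assumptions about column names.

```r
# Hypothetical helper (not part of multicastR): report row counts for either
# return shape, without needing to know the value of `split` in advance.
count_annotation_rows <- function(result) {
  if (inherits(result, "data.frame")) {
    # Single table; a data.table also inherits from data.frame
    return(nrow(result))
  }
  # List of tables: one row count per list item
  vapply(result, nrow, integer(1))
}

# Usage with stand-in data frames in place of real conversion output
count_annotation_rows(data.frame(a = 1:3))
count_annotation_rows(list(x = data.frame(a = 1:2), y = data.frame(a = 1:5)))
```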

Examples

# NOT RUN {
  # read all EAF files in the current working directory,
  # returns a data table of the kind accessed by multicast()
  mc_eaf_to_tsv()

  # also produce a file 'mydata.tsv' containing all read data
  mc_eaf_to_tsv(write = TRUE, filename = "mydata")

  # instead of a single monolithic table, return a list
  # of tables and produce one TSV file for each text
  mc_eaf_to_tsv(write = TRUE, split = TRUE)
# }