multicastR (version 1.3.0)

mc_eaf_to_xml: Convert EAF files to XML (WIP)

Description

mc_eaf_to_xml converts EAF files produced by the linguistic annotation software ELAN into one or more XML files. The EAF files must have the tier structure and tier names dictated by the Multi-CAST design; otherwise conversion fails. Refer to the Multi-CAST documentation for details about the required structure of the EAF files and about the structure of the XML files produced by this function.

Usage

mc_eaf_to_xml(vkey = "", readfrom = getwd(), recursive = FALSE,
  split = FALSE, writeto = getwd(), filename = "",
  skipempty = TRUE)

Arguments

vkey

Character. Version of the annotations. This information is not part of the EAF files, so it needs to be specified manually.

readfrom

Directory from which to read EAF files. Defaults to the current working directory (getwd()).

recursive

Logical. If TRUE, the function recurses into subdirectories.
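
To preview which files a given combination of readfrom and recursive would cover, base R's list.files can be used before running the conversion. This is only a rough sketch and not the package's own file discovery: the ".eaf" extension filter and the helper name preview_eaf_files are assumptions made for illustration.

```r
# Hypothetical helper (not part of multicastR): list the EAF files
# found in a directory, optionally descending into subdirectories.
# Assumes EAF files carry the ".eaf" extension.
preview_eaf_files <- function(readfrom = getwd(), recursive = FALSE) {
  list.files(path = readfrom, pattern = "\\.eaf$", recursive = recursive,
             full.names = TRUE, ignore.case = TRUE)
}

# Demonstration on a throwaway directory with one nested file.
demo_dir <- file.path(tempdir(), "eaf_demo")
dir.create(file.path(demo_dir, "sub"), recursive = TRUE, showWarnings = FALSE)
invisible(file.create(file.path(demo_dir, "a.eaf"),
                      file.path(demo_dir, "sub", "b.eaf")))

top_only <- preview_eaf_files(demo_dir)                    # top level only
with_sub <- preview_eaf_files(demo_dir, recursive = TRUE)  # includes sub/
```

Comparing top_only and with_sub shows which additional files recursive = TRUE would bring in.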

split

Logical. If FALSE, all EAF files that are read are bound into a single XML file. If TRUE, output consists of one XML file for each text read (a text may be split across multiple EAF files), plus one XML file for each Multi-CAST corpus that bundles all of that corpus's texts.

writeto

Directory to which to write output. Defaults to the current working directory (getwd()).

filename

A length-1 character vector giving the name of the written output. If empty, defaults to "multicast_YYMM", where 'YY' is the last two digits of the current year and 'MM' is the current month. Ignored if split is TRUE, because file names are then generated from text metadata.
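
The "multicast_YYMM" pattern can be reproduced with base R date formatting, which may help when locating the output file afterwards. The sketch below is an illustration only: the helper name default_output_name is hypothetical, and the package may build the name differently internally.

```r
# Hypothetical helper (not part of multicastR): build a name of the
# form "multicast_YYMM" from a date, using a two-digit year (%y)
# followed by a two-digit month (%m).
default_output_name <- function(date = Sys.Date()) {
  paste0("multicast_", format(date, "%y%m"))
}

default_output_name(as.Date("2019-03-15"))  # "multicast_1903"
```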

skipempty

Logical. If TRUE, empty leaf nodes are omitted from the XML output.

Examples

# NOT RUN {
  # read all EAF files in the current working directory
  # and bind them into a single XML file written to the
  # same location (split = FALSE is the default)
  mc_eaf_to_xml()

  # same as above, but write one XML file for each text
  # plus one XML file bundling all texts of each corpus
  mc_eaf_to_xml(split = TRUE)
# }