
emuR (version 2.4.2)

runBASwebservice_all: Runs several BAS webservices, starting from an orthographic transcription

Description

This function calls the BAS webservices G2P, MAUS, Pho2Syl, MINNI and (if necessary) Chunker. Starting from an orthographic transcription, it derives a tokenized orthographic word tier using the G2P tool. It also derives canonical pronunciations (in SAMPA) for the words. If at least one audio file is longer than 60 seconds, the function then calls the Chunker webservice to pre-segment the recordings. Subsequently, the MAUS webservice is called to derive a phonetic segmentation. A second, rough segmentation is created by running the phoneme decoder MINNI. Finally, syllabification is performed by calling Pho2Syl. This function requires an internet connection.

Usage

runBASwebservice_all(
  handle,
  transcriptionAttributeDefinitionName,
  language,
  orthoAttributeDefinitionName = "ORT",
  canoAttributeDefinitionName = "KAN",
  mausAttributeDefinitionName = "MAU",
  minniAttributeDefinitionName = "MINNI",
  sylAttributeDefinitionName = "MAS",
  canoSylAttributeDefinitionName = "KAS",
  chunkAttributeDefinitionName = "TRN",
  runMINNI = TRUE,
  patience = 0,
  resume = FALSE,
  verbose = TRUE
)
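
Only the first three arguments lack defaults, so a minimal call can be quite short. The sketch below wraps the call in a helper so that nothing is sent to the web services until the helper is invoked. The helper name annotate_all, the attribute name "transcription" and the choice of "deu-DE" are illustrative assumptions, not part of emuR.

```r
# Hypothetical helper: runs the full BAS pipeline on an already loaded emuDB.
# Assumes the emuDB has an attribute called "transcription" that holds the
# orthographic transcription, and that all bundles are German ("deu-DE").
annotate_all <- function(handle,
                         transcription_attr = "transcription",
                         language = "deu-DE") {
  emuR::runBASwebservice_all(
    handle = handle,
    transcriptionAttributeDefinitionName = transcription_attr,
    language = language,
    runMINNI = TRUE,  # set to FALSE if MINNI does not support the language
    patience = 1,     # repeat a failed web service call once
    verbose = TRUE
  )
}
```

Calling annotate_all(handle) on a loaded emuDB handle would then start the pipeline with the default attribute names shown in the usage above.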

Arguments

handle

emuDB handle

transcriptionAttributeDefinitionName

name of the attribute (not level!) containing an orthographic transcription.

language

language(s) to be used. If you pass a single string (e.g. "deu-DE"), this language will be used for all bundles. Alternatively, you can select the language for every bundle individually. To do so, you must pass a data frame with the columns session, bundle, language. This data frame must contain one row for every bundle in your emuDB. Up-to-date lists of the languages accepted by all webservices can be found here: https://clarin.phonetik.uni-muenchen.de/BASWebServices/services/help
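
As a rough illustration of the per-bundle form, the following base R sketch builds a data frame with the three required columns. The session and bundle names are made up, the language code "eng-GB" should be checked against the list linked above, and the remark about emuR::list_bundles() reporting bundle names in a column called "name" is an assumption to verify against its help page.

```r
# Hypothetical bundle list; in practice this must describe every bundle in
# your emuDB (for example as returned by emuR::list_bundles(), assumed here
# to report the bundle name in a column called "name").
bundles <- data.frame(
  session = c("0000", "0000", "0001"),
  name    = c("msajc003", "msajc010", "spk2_read01"),
  stringsAsFactors = FALSE
)

# Assumption for illustration: session "0001" was recorded in English,
# everything else in German.
language_table <- data.frame(
  session  = bundles$session,
  bundle   = bundles$name,
  language = ifelse(bundles$session == "0001", "eng-GB", "deu-DE"),
  stringsAsFactors = FALSE
)

# The table needs exactly these columns and one row per bundle.
stopifnot(
  identical(names(language_table), c("session", "bundle", "language")),
  nrow(language_table) == nrow(bundles),
  !anyDuplicated(language_table[, c("session", "bundle")])
)
```

The resulting language_table would be passed as the language argument in place of a single string.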

orthoAttributeDefinitionName

attribute name for orthographic words

canoAttributeDefinitionName

attribute name for canonical pronunciations of words

mausAttributeDefinitionName

attribute name for the MAUS segmentation

minniAttributeDefinitionName

attribute name for the MINNI segmentation

sylAttributeDefinitionName

attribute name for syllable segmentation

canoSylAttributeDefinitionName

attribute name for syllabified canonical pronunciations of words

chunkAttributeDefinitionName

attribute name for the chunk segmentation. Please note that the chunk segmentation will only be generated if your emuDB contains at least one audio file longer than 60 seconds.

runMINNI

if set to TRUE (the default), the MINNI service is also run. As MINNI supports fewer languages than the other services, it can be useful to turn this off.

patience

If a web service call fails, it is repeated a further n times, with n being the value of patience. Must be set to a value between 0 and 3.

resume

If a previous call to this function has failed (and you think you have fixed the issue that caused the error), you can set resume=TRUE to recover any progress made up to that point. This will only work if your R temporary directory has not been deleted or emptied in the meantime.
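
One possible way to work with resume in an interactive session is sketched below. The wrapper try_pipeline and the stand-in fake_run are hypothetical; the wrapper only reports the error and returns FALSE, so that the cause can be fixed by hand before a second attempt with resume = TRUE. In real use, run would be something like function(resume) emuR::runBASwebservice_all(..., resume = resume).

```r
# Hypothetical wrapper: `run` is any function that accepts a `resume` argument.
# Returns TRUE on success and FALSE (after reporting the error) on failure.
try_pipeline <- function(run, resume = FALSE) {
  tryCatch(
    {
      run(resume = resume)
      TRUE
    },
    error = function(e) {
      message("Pipeline failed: ", conditionMessage(e))
      message("Fix the cause, then call again with resume = TRUE.")
      FALSE
    }
  )
}

# Stand-in for the real call: fails unless resume is TRUE.
fake_run <- function(resume) {
  if (!resume) stop("simulated web service error")
  invisible(NULL)
}

first  <- try_pipeline(fake_run)                 # reports the simulated error
second <- try_pipeline(fake_run, resume = TRUE)  # succeeds
```

The second attempt has to happen in the same R session (or at least before the R temporary directory is cleared), since that is where the intermediate progress is kept.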

verbose

Display progress bars and other information

Details

All necessary level, attribute and link definitions are created in the process. Note that this function will run all BAS webservices with default parameters, with four exceptions:

  • Chunker: force=rescue

  • G2P: embed=maus

  • Pho2Syl: wsync=yes

  • MAUS: USETRN=[true if Chunker was called or transcription is a segment tier, false otherwise]

If you wish to change parameters, you must use the individual runBASwebservice_* functions listed under See Also. This also allows you to carry out manual corrections between steps, or to use different languages for different webservices.
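
A stepwise run might look roughly like the sketch below, which covers only tokenization, canonical pronunciation and MAUS. It rests on an assumption that should be checked against each function's help page: that the individual functions accept the same attribute argument names that runBASwebservice_all() uses. The helper name annotate_stepwise and the attribute name "transcription" are hypothetical.

```r
# Hypothetical helper; nothing is sent to the web services until it is called.
# ASSUMPTION: the individual functions use the same argument names as
# runBASwebservice_all() for the corresponding attributes.
annotate_stepwise <- function(handle, g2p_language, maus_language) {
  # Step 1: tokenize the orthographic transcription into words.
  emuR::runBASwebservice_g2pForTokenization(
    handle = handle,
    transcriptionAttributeDefinitionName = "transcription",  # assumed name
    language = g2p_language,
    orthoAttributeDefinitionName = "ORT"
  )
  # Step 2: derive canonical pronunciations for the words.
  emuR::runBASwebservice_g2pForPronunciation(
    handle = handle,
    orthoAttributeDefinitionName = "ORT",
    language = g2p_language,
    canoAttributeDefinitionName = "KAN"
  )
  # Step 3: phonetic segmentation, possibly with a different language.
  emuR::runBASwebservice_maus(
    handle = handle,
    canoAttributeDefinitionName = "KAN",
    language = maus_language,
    mausAttributeDefinitionName = "MAU"
  )
  invisible(handle)
}
```

In an interactive session the three calls would more likely be run one at a time, so that labels can be corrected by hand before the next service is called.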

See Also

Other BAS webservice functions: runBASwebservice_chunker(), runBASwebservice_g2pForPronunciation(), runBASwebservice_g2pForTokenization(), runBASwebservice_maus(), runBASwebservice_minni(), runBASwebservice_pho2sylCanonical(), runBASwebservice_pho2sylSegmental()