This function calls the BAS webservices G2P, MAUS, Pho2Syl, MINNI and (if necessary) Chunker. Starting from an orthographic transcription, it derives a tokenized orthographical word tier using the G2P tool. It also derives canonical pronunciations (in SAMPA) for the words. If at least one audio file is longer than 60 seconds, the function then calls the Chunker webservice to presegment the recordings. Subsequently, the webservice MAUS is called to derive a phonetic segmentation. A second, rough segmentation is created by running the phoneme decoder MINNI. Finally, syllabification is performed by calling Pho2Syl. This function requires an internet connection.
runBASwebservice_all(
handle,
transcriptionAttributeDefinitionName,
language,
orthoAttributeDefinitionName = "ORT",
canoAttributeDefinitionName = "KAN",
mausAttributeDefinitionName = "MAU",
minniAttributeDefinitionName = "MINNI",
sylAttributeDefinitionName = "MAS",
canoSylAttributeDefinitionName = "KAS",
chunkAttributeDefinitionName = "TRN",
runMINNI = TRUE,
patience = 0,
resume = FALSE,
verbose = TRUE
)
handle: emuDB handle
transcriptionAttributeDefinitionName: name of the attribute (not level!) containing an orthographic transcription.
language: language(s) to be used. If you pass a single string (e.g. "deu-DE"), this language will be used for all bundles. Alternatively, you can select the language for every bundle individually. To do so, you must pass a data frame with the columns session, bundle, language. This data frame must contain one row for every bundle in your emuDB. Up-to-date lists of the languages accepted by all webservices can be found here: https://clarin.phonetik.uni-muenchen.de/BASWebServices/services/help
orthoAttributeDefinitionName: attribute name for orthographic words
canoAttributeDefinitionName: attribute name for canonical pronunciations of words
mausAttributeDefinitionName: attribute name for the MAUS segmentation
minniAttributeDefinitionName: attribute name for the MINNI segmentation
sylAttributeDefinitionName: attribute name for syllable segmentation
canoSylAttributeDefinitionName: attribute name for syllabified canonical pronunciations of words
chunkAttributeDefinitionName: attribute name for the chunk segmentation. Please note that the chunk segmentation will only be generated if your emuDB contains at least one audio file longer than 60 seconds.
runMINNI: if set to TRUE (the default), the MINNI service is also run. As the MINNI service supports fewer languages than the others, it can be useful to turn this off.
patience: If a web service call fails, it is repeated a further n times, with n being the value of patience. Must be set to a value between 0 and 3.
resume: If a previous call to this function has failed (and you think you have fixed the issue that caused the error), you can set resume=TRUE to recover any progress made up to that point. This will only work if your R temporary directory has not been deleted or emptied in the meantime.
verbose: Display progress bars and other information
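As a rough illustration of the per-bundle form of the language argument, the following sketch builds such a data frame in base R. The session and bundle names are hypothetical placeholders; in a real project they would be taken from the emuDB itself (for example via emuR's list_bundles(), whose output is assumed here to have the columns session and name). The final call is commented out because it needs a loaded emuDB and an internet connection; the attribute name "transcription" is likewise an assumption.

```r
# Hypothetical bundle listing. In practice this would come from your emuDB,
# e.g. bundles <- emuR::list_bundles(handle)
bundles <- data.frame(
  session = c("0000", "0000", "0001"),
  name    = c("msajc003", "msajc010", "speaker01"),
  stringsAsFactors = FALSE
)

# One row per bundle, with exactly the columns session, bundle, language.
# Here the (hypothetical) session "0001" is French, everything else German.
languages <- data.frame(
  session  = bundles$session,
  bundle   = bundles$name,
  language = ifelse(bundles$session == "0001", "fra-FR", "deu-DE"),
  stringsAsFactors = FALSE
)

# The call itself requires an emuDB handle and an internet connection:
# runBASwebservice_all(
#   handle,
#   transcriptionAttributeDefinitionName = "transcription",
#   language = languages,
#   patience = 2  # retry failed web service calls up to two more times
# )
```

If such a call fails partway through, repeating it with resume = TRUE should pick up the progress made so far, provided the R temporary directory is still intact.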
All necessary level, attribute and link definitions are created in the process. Note that this function will run all BAS webservices with default parameters, with four exceptions:
Chunker: force=rescue
G2P: embed=maus
Pho2Syl: wsync=yes
MAUS: USETRN=[true if Chunker was called or transcription is a segment tier, false otherwise]
If you wish to change parameters, you must use the individual runBASwebservice_* functions instead. This will also allow you to carry out manual corrections in between the steps, or to use different languages for different webservices.
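A minimal sketch of such a stepwise workflow might look as follows. It covers only the first three steps (tokenization, canonical pronunciation, MAUS). The wrapper run_stepwise() and its arguments are hypothetical, and the argument names passed to the individual functions are assumed to mirror those of runBASwebservice_all(); check each function's own help page before relying on them.

```r
# Sketch only: run_stepwise() is a hypothetical helper, and the argument
# names below are assumed to match those used by runBASwebservice_all().
run_stepwise <- function(handle,
                         transcription = "transcription",
                         g2pLanguage = "deu-DE",
                         mausLanguage = g2pLanguage) {
  # Step 1: tokenize the orthographic transcription into words (ORT).
  emuR::runBASwebservice_g2pForTokenization(
    handle,
    transcriptionAttributeDefinitionName = transcription,
    language = g2pLanguage,
    orthoAttributeDefinitionName = "ORT"
  )

  # Manual corrections of the ORT tier could be carried out at this point.

  # Step 2: derive canonical pronunciations (KAN) from the words.
  emuR::runBASwebservice_g2pForPronunciation(
    handle,
    orthoAttributeDefinitionName = "ORT",
    language = g2pLanguage,
    canoAttributeDefinitionName = "KAN"
  )

  # Step 3: phonetic segmentation with MAUS, optionally with a different
  # language setting than the one used for G2P.
  emuR::runBASwebservice_maus(
    handle,
    canoAttributeDefinitionName = "KAN",
    language = mausLanguage,
    mausAttributeDefinitionName = "MAU"
  )

  invisible(handle)
}
```

Defining the steps inside a function keeps the sketch free of side effects until it is called with a loaded emuDB handle and a working internet connection.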
Other BAS webservice functions:
runBASwebservice_chunker(), runBASwebservice_g2pForPronunciation(), runBASwebservice_g2pForTokenization(), runBASwebservice_maus(), runBASwebservice_minni(), runBASwebservice_pho2sylCanonical(), runBASwebservice_pho2sylSegmental()