emuR (version 2.4.2)

runBASwebservice_g2pForTokenization: Tokenizes an orthographic transcription.


This function calls the webservice G2P to break up a transcription into tokens, or words. In addition to tokenization, G2P performs normalization of numbers and other special words. A call to this function is usually followed by a call to runBASwebservice_g2pForPronunciation. This function requires an internet connection.


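Usage

runBASwebservice_g2pForTokenization(
  handle,
  transcriptionAttributeDefinitionName,
  language,
  orthoAttributeDefinitionName = "ORT",
  params = list(),
  patience = 0,
  resume = FALSE,
  verbose = TRUE
)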



Arguments

handle
emuDB handle.


transcriptionAttributeDefinitionName
Name of the attribute (not the level!) that contains the orthographic transcription.


language
Language(s) to be used. If you pass a single string (e.g. "deu-DE"), this language is used for all bundles. Alternatively, you can select the language for every bundle individually by passing a data frame with the columns session, bundle and language. This data frame must contain one row for every bundle in your emuDB. Up-to-date lists of the languages accepted by all webservices can be found here: https://clarin.phonetik.uni-muenchen.de/BASWebServices/services/help


orthoAttributeDefinitionName
Name of the attribute in which the tokenized orthographic words are stored.


params
Named list of parameters to be passed on to the webservice. It is your own responsibility to ensure that these parameters are compatible with the webservice API (see https://clarin.phonetik.uni-muenchen.de/BASWebServices/services/help). Some options accepted by the API (e.g. the output format) cannot be set when calling a webservice from within emuR and will be overridden. If you use file parameters, wrap the file path in httr::upload_file("/path/2/file/rules.nrul").


patience
Number of additional attempts made when a webservice call fails: a failed call is repeated up to patience more times. Must be a value between 0 and 3.


resume
If a previous call to this function has failed (and you think you have fixed the issue that caused the error), you can set resume=TRUE to recover any progress made up to that point. This will only work if your R temporary directory has not been deleted or emptied in the meantime.


verbose
Display progress bars and other information.
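
The per-bundle language selection described above can be sketched as follows. This is a minimal illustration, not part of the original help page: the bundle inventory is made up, and the rule that maps sessions to languages is an assumption chosen for the example. With a loaded emuDB, the inventory would normally come from emuR::list_bundles(), which reports the columns session and name.

```r
# Hypothetical bundle inventory; with a real emuDB use emuR::list_bundles(handle),
# which returns one row per bundle with the columns `session` and `name`.
bundles <- data.frame(
  session = c("0000", "0000", "0001"),
  name    = c("msajc003", "msajc010", "gam001"),
  stringsAsFactors = FALSE
)

# Assumed rule for this sketch: session "0001" is English, everything else German.
# The result has exactly the three columns the webservice functions expect.
language_by_bundle <- data.frame(
  session  = bundles$session,
  bundle   = bundles$name,
  language = ifelse(bundles$session == "0001", "eng-GB", "deu-DE"),
  stringsAsFactors = FALSE
)
```

A data frame built this way is passed as the language argument in place of a single language string. It must cover every bundle in the emuDB, so deriving it from the full bundle list is safer than typing it by hand.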


All necessary level, link and attribute definitions are created in the process.
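
A typical call sequence might look like the sketch below. It is illustrative only: the wrapper function, its name, and the attribute name "transcription" are assumptions made for this example, and the code needs a loaded emuDB plus an internet connection before it can actually be called.

```r
# Hypothetical wrapper; `db` must be an emuDB handle, e.g. from emuR::load_emuDB().
# "transcription" is an assumed attribute name, adjust it to your own database.
tokenize_then_pronounce <- function(db,
                                    language = "deu-DE",
                                    transcription_attr = "transcription",
                                    resume = FALSE) {
  # Step 1: split the transcription into word tokens stored in "ORT".
  emuR::runBASwebservice_g2pForTokenization(
    handle = db,
    transcriptionAttributeDefinitionName = transcription_attr,
    language = language,
    orthoAttributeDefinitionName = "ORT",
    patience = 3,     # repeat a failed webservice call up to 3 more times
    resume = resume,  # TRUE recovers progress from an earlier failed run
    verbose = TRUE
  )

  # Step 2: the usual follow-up, deriving pronunciations from the new tokens.
  emuR::runBASwebservice_g2pForPronunciation(
    handle = db,
    orthoAttributeDefinitionName = "ORT",
    language = language
  )

  invisible(db)
}
```

In this sketch the resume flag is forwarded to the tokenization step only. After a failure, fix the underlying problem first, then call the wrapper again with resume = TRUE in the same R session so that the temporary directory is still available.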

See Also

Other BAS webservice functions: runBASwebservice_all(), runBASwebservice_chunker(), runBASwebservice_g2pForPronunciation(), runBASwebservice_maus(), runBASwebservice_minni(), runBASwebservice_pho2sylCanonical(), runBASwebservice_pho2sylSegmental()