Usage
audio_text(audios, userpwd, keep_data = "true", callback = NULL,
  model = "en-US_BroadbandModel", continuous = FALSE,
  inactivity_timeout = 30, keywords = list(), keywords_threshold = NA,
  max_alternatives = 1, word_alternatives_threshold = NA,
  word_confidence = FALSE, timestamps = FALSE, profanity_filter = TRUE,
  smart_formatting = FALSE, content_type = "audio/wav")
Arguments
audios
Character vector (list) of paths to audio files.
userpwd
Character scalar containing username:password for the service.
keep_data
Character scalar specifying whether to share your data with
Watson services for the purpose of training their models.
callback
Function that can be applied to responses to examine http status,
headers, and content, to debug or to write a custom parser for content.
The default callback parses content into a data.frame while dropping other
response values to make the output easily passable to tidyverse packages like
dplyr or ggplot2. For further details or debugging, one can pass print or a
more complicated function.
model
Character scalar specifying language and bandwidth model. Alternatives
are ar-AR_BroadbandModel, en-GB_BroadbandModel, en-GB_NarrowbandModel,
en-US_NarrowbandModel, es-ES_BroadbandModel, es-ES_NarrowbandModel,
fr-FR_BroadbandModel, ja-JP_BroadbandModel, ja-JP_NarrowbandModel,
pt-BR_BroadbandModel, pt-BR_NarrowbandModel, zh-CN_BroadbandModel,
zh-CN_NarrowbandModel.
continuous
Logical scalar specifying whether to return after the first
end-of-speech incident (long pause) or to wait and combine results across pauses.
inactivity_timeout
Integer scalar giving the number of seconds after which
the result is returned if no speech is detected.
keywords
List of keywords to be detected in the speech stream.
keywords_threshold
Double scalar from 0 to 1 specifying the lower bound on
confidence to accept detected keywords in speech.
max_alternatives
Integer scalar giving the maximum number of alternative
transcripts to return.
word_alternatives_threshold
Double scalar from 0 to 1 giving lower bound
on confidence of possible words.
word_confidence
Logical scalar indicating whether to return confidence for
each word.
timestamps
Logical scalar indicating whether to return time alignment for
each word.
profanity_filter
Logical scalar indicating whether to censor profane words.
smart_formatting
Logical scalar indicating whether dates, times, numbers, etc.
are to be converted to more readable, conventional forms in the transcript.
content_type
Character scalar specifying the format of the audio file. Alternatives
are audio/flac, audio/l16;rate=n;channels=k (16 channel limit),
audio/wav (9 channel limit), audio/ogg;codecs=opus,
audio/basic (narrowband models only).
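Examples
The arguments above can be combined as in the following minimal sketch. It is an illustration under stated assumptions, not a reference implementation: the file path, the WATSON_USERPWD environment variable, and the status_and_body callback are hypothetical names introduced here, and the callback assumes that each response is a list with status_code and raw content elements (as in the curl package), which this page does not confirm.

```r
# Hypothetical callback: returns the HTTP status next to the raw body
# instead of parsing the content into a data.frame.
# Assumption: the response is a list with `status_code` and raw `content`.
status_and_body <- function(resp) {
  list(
    status = resp$status_code,
    body   = rawToChar(resp$content)
  )
}

# Placeholder inputs; replace with a real file and real credentials.
audio_files <- c("speech_sample.wav")        # hypothetical path
credentials <- Sys.getenv("WATSON_USERPWD")  # hypothetical variable, "username:password"

# Attempt the request only when the function, the file, and the
# credentials are all available in the current session.
if (exists("audio_text") && all(file.exists(audio_files)) && nzchar(credentials)) {
  result <- audio_text(
    audios             = audio_files,
    userpwd            = credentials,
    model              = "en-US_BroadbandModel",
    keywords           = list("weather", "forecast"),
    keywords_threshold = 0.5,    # accept keyword hits with confidence >= 0.5
    timestamps         = TRUE,   # time alignment for each word
    word_confidence    = TRUE,   # confidence for each word
    content_type       = "audio/wav",
    callback           = status_and_body
  )
}
```

Leaving callback at its default NULL would instead return the parsed data.frame described under callback; a custom function such as the one sketched here is mainly useful when a request fails and the status or raw body needs inspecting.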