- repeatBout
number of times the whole bout should be repeated
- nSyl
number of syllables in the bout. `pitchGlobal`, `amplGlobal`, and
`formants` span multiple syllables, but not multiple bouts
- sylLen
average duration of each syllable, ms (vectorized)
- pauseLen
average duration of pauses between syllables, ms (can be
negative between bouts: force with invalidArgAction = 'ignore')
(vectorized)
- pitch
a numeric vector of f0 values in Hz or a dataframe
specifying the time (ms or 0 to 1) and value (Hz) of each anchor, hereafter
"anchor format". These anchors are used to create a smooth contour of
fundamental frequency f0 (pitch) within one syllable
- pitchGlobal
unlike `pitch`, these anchors are used to create a smooth contour of average
f0 across multiple syllables. The values are in semitones relative to the
existing pitch, i.e. 0 = no change (anchor format)
- glottis
anchors for specifying the proportion of a
glottal cycle with closed glottis, % (0 = no modification, 100 = closed
phase as long as open phase); numeric vector or dataframe specifying time
and value (anchor format)
- temperature
hyperparameter for regulating the amount of stochasticity
in sound generation
- tempEffects
a list of scaling coefficients regulating the effect of
temperature on particular parameters. To change, specify just those pars
that you want to modify (1 = default, 0 = no stochastic behavior).
`amplDep`, `pitchDep`, `noiseDep`: random fluctuations of user-specified
amplitude / pitch / noise anchors; `amplDriftDep`: drift of amplitude
mirroring pitch drift; `formDisp`: dispersion of stochastic formants;
`formDrift`: formant frequencies; `glottisDep`: proportion of glottal cycle
with closed glottis; `pitchDriftDep`: amount of slow random drift of f0;
`pitchDriftFreq`: frequency of slow random drift of f0; `rolloffDriftDep`:
drift of rolloff mirroring pitch drift; `specDep`: rolloff, rolloffNoise,
nonlinear effects, attack; `subDriftDep`: drift of subharmonic frequency and
bandwidth mirroring pitch drift; `sylLenDep`: duration of syllables and pauses
- maleFemale
hyperparameter for shifting f0 contour, formants, and
vocalTract to make the speaker appear more male (-1...0) or more female
(0...+1); 0 = no change
- creakyBreathy
hyperparameter for a rough adjustment of voice quality
from creaky (-1) to breathy (+1); 0 = no change
- nonlinBalance
hyperparameter for regulating the (approximate)
proportion of sound with different regimes of pitch effects (none /
subharmonics only / subharmonics and jitter). 0% = no noise; 100% = the
entire sound has jitter + subharmonics. Ignored if temperature = 0
- nonlinRandomWalk
a numeric vector specifying the timing of nonlinear
regimes: 0 = none, 1 = subharmonics, 2 = subharmonics + jitter + shimmer
- subRatio
a positive integer giving the ratio of f0 (the main
fundamental) to g0 (a lower frequency): 1 = no subharmonics, 2 = period
doubling regardless of pitch changes, 3 = period tripling, etc; subRatio
overrides subFreq (anchor format)
- subFreq
instead of a specific number of subharmonics (subRatio), we
can specify the approximate g0 frequency (Hz), which is used only if
subRatio = 1 and is adjusted to f0 so f0/g0 is always an integer (anchor
format)
- subDep
the depth of subharmonics relative to the main frequency
component (f0), %. 0: no subharmonics; 100: g0 harmonics are as strong as
the nearest f0 harmonic (anchor format)
- subWidth
width of subharmonic sidebands - regulates how rapidly
g-harmonics weaken away from f-harmonics: large values like the default
10000 mean that all g0 harmonics are equally strong (anchor format)
- shortestEpoch
minimum duration of each epoch with unchanging
subharmonics regime or formant locking, in ms
- jitterLen
duration of stable periods between pitch jumps, ms. Use a
low value for harsh noise, a high value for irregular vibrato or shaky
voice (anchor format)
- jitterDep
cycle-to-cycle random pitch variation, semitones (anchor
format)
- vibratoFreq
the rate of regular pitch modulation, or vibrato, Hz
(anchor format)
- vibratoDep
the depth of vibrato, semitones (anchor format)
- shimmerDep
random variation in amplitude between individual glottal
cycles (0 to 100% of original amplitude of each cycle) (anchor format)
- shimmerLen
duration of stable periods between amplitude jumps, ms. Use
a low value for harsh noise, a high value for shaky voice (anchor format)
- attackLen
duration of fade-in / fade-out at each end of syllables and
noise (ms): a vector of length 1 (symmetric) or 2 (separately for fade-in
and fade-out)
- rolloff
basic rolloff from lower to upper harmonics, dB/octave
(exponential decay). All rolloff parameters are in anchor format. See
`getRolloff` for more details
- rolloffOct
basic rolloff changes from lower to upper harmonics
(regardless of f0) by `rolloffOct` dB/oct. For example, we can get
steeper rolloff in the upper part of the spectrum
- rolloffKHz
rolloff changes linearly with f0 by `rolloffKHz` dB/kHz. For example,
-6 dB/kHz gives a 6 dB steeper basic rolloff as f0 goes up by 1000 Hz
- rolloffParab
an optional quadratic term affecting only the first `rolloffParabHarm`
harmonics. The middle harmonic of the first `rolloffParabHarm` harmonics is
amplified or dampened by `rolloffParab` dB relative to the basic exponential
decay
- rolloffParabHarm
the number of harmonics affected by `rolloffParab`
- rolloffExact
user-specified exact strength of harmonics: a vector or
matrix with one row per harmonic, scale 0 to 1 (overrides all other rolloff
parameters)
- lipRad
the effect of lip radiation on source spectrum, dB/oct (the
default of +6 dB/oct produces a high-frequency boost when the mouth is
open)
- noseRad
the effect of radiation through the nose on source spectrum,
dB/oct (the alternative to `lipRad` when the mouth is closed)
- mouthOpenThres
open the lips (switch from nose radiation to lip
radiation) when the mouth is open > `mouthOpenThres`, 0 to 1
- formants
either a character string like "aaui" referring to default
presets for speaker "M1" or a list of formant times, frequencies,
amplitudes, and bandwidths (see ex. below). `formants = NA` defaults
to schwa. Time stamps for formants and mouthOpening can be specified in ms
or on any other arbitrary scale. See `getSpectralEnvelope` for more details
- formantDep
scale factor of formant amplitude (1 = no change relative
to amplitudes in `formants`)
- formantDepStoch
the amplitude of additional stochastic formants added
above the highest specified formant, dB (only if temperature > 0)
- formantWidth
scale factor of formant bandwidth (1 = no change)
- formantCeiling
frequency to which stochastic formants are calculated,
in multiples of the Nyquist frequency; increase up to ~10 for long vocal
tracts to avoid losing energy in the upper part of the spectrum
- formantLocking
the approximate proportion of sound in which one of the
harmonics is locked to the nearest formant, 0 = none, 1 = the entire sound
(anchor format)
- vocalTract
the length of vocal tract, cm. Used for calculating formant
dispersion (for adding extra formants) and formant transitions as the mouth
opens and closes. If `NULL` or `NA`, the length is estimated
based on specified formant frequencies, if any (anchor format)
- amDep
amplitude modulation (AM) depth, %. 0: no change; 100: AM with
amplitude range equal to the dynamic range of the sound (anchor format)
- amFreq
AM frequency, Hz (anchor format)
- amType
"sine" = sinusoidal, "logistic" = logistic (default)
- amShape
ignored if amType = "sine", otherwise determines the shape of
non-sinusoidal AM: 0 = ~sine, -1 = notches, +1 = clicks (anchor format)
- noise
loudness of turbulent noise (0 dB = as loud as
voiced component, negative values = quieter) such as aspiration, hissing,
etc (anchor format)
- formantsNoise
the same as `formants`, but for unvoiced instead of
voiced component. If NA (default), the unvoiced component will be filtered
through the same formants as the voiced component, approximating aspiration
noise [h]
- rolloffNoise, noiseFlatSpec
linear rolloff of the excitation source for
the unvoiced component, `rolloffNoise` dB/kHz (anchor format) applied
above `noiseFlatSpec` Hz
- rolloffNoiseExp
exponential rolloff of the excitation source for the
unvoiced component, dB/oct (anchor format) applied above 0 Hz
- noiseAmpRef
noise amplitude is defined relative to: "f0" = the
amplitude of the first partial (fundamental frequency), "source" = the
amplitude of the harmonic component prior to applying formants, "filtered"
= the amplitude of the harmonic component after applying formants
- mouth
mouth opening (0 to 1, 0.5 = neutral, i.e. no
modification) (anchor format)
- ampl
amplitude envelope (dB, 0 = max amplitude) (anchor
format)
- amplGlobal
global amplitude envelope spanning
multiple syllables (dB, 0 = no change) (anchor format)
- smoothing
a list of parameters passed to `getSmoothContour` to control the
interpolation and smoothing of contours: interpol (approx / spline / loess),
loessSpan, discontThres, jumpThres
- samplingRate
sampling frequency, Hz
- windowLength
length of FFT window, ms
- overlap
FFT window overlap, %. For allowed values, see `istft`
- addSilence
silence before and after the bout, ms: a vector of length 1
(symmetric) or 2 (different duration of silence before/after the sound)
- pitchFloor, pitchCeiling
lower & upper bounds of f0
- pitchSamplingRate
sampling frequency of the pitch contour only, Hz.
Low values reduce processing time. Set to `pitchCeiling` for optimal
speed or to `samplingRate` for optimal quality
- dynamicRange
dynamic range, dB. Harmonics and noise more than
dynamicRange under maximum amplitude are discarded to save computational
resources
- invalidArgAction
what to do if an argument is invalid or outside the
range in `permittedValues`: 'adjust' = reset to default value, 'abort'
= stop execution, 'ignore' = throw a warning and continue (may crash)
- plot
if TRUE, plots a spectrogram
- play
if TRUE, plays the synthesized sound using the default player on
your system. If character, passed to `play` as the name of the player to
use, e.g. "aplay", "play", "vlc", etc. In case of errors, try setting
another default player for `play`
- saveAudio
path + filename for saving the output, e.g.
'~/Downloads/temp.wav'. If NULL, the output is not saved
- ...
other plotting parameters passed to spectrogram
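
Two unit conventions in the list above may be easier to see as arithmetic: semitone offsets (as used by pitchGlobal, jitterDep, and vibratoDep) and the requirement that f0/g0 be an integer when subFreq is given. The Python sketch below is only an illustration of that arithmetic. The function names are hypothetical, the rounding rule in `snap_g0` is an assumption, and none of this is taken from the package's actual implementation.

```python
def semitones_to_ratio(semitones):
    """Frequency multiplier for a shift of `semitones` (12 semitones = 1 octave)."""
    return 2.0 ** (semitones / 12.0)


def shift_pitch(f0_hz, semitones):
    """Apply a semitone offset to a list of f0 values in Hz (0 = no change)."""
    ratio = semitones_to_ratio(semitones)
    return [f * ratio for f in f0_hz]


def snap_g0(f0_hz, target_g0_hz):
    """Pick an integer ratio f0/g0 close to the requested g0.

    Assumption: the ratio is f0 / target rounded to an integer, and never
    below 1 (a ratio of 1 means no subharmonics).
    Returns (ratio, adjusted g0 in Hz).
    """
    ratio = max(1, round(f0_hz / target_g0_hz))
    return ratio, f0_hz / ratio


if __name__ == "__main__":
    # A +12 semitone offset doubles every f0 value.
    print(shift_pitch([200.0, 300.0], 12))
    # Requesting g0 near 100 Hz with f0 = 220 Hz yields period doubling.
    print(snap_g0(220.0, 100.0))
```

Under these assumptions, an offset of 0 semitones leaves the contour unchanged, and a requested g0 above f0 collapses to a ratio of 1, i.e. no subharmonics.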