
Prepares a spectral envelope for filtering a sound to add formants, lip radiation, and some stochastic component regulated by temperature. Formants are specified as a list containing time, frequency, amplitude, and width values for each formant (see examples). See vignette('sound_generation', package = 'soundgen') for more information.
getSpectralEnvelope(nr, nc, formants = NA, formantDep = 1,
  formantWidth = 1, lipRad = 6, noseRad = 4, mouthAnchors = NA,
  interpol = c("approx", "spline", "loess")[3], mouthOpenThres = 0.2,
  openMouthBoost = 0, vocalTract = NULL, temperature = 0.05,
  formDrift = 0.3, formDisp = 0.2, formantDepStoch = 20,
  smoothLinearFactor = 1, samplingRate = 16000, speedSound = 35400,
  plot = FALSE, duration = NULL, colorTheme = c("bw", "seewave",
  "...")[1], nCols = 100, xlab = "Time", ylab = "Frequency, kHz",
  ...)
nr: the number of frequency bins = windowLength_points/2, where windowLength_points is the size of the window for the Fourier transform
nc: the number of time steps for the Fourier transform
formants: a character string like "aaui" referring to default presets for speaker "M1"; a vector of formant frequencies; or a list of formant times, frequencies, amplitudes, and bandwidths, with a single value of each for static or multiple values of each for moving formants. formants = NA defaults to schwa. Time stamps for formants and mouthOpening can be specified in ms or on any other arbitrary scale.
formantDep: scale factor of formant amplitude (1 = no change relative to amplitudes in formants)
formantWidth: scale factor of formant bandwidth (1 = no change)
lipRad: the effect of lip radiation on source spectrum, dB/oct (the default of +6 dB/oct produces a high-frequency boost when the mouth is open)
noseRad: the effect of radiation through the nose on source spectrum, dB/oct (the alternative to lipRad when the mouth is closed)
mouthAnchors: mouth opening (0 to 1, 0.5 = neutral, i.e. no modification) (anchor format)
interpol: the method of smoothing envelopes based on provided mouth anchors: 'approx' = linear interpolation, 'spline' = cubic spline, 'loess' (default) = polynomial local smoothing function. NB: this does NOT affect the smoothing of formant anchors
mouthOpenThres: open the lips (switch from nose radiation to lip radiation) when the mouth is open > mouthOpenThres, 0 to 1
openMouthBoost: amplify the voice when the mouth is open by openMouthBoost dB
vocalTract: the length of vocal tract, cm. Used for calculating formant dispersion (for adding extra formants) and formant transitions as the mouth opens and closes. If NULL or NA, the length is estimated based on specified formant frequencies (if any)
temperature: hyperparameter for regulating the amount of stochasticity in sound generation
formDrift: scale factor regulating the effect of temperature on the depth of random drift of all formants (user-defined and stochastic): the higher, the more formants drift at a given temperature
formDisp: scale factor regulating the effect of temperature on the irregularity of the dispersion of stochastic formants: the higher, the more unevenly stochastic formants are spaced at a given temperature
formantDepStoch: the amplitude of additional formants added above the highest specified formant (only if temperature > 0)
smoothLinearFactor: regulates smoothing of formant anchors (0 to +Inf) as they are upsampled to the number of fft steps nc. This is necessary because the formants input normally contains fewer sets of formant values than the number of fft steps. smoothLinearFactor = 0: close to default spline; >3: approaches linear extrapolation
samplingRate: sampling frequency, Hz
speedSound: speed of sound in warm air, cm/s. Stevens (2000) "Acoustic phonetics", p. 138
plot: if TRUE, produces a plot of the spectral envelope
duration: duration of the sound, ms (for plotting purposes only)
colorTheme: black and white ('bw'), as in seewave package ('seewave'), or another color theme (e.g. 'heat.colors')
nCols: number of colors in the palette
xlab, ylab: labels of axes
...: other graphical parameters passed on to image()
Returns a spectral filter (matrix nr x nc, where nr is the number of frequency bins = windowLength_points/2 and nc is the number of time steps)
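To make the link between vocalTract, speedSound, and formant frequencies more concrete, here is a minimal sketch in base R. It assumes the textbook model of the vocal tract as a uniform tube closed at one end, whose resonances fall at odd multiples of speedSound / (4 * vocalTract). The helpers schwaFormants() and estimateVTL() are hypothetical names used for illustration only; they are not part of soundgen, and the package's own estimation may differ.

```r
# Hypothetical helpers for illustration; not part of soundgen.
# Assumption: uniform tube closed at one end (quarter-wave resonator).

# Formant frequencies (Hz) of a schwa for a vocal tract of a given length (cm)
schwaFormants = function(vocalTract, nFormants = 3, speedSound = 35400) {
  n = seq_len(nFormants)
  (2 * n - 1) * speedSound / (4 * vocalTract)
}

# Inverse problem: average the length implied by each formant frequency
estimateVTL = function(formantFreqs, speedSound = 35400) {
  n = seq_along(formantFreqs)
  mean((2 * n - 1) * speedSound / (4 * formantFreqs))
}

f = schwaFormants(vocalTract = 15.5)
round(f)        # 571 1713 2855
estimateVTL(f)  # recovers 15.5 cm

# Under this model neighboring formants are spaced by speedSound / (2 * vocalTract)
dispersion = 35400 / (2 * 15.5)
```

Under this model, a shorter vocal tract spaces the formants further apart, which is why the tract length matters when extra formants are added above the highest specified one.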
# NOT RUN {
# [a] with F1-F3 visible
e = getSpectralEnvelope(nr = 512, nc = 50,
formants = soundgen:::convertStringToFormants('a'),
temperature = 0, plot = TRUE)
# image(t(e)) # to plot the output on a linear scale instead of dB
# some "wiggling" of specified formants plus extra formants on top
e = getSpectralEnvelope(nr = 512, nc = 50,
formants = soundgen:::convertStringToFormants('a'),
temperature = 0.1, formantDepStoch = 20, plot = TRUE)
# a schwa based on the length of vocal tract = 15.5 cm
e = getSpectralEnvelope(nr = 512, nc = 50, formants = NA,
temperature = .1, vocalTract = 15.5, plot = TRUE)
# no formants at all, only lip radiation
e = getSpectralEnvelope(nr = 512, nc = 50,
formants = NA, temperature = 0, plot = TRUE)
# mouth opening
e = getSpectralEnvelope(nr = 512, nc = 50,
vocalTract = 16, plot = TRUE, lipRad = 6, noseRad = 4,
mouthAnchors = data.frame(time = c(0, .5, 1), value = c(0, 0, .5)))
# scale formant amplitude and/or bandwidth
e = getSpectralEnvelope(nr = 512, nc = 50,
formants = soundgen:::convertStringToFormants('a'),
formantWidth = 2, formantDep = .5,
temperature = 0, plot = TRUE)
# manual specification of formants
e = getSpectralEnvelope(nr = 512, nc = 50, plot = TRUE, samplingRate = 16000,
  formants = list(
    f1 = data.frame(time = c(0, 1), freq = c(900, 500),
                    amp = 20, width = c(80, 50)),
    f2 = data.frame(time = c(0, 1), freq = c(1200, 2500),
                    amp = 20, width = 100),
    f3 = data.frame(time = 0, freq = 2900,
                    amp = 20, width = 120)))
# }
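The mouth opening example above passes only three anchors, which have to be stretched to nc time steps. The sketch below shows what the 'approx' and 'spline' options of interpol could look like using base R's approx() and spline(). It is an illustration under stated assumptions, not the package's internal code: the clipping to [0, 1] and the threshold comparison are assumptions based on the argument descriptions, and 'loess' is left out because three anchors are too few for a direct call to loess().

```r
# A sketch only: soundgen's internal upsampling may differ.
anchors = data.frame(time = c(0, .5, 1), value = c(0, 0, .5))
nc = 50               # number of time steps, as in the examples above
mouthOpenThres = 0.2  # default threshold from the usage section

# Evaluate the contour at nc equally spaced time points
timeOut = seq(min(anchors$time), max(anchors$time), length.out = nc)

# 'approx' = linear interpolation between anchors
mouthLinear = approx(anchors$time, anchors$value, xout = timeOut)$y

# 'spline' = cubic spline; it can overshoot the anchors,
# so the result is clipped to the valid range of 0 to 1 (an assumption)
mouthSpline = spline(anchors$time, anchors$value, xout = timeOut)$y
mouthSpline = pmin(pmax(mouthSpline, 0), 1)

# Time steps at which the lips count as open, i.e. where lip radiation
# would replace nose radiation according to the mouthOpenThres description
lipsOpen = mouthLinear > mouthOpenThres
```

Here lipsOpen is FALSE at the start, while the mouth is closed, and TRUE by the end, where the opening reaches 0.5.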