Produces the spectrogram of a sound using the short-time Fourier transform (STFT).
Inspired by spectro from the seewave package, this function adds routines for
noise reduction, smoothing in the time and frequency domains, manual control of
contrast and brightness, plotting the oscillogram on a dB scale, a grid, etc.
spectrogram(
  x,
  samplingRate = NULL,
  dynamicRange = 80,
  windowLength = 50,
  step = NULL,
  overlap = 70,
  wn = "gaussian",
  zp = 0,
  normalize = TRUE,
  scale = NULL,
  smoothFreq = 0,
  smoothTime = 0,
  qTime = 0,
  percentNoise = 10,
  noiseReduction = 0,
  contrast = 0.2,
  brightness = 0,
  method = c("spectrum", "spectralDerivative")[1],
  output = c("original", "processed", "complex")[1],
  ylim = NULL,
  yScale = c("linear", "log")[1],
  plot = TRUE,
  osc = FALSE,
  osc_dB = FALSE,
  heights = c(3, 1),
  padWithSilence = TRUE,
  colorTheme = c("bw", "seewave", "heat.colors", "...")[1],
  units = c("ms", "kHz"),
  xlab = paste("Time,", units[1]),
  ylab = paste("Frequency,", units[2]),
  mar = c(5.1, 4.1, 4.1, 2),
  main = "",
  grid = NULL,
  frameBank = NULL,
  duration = NULL,
  pitch = NULL,
  ...
)
x: path to a .wav or .mp3 file or a vector of amplitudes with specified samplingRate
samplingRate: sampling rate of x (only needed if x is a numeric vector, rather than an audio file)
dynamicRange: dynamic range, dB. All values more than one dynamicRange under maximum are treated as zero
windowLength: length of FFT window, ms
step: you can override overlap by specifying FFT step, ms
overlap: overlap between successive FFT frames, %
wn: window type: gaussian, hanning, hamming, bartlett, rectangular, blackman, flattop
zp: window length after zero padding, points
normalize: if TRUE, scales input prior to FFT
scale: maximum possible amplitude of input used for normalization of input vector (not needed if input is an audio file)
smoothFreq, smoothTime: length of the window, in data points (0 to +inf), for calculating a rolling median. Applies median smoothing to spectrogram in frequency and time domains, respectively
qTime: the quantile to be subtracted for each frequency bin. For example, if qTime = 0.5, the median of each frequency bin (over the entire sound duration) will be calculated and subtracted from each frame (see examples)
percentNoise: percentage of frames (0 to 100%) used for calculating noise spectrum
noiseReduction: how much noise to remove (0 to +inf, recommended 0 to 2). 0 = no noise reduction, 2 = strong noise reduction. The noise spectrum is estimated from the noisiest frames, whose share is set by percentNoise
contrast: spectrum is exponentiated by contrast (-inf to +inf, recommended -1 to +1). Contrast >0 increases sharpness, <0 decreases sharpness
brightness: how much to "lighten" the image (>0 = lighter, <0 = darker)
method: plot spectrum ('spectrum') or spectral derivative ('spectralDerivative')
output: specifies what to return: nothing ('none'), unmodified spectrogram ('original'), denoised and/or smoothed spectrogram ('processed'), or unmodified spectrogram with the imaginary part giving phase ('complex')
ylim: frequency range to plot, kHz (defaults to 0 to Nyquist frequency)
yScale: scale of the frequency axis: 'linear' = linear, 'log' = logarithmic
plot: should a spectrogram be plotted? TRUE / FALSE
osc, osc_dB: should an oscillogram be shown under the spectrogram? TRUE / FALSE. If osc_dB is TRUE, the oscillogram is displayed on a dB scale. See osc_dB for details
heights: a vector of length two specifying the relative height of the spectrogram and the oscillogram (including time axes labels)
padWithSilence: if TRUE, pads the sound with just enough silence to resolve the edges properly (only the original region is plotted, so apparent duration doesn't change)
colorTheme: black and white ('bw'), as in seewave package ('seewave'), or any palette from palette such as 'heat.colors', 'cm.colors', etc
units: c('ms', 'kHz') is the default, and anything else is interpreted as s (for time) and Hz (for frequency)
xlab, ylab, main, mar: graphical parameters
grid: if numeric, adds n = grid dotted lines per kHz
frameBank, duration, pitch: ignore (only used internally)
...: other graphical parameters
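As a rough illustration of how windowLength, overlap, and step relate, the base-R sketch below converts the default settings into a step in ms and a window size in samples. The helper frame_params is hypothetical and not part of soundgen; the formula step = windowLength * (1 - overlap / 100) is the usual definition of percentage overlap and is assumed here rather than taken from the package source.

```r
# Hypothetical helper (not part of soundgen): relate window length, overlap and step
frame_params = function(samplingRate, windowLength = 50, overlap = 70, step = NULL) {
  if (is.null(step)) {
    # Assumed definition of percentage overlap between successive frames
    step = windowLength * (1 - overlap / 100)   # ms between frame onsets
  } else {
    # A user-supplied step overrides overlap, as described for the step argument
    overlap = 100 * (1 - step / windowLength)   # implied overlap, %
  }
  list(
    step_ms = step,
    overlap_pct = overlap,
    window_points = round(samplingRate * windowLength / 1000)
  )
}

p = frame_params(samplingRate = 16000)
p$step_ms        # 15 ms with the default 50 ms window and 70% overlap
p$window_points  # 800 samples at 16 kHz
```

Under the same assumption, frame_params(16000, step = 25) implies an overlap of 50%.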
Returns nothing (if output = 'none'), absolute - not power! - spectrum (if output = 'original'), denoised and/or smoothed spectrum (if output = 'processed'), or spectral derivatives (if method = 'spectralDerivative') as a matrix of real numbers.
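To make the distinction between an absolute (magnitude) spectrum and a power spectrum concrete, here is a minimal base-R sketch for a single frame. It is a generic illustration, not soundgen's internal code: the hand-written Hann window, the variable names, and the reading of dynamicRange as a threshold in dB below the peak are all assumptions.

```r
# One 50 ms frame (800 points) of a 1 kHz sine sampled at 16 kHz
samplingRate = 16000
n = 800
time_s = (0:(n - 1)) / samplingRate
frame = sin(2 * pi * 1000 * time_s)

# Hann window computed by hand to stay within base R
w = 0.5 - 0.5 * cos(2 * pi * (0:(n - 1)) / (n - 1))
spec = fft(frame * w)[1:(n / 2)]   # keep bins below the Nyquist frequency

magnitude = abs(spec)   # absolute spectrum
power = magnitude ^ 2   # power spectrum, for comparison

# Both lead to the same dB values relative to the maximum
dB_from_magnitude = 20 * log10(magnitude / max(magnitude))
dB_from_power = 10 * log10(power / max(power))

# Assumed reading of dynamicRange: zero everything more than 80 dB below the peak
dynamicRange = 80
clipped = ifelse(dB_from_magnitude < -dynamicRange, 0, magnitude)

freqs_kHz = (0:(n / 2 - 1)) * samplingRate / n / 1000
freqs_kHz[which.max(magnitude)]   # peak at 1 kHz
```

If a power spectrum is needed downstream, squaring the returned magnitudes should be enough under these assumptions.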
Many soundgen functions call spectrogram, and you can pass along most of its
graphical parameters from functions like soundgen, analyze, etc. However, in
some cases this will not work (e.g. for "units") or may produce unexpected
results. If in doubt, omit extra graphical parameters.
# NOT RUN {
# synthesize a sound 1 s long, with gradually increasing hissing noise
sound = soundgen(sylLen = 500, temperature = 0.001, noise = list(
  time = c(0, 650), value = c(-40, 0)), formantsNoise = list(
  f1 = list(freq = 5000, width = 10000)))
# playme(sound, samplingRate = 16000)
# basic spectrogram
spectrogram(sound, samplingRate = 16000)
# }
# NOT RUN {
# add bells and whistles
spectrogram(sound, samplingRate = 16000,
  osc = TRUE,  # plot oscillogram under the spectrogram
  noiseReduction = 1.1,  # subtract the spectrum of noisy parts
  brightness = -1,  # reduce brightness
  colorTheme = 'heat.colors',  # pick color theme
  cex.lab = .75, cex.axis = .75,  # text size and other base graphics pars
  grid = 5,  # lines per kHz; to customize, add manually with graphics::grid()
  units = c('s', 'Hz'),  # plot in s or ms, Hz or kHz
  ylim = c(0, 5000),  # in specified units (Hz)
  main = 'My spectrogram' # title
  # + axis labels, etc
)
# change dynamic range
spectrogram(sound, samplingRate = 16000, dynamicRange = 40)
spectrogram(sound, samplingRate = 16000, dynamicRange = 120)
# add an oscillogram
spectrogram(sound, samplingRate = 16000, osc = TRUE)
# oscillogram on a dB scale, same height as spectrogram
spectrogram(sound, samplingRate = 16000,
  osc_dB = TRUE, heights = c(1, 1))
# frequencies on a logarithmic scale
spectrogram(sound, samplingRate = 16000,
  yScale = 'log', ylim = c(.05, 8))
# broad-band instead of narrow-band
spectrogram(sound, samplingRate = 16000, windowLength = 5)
# focus only on values in the upper 5% for each frequency bin
spectrogram(sound, samplingRate = 16000, qTime = 0.95)
# detect 10% of the noisiest frames based on entropy and remove the pattern
# found in those frames (in this case, breathing)
spectrogram(sound, samplingRate = 16000, noiseReduction = 1.1,
  brightness = -2)  # white noise attenuated
# apply median smoothing in both time and frequency domains
spectrogram(sound, samplingRate = 16000, smoothFreq = 5,
  smoothTime = 5)
# increase contrast, reduce brightness
spectrogram(sound, samplingRate = 16000, contrast = 1, brightness = -1)
# specify location of tick marks etc - see ?par() for base graphics
spectrogram(sound, samplingRate = 16000,
  ylim = c(0, 3), yaxp = c(0, 3, 5), xaxp = c(0, 1400, 4))
# }
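The qTime and median-smoothing examples above can be pictured on a toy matrix in base R. This is only a sketch of the idea: the layout (rows = frequency bins, columns = frames) is an assumption, and stats::runmed stands in for whatever rolling-median routine the package actually uses.

```r
set.seed(1)
# Toy "spectrogram": 4 frequency bins (rows) x 9 frames (columns)
s = matrix(runif(36), nrow = 4)
s[2, ] = s[2, ] + 5   # bin 2 carries a constant hum

# qTime = 0.5: subtract each bin's median over time from every frame
qTime = 0.5
q = apply(s, 1, quantile, probs = qTime)
s_q = sweep(s, 1, q, "-")

# smoothTime = 3: rolling median along time within each bin
# (runmed is a stand-in, not necessarily what soundgen uses)
s_smooth = t(apply(s, 1, function(row) stats::runmed(row, k = 3)))

round(apply(s_q, 1, median), 10)   # every bin now has a median of zero
```

Subtracting a per-bin quantile removes steady components such as the hum in bin 2, while the rolling median suppresses isolated spikes without shifting the overall level.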