soundgen (version 1.5.0)

spectrogram: Spectrogram

Description

Produces the spectrogram of a sound using the short-time Fourier transform (STFT). Inspired by spectro from the seewave package, this function adds routines for noise reduction, smoothing in the time and frequency domains, manual control of contrast and brightness, plotting the oscillogram on a dB scale, an optional grid, etc.

Usage

spectrogram(x, samplingRate = NULL, dynamicRange = 80,
  windowLength = 50, step = NULL, overlap = 70, wn = "gaussian",
  zp = 0, normalize = TRUE, smoothFreq = 0, smoothTime = 0,
  qTime = 0, percentNoise = 10, noiseReduction = 0, contrast = 0.2,
  brightness = 0, method = c("spectrum", "spectralDerivative")[1],
  output = c("none", "original", "processed", "complex")[1],
  ylim = NULL, plot = TRUE, osc = FALSE, osc_dB = FALSE,
  heights = c(3, 1), colorTheme = c("bw", "seewave", "...")[1],
  xlab = "Time, ms", ylab = "Frequency, kHz", mar = c(5.1, 4.1, 4.1,
  2), main = "", grid = NULL, frameBank = NULL, duration = NULL,
  ...)

Arguments

x

path to a .wav or .mp3 file or a vector of amplitudes with specified samplingRate

samplingRate

sampling rate of x (only needed if x is a numeric vector, rather than an audio file)

dynamicRange

dynamic range, dB. All values more than dynamicRange dB below the maximum are treated as zero
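
As a rough illustration of this thresholding, here is a minimal base-R sketch. It assumes that levels are computed as 20 * log10 of amplitude relative to the maximum; the toy values and variable names are illustrative only and are not taken from soundgen's internals.

```r
# Toy magnitude spectrum (arbitrary linear units); values are illustrative only
magnitude = c(1, 0.5, 0.01, 0.00001)
dynamicRange = 80

# Convert to dB relative to the maximum, so the loudest bin sits at 0 dB
dB = 20 * log10(magnitude / max(magnitude))

# Assumed rule: anything more than dynamicRange dB below the maximum becomes zero
clipped = magnitude
clipped[dB < -dynamicRange] = 0
```

Only the last bin, roughly 100 dB below the maximum, is zeroed here; with dynamicRange = 120 it would be kept.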

windowLength

length of FFT window, ms

step

you can override overlap by specifying FFT step, ms

overlap

overlap between successive FFT frames, %
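
The relation between step and overlap can be sketched as follows. This assumes the usual convention that the step equals the window length times the non-overlapping fraction; the helper stepFromOverlap is hypothetical and not part of soundgen, and the frame count ignores any padding the function may apply.

```r
# Hypothetical helper (not part of soundgen): converts overlap (%) to step (ms),
# assuming step = windowLength * (1 - overlap / 100)
stepFromOverlap = function(windowLength, overlap) {
  windowLength * (1 - overlap / 100)
}

# With the defaults shown in Usage (windowLength = 50, overlap = 70)
step_ms = stepFromOverlap(windowLength = 50, overlap = 70)

# Approximate number of full frames that fit into a 1000-ms sound
nFrames = floor((1000 - 50) / step_ms) + 1
```

Under this assumption, the default settings correspond to a step of about 15 ms, so specifying step = 15 directly should be roughly equivalent to overlap = 70.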

wn

window type: gaussian, hanning, hamming, bartlett, rectangular, blackman, flattop

zp

window length after zero padding, points

normalize

if TRUE, scales input prior to FFT

smoothFreq, smoothTime

length of the window, in data points (0 to +inf), for calculating a rolling median. Applies median smoothing to spectrogram in frequency and time domains, respectively
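
To show what rolling-median smoothing does to a spectrogram matrix, here is a minimal sketch using stats::runmed from base R. This is a stand-in for illustration: soundgen may implement its smoothing differently, and runmed requires an odd window length.

```r
# Toy spectrogram: rows = frequency bins, columns = time frames
set.seed(1)
spec = matrix(abs(rnorm(7 * 9)), nrow = 7, ncol = 9)
spec[4, 5] = 100  # an isolated spike that median smoothing should suppress

# stats::runmed() needs an odd window length k
smoothTime = 3
smoothFreq = 3

# Smooth along time (within each frequency bin) ...
smoothed = t(apply(spec, 1, stats::runmed, k = smoothTime))
# ... then along frequency (within each time frame)
smoothed = apply(smoothed, 2, stats::runmed, k = smoothFreq)
```

The smoothed matrix keeps the original dimensions, while the single-cell spike is replaced by the median of its neighbors.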

qTime

the quantile to be subtracted for each frequency bin. For example, if qTime = 0.5, the median of each frequency bin (over the entire sound duration) will be calculated and subtracted from each frame (see examples)
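
The per-bin quantile subtraction can be sketched in base R as below. The toy matrix is illustrative, and flooring negative values at zero is an assumption of this sketch rather than documented behavior.

```r
# Toy spectrogram: rows = frequency bins, columns = time frames
spec = rbind(
  c(1, 1, 1, 5),  # constant background of 1 with one loud frame
  c(2, 4, 6, 8)   # steadily increasing energy
)
qTime = 0.5

# Quantile of each frequency bin over the whole duration
binQuantile = apply(spec, 1, stats::quantile, probs = qTime)

# Subtract it from every frame; flooring at zero is an assumption of this sketch
denoised = sweep(spec, 1, binQuantile, FUN = "-")
denoised[denoised < 0] = 0
```

In the first bin the constant background disappears and only the loud frame survives, which is the intended effect of a high qTime on stationary noise.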

percentNoise

percentage of frames (0 to 100%) used for calculating noise spectrum

noiseReduction

how much noise to remove (0 to +inf, recommended 0 to 2). 0 = no noise reduction, 2 = strong noise reduction: spectrum - (noiseReduction * noiseSpectrum), where noiseSpectrum is the average spectrum of frames with entropy exceeding the quantile set by percentNoise
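
The formula above can be sketched in base R as follows. This is an illustration under stated assumptions, not soundgen's actual code: the helper frameEntropy (Shannon entropy of the normalized spectrum), the toy data, and the flooring at zero are all hypothetical choices.

```r
# Toy magnitude spectrogram: rows = frequency bins, columns = time frames
set.seed(42)
nBins = 16
nFrames = 20
spec = matrix(0.01, nrow = nBins, ncol = nFrames)
spec[3, 1:18] = 1                           # tonal frames: energy in one bin
spec[, 19:20] = runif(nBins * 2, 0.4, 0.6)  # noisy frames: nearly flat spectrum

# Hypothetical helper: Shannon entropy of a frame's normalized spectrum
frameEntropy = function(s) {
  p = s / sum(s)
  -sum(p * log2(p))
}
entropy = apply(spec, 2, frameEntropy)

percentNoise = 10
noiseReduction = 1

# Frames whose entropy exceeds the (1 - percentNoise / 100) quantile count as noise
threshold = stats::quantile(entropy, probs = 1 - percentNoise / 100)
noisyFrames = which(entropy > threshold)

# Average spectrum of the noisy frames, scaled and subtracted from every frame
noiseSpectrum = rowMeans(spec[, noisyFrames, drop = FALSE])
denoised = spec - noiseReduction * noiseSpectrum
denoised[denoised < 0] = 0  # flooring at zero is an assumption of this sketch
```

The two flat-spectrum frames are picked as noise, and after subtraction the tonal component in bin 3 remains while the low-level background is removed.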

contrast

spectrum is exponentiated by contrast (-inf to +inf, recommended -1 to +1). Contrast >0 increases sharpness, <0 decreases sharpness

brightness

how much to "lighten" the image (>0 = lighter, <0 = darker)

method

plot spectrum ('spectrum') or spectral derivative ('spectralDerivative')

output

specifies what to return: nothing ('none'), unmodified spectrogram ('original'), denoised and/or smoothed spectrogram ('processed'), or unmodified spectrogram with the imaginary part giving phase ('complex')

ylim

frequency range to plot, kHz (defaults to 0 to Nyquist frequency)

plot

should a spectrogram be plotted? TRUE / FALSE

osc, osc_dB

should an oscillogram be shown under the spectrogram? TRUE / FALSE. If osc_dB = TRUE, the oscillogram is displayed on a dB scale. See the osc_dB function for details

heights

a vector of length two specifying the relative height of the spectrogram and the oscillogram

colorTheme

black and white ('bw'), as in the seewave package ('seewave'), or any palette function from grDevices, such as 'heat.colors', 'cm.colors', etc

xlab, ylab, main, mar

graphical parameters

grid

if numeric, adds n = grid dotted lines per kHz

frameBank

ignore (only needed for pitch tracking)

duration

ignore (only needed for pitch tracking)

...

other graphical parameters

Value

Returns nothing (if output = 'none'), the absolute (not power!) spectrum (if output = 'original'), the denoised and/or smoothed spectrum (if output = 'processed'), or spectral derivatives (if method = 'spectralDerivative') as a matrix of real numbers.
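
A short sketch of capturing the returned matrix instead of only plotting it. It uses the arguments listed under Usage and assumes the soundgen package is installed; the variable names are illustrative.

```r
library(soundgen)

# Synthesize a short test sound (16000 Hz, as in the Examples)
sound = soundgen(sylLen = 300, temperature = 0.001)

# Request the processed spectrogram as a matrix and suppress the plot
spec = spectrogram(sound, samplingRate = 16000,
                   output = 'processed', plot = FALSE)

# Inspect the result: a numeric matrix of spectral values
dim(spec)
range(spec)
```

With the default output = 'none', the same call would return nothing and be useful only for its plot.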

Examples

# NOT RUN {
# synthesize a sound 1 s long, with gradually increasing hissing noise
sound = soundgen(sylLen = 1000, temperature = 0.001, noise = list(
  time = c(0, 1300), value = c(-40, 0)), formantsNoise = list(
  f1 = list(freq = 5000, width = 10000)))
# playme(sound, samplingRate = 16000)

# basic spectrogram
spectrogram(sound, samplingRate = 16000)

# add bells and whistles
spectrogram(sound, samplingRate = 16000,
  osc = TRUE, noiseReduction = 1.1,
  brightness = -1, colorTheme = 'heat.colors',
  ylim = c(0, 5), cex.lab = .75,
  main = 'My spectrogram',
  yaxp = c(0, 8, 16),  # tip: for base graphics, see ?par
  grid = 5  # tip: to customize, add manually with graphics::grid()
)

# }
# NOT RUN {
# change dynamic range
spectrogram(sound, samplingRate = 16000, dynamicRange = 40)
spectrogram(sound, samplingRate = 16000, dynamicRange = 120)

# add an oscillogram
spectrogram(sound, samplingRate = 16000, osc = TRUE)

# oscillogram on a dB scale, same height as spectrogram
spectrogram(sound, samplingRate = 16000,
            osc_dB = TRUE, heights = c(1, 1))

# broad-band instead of narrow-band
spectrogram(sound, samplingRate = 16000, windowLength = 5)

# focus only on values in the upper 5% for each frequency bin
spectrogram(sound, samplingRate = 16000, qTime = 0.95)

# detect 10% of the noisiest frames based on entropy and remove the pattern
# found in those frames (in this case, breathing)
spectrogram(sound, samplingRate = 16000, noiseReduction = 1.1,
  brightness = -2)  # white noise attenuated

# apply median smoothing in both time and frequency domains
spectrogram(sound, samplingRate = 16000, smoothFreq = 5,
  smoothTime = 5)

# increase contrast, reduce brightness
spectrogram(sound, samplingRate = 16000, contrast = 1, brightness = -1)
# }
