analyzeFolder: Analyze folder

Description

Acoustic analysis of all wav/mp3 files in a folder. See analyze and vignette('acoustic_analysis', package = 'soundgen') for further details. See pitch_app for a more realistic workflow: extract manually corrected pitch contours with pitch_app(), then run analyzeFolder() with these manual contours.

Usage

analyzeFolder(
  myfolder,
  htmlPlots = TRUE,
  verbose = TRUE,
  samplingRate = NULL,
  dynamicRange = 80,
  silence = 0.04,
  SPL_measured = 70,
  Pref = 2e-05,
  windowLength = 50,
  step = NULL,
  overlap = 50,
  wn = "gaussian",
  zp = 0,
  cutFreq = NULL,
  formants = list(verify = FALSE),
  nFormants = 3,
  pitchMethods = c("dom", "autocor"),
  pitchManual = NULL,
  entropyThres = 0.6,
  pitchFloor = 75,
  pitchCeiling = 3500,
  priorMean = 300,
  priorSD = 6,
  nCands = 1,
  minVoicedCands = NULL,
  pitchDom = list(),
  pitchAutocor = list(),
  pitchCep = list(),
  pitchSpec = list(),
  pitchHps = list(),
  harmHeight = list(type = "n"),
  shortestSyl = 20,
  shortestPause = 60,
  interpolWin = 75,
  interpolTol = 0.3,
  interpolCert = 0.3,
  pathfinding = c("none", "fast", "slow")[2],
  annealPars = list(maxit = 5000, temp = 1000),
  certWeight = 0.5,
  snakeStep = 0.05,
  snakePlot = FALSE,
  smooth = 1,
  smoothVars = c("pitch", "dom"),
  summary = TRUE,
  summaryFun = c("mean", "median", "sd"),
  plot = FALSE,
  showLegend = TRUE,
  savePlots = FALSE,
  pitchPlot = list(col = rgb(0, 0, 1, 0.75), lwd = 3, showPrior = TRUE),
  ylim = NULL,
  xlab = "Time, ms",
  ylab = "kHz",
  main = NULL,
  width = 900,
  height = 500,
  units = "px",
  res = NA,
  ...
)

Arguments

myfolder

full path to target folder

htmlPlots

if TRUE, saves an html file with clickable plots

verbose

if TRUE, reports progress and estimated time left

samplingRate

sampling rate of x (only needed if x is a numeric vector, rather than an audio file)

dynamicRange

dynamic range, dB. All values more than one dynamicRange under maximum are treated as zero
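
As a rough illustration of what a dynamic range in dB means, the sketch below converts the default of 80 dB into a linear amplitude ratio. It assumes the common amplitude convention of 20 * log10(ratio); whether soundgen applies the threshold to amplitude or to power is not stated on this page, so treat this as arithmetic only.

```r
# Assumption: dB are defined on amplitude, i.e. dB = 20 * log10(ratio)
dynamicRange <- 80
# Linear ratio between the quietest retained value and the maximum
ratio <- 10^(-dynamicRange / 20)
ratio
```

Under that assumption, values smaller than about one ten-thousandth of the maximum would be treated as zero.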

silence

(0 to 1) frames with RMS amplitude below silence threshold are not analyzed at all. NB: this number is dynamically updated: the actual silence threshold may be higher depending on the quietest frame, but it will never be lower than this specified number.

SPL_measured

sound pressure level at which the sound is presented, dB (set to 0 to skip analyzing subjective loudness)

Pref

reference pressure, Pa

windowLength

length of FFT window, ms

step

you can override overlap by specifying FFT step, ms

overlap

overlap between successive FFT frames, %
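
The relation between windowLength, overlap and step can be sketched with plain arithmetic. This assumes the usual definition of overlap as the percentage of the window shared by successive frames; it is an illustration, not a copy of the package internals.

```r
# Defaults from the Usage section
windowLength <- 50  # ms
overlap <- 50       # %
# Step implied by the overlap when step = NULL (assumed relation)
step <- windowLength * (1 - overlap / 100)
step  # ms between the starts of successive frames
```

With the defaults this gives a step of 25 ms. Passing step explicitly would take precedence over overlap, as described under step.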

wn

window type: gaussian, hanning, hamming, bartlett, rectangular, blackman, flattop

zp

window length after zero padding, points

cutFreq

if specified, spectral descriptives (peakFreq, specCentroid, specSlope, and quartiles) are calculated under cutFreq. Recommended when analyzing recordings with varying sampling rates: set to half the lowest sampling rate to make the spectra more comparable. Note that "entropyThres" applies only to this frequency range, which also affects which frames will not be analyzed with pitchAutocor.

formants

a list of arguments passed to findformants - an external function called to perform LPC analysis

nFormants

the number of formants to extract per STFT frame (0 = no formant analysis)

pitchMethods

methods of pitch estimation to consider for determining pitch contour: 'autocor' = autocorrelation (~PRAAT), 'cep' = cepstral, 'spec' = spectral (~BaNa), 'dom' = lowest dominant frequency band ('' or NULL = no pitch analysis)

pitchManual

normally the output of pitch_app containing a manually corrected pitch contour, ideally with the same windowLength and step as current call to analyzeFolder; a dataframe with at least two columns: "file" (w/o path) and "pitch" (character like "NA, 150, 175, NA")
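
A minimal sketch of the expected pitchManual layout, built by hand in base R. The file names and pitch values are hypothetical, and parse_contour is an illustrative helper rather than a soundgen function.

```r
# Hypothetical file names and pitch values, for illustration only
manual <- data.frame(
  file = c("sound1.wav", "sound2.wav"),                # file names without path
  pitch = c("NA, 150, 175, NA", "210, 220, NA, NA"),   # one string per file
  stringsAsFactors = FALSE
)

# Illustrative helper: turn one comma-separated string into a numeric vector
parse_contour <- function(x) as.numeric(trimws(strsplit(x, ",")[[1]]))

contours <- lapply(manual$pitch, parse_contour)
names(contours) <- manual$file
contours[["sound1.wav"]]
```

Each string holds one value per analysis frame, with NA marking unvoiced frames, which is why matching windowLength and step between pitch_app and analyzeFolder is recommended.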

entropyThres

pitch tracking is not performed for frames with Wiener entropy above entropyThres, but other spectral descriptives are still calculated

pitchFloor

absolute bounds for pitch candidates (Hz)

pitchCeiling

absolute bounds for pitch candidates (Hz)

priorMean

specifies the mean (Hz) and standard deviation (semitones) of gamma distribution describing our prior knowledge about the most likely pitch values for this file. For ex., priorMean = 300, priorSD = 6 gives a prior with mean = 300 Hz and SD = 6 semitones (half an octave)
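
To make the "6 semitones = half an octave" statement concrete, the sketch below converts one standard deviation around the prior mean from semitones to Hz. It only illustrates semitone arithmetic (12 semitones per octave); it does not reproduce the gamma distribution used by the package.

```r
priorMean <- 300  # Hz
priorSD <- 6      # semitones
# A shift of n semitones multiplies frequency by 2^(n / 12)
ratio <- 2^(priorSD / 12)
lower <- priorMean / ratio
upper <- priorMean * ratio
round(c(lower, upper))  # approx. 212 and 424 Hz
```

So one SD on either side of the default prior covers roughly 212 to 424 Hz, a full octave in total.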

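priorSD

specifies the mean (Hz) and standard deviation (semitones) of gamma distribution describing our prior knowledge about the most likely pitch values for this file. For ex., priorMean = 300, priorSD = 6 gives a prior with mean = 300 Hz and SD = 6 semitones (half an octave)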

nCands

maximum number of pitch candidates per method (except for dom, which returns at most one candidate per frame), normally 1...4

minVoicedCands

minimum number of pitch candidates that have to be defined to consider a frame voiced (if NULL, defaults to 2 if dom is among other candidates and 1 otherwise)

pitchDom

a list of control parameters for pitch tracking using the lowest dominant frequency band or "dom" method; see details and ?soundgen:::getDom

pitchAutocor

a list of control parameters for pitch tracking using the autocorrelation or "autocor" method; see details and ?soundgen:::getPitchAutocor

pitchCep

a list of control parameters for pitch tracking using the cepstrum or "cep" method; see details and ?soundgen:::getPitchCep

pitchSpec

a list of control parameters for pitch tracking using the BaNa or "spec" method; see details and ?soundgen:::getPitchSpec

pitchHps

a list of control parameters for pitch tracking using the harmonic product spectrum ("hps") method; see details and ?soundgen:::getPitchHps

harmHeight

a list of control parameters for estimating how high harmonics reach in the spectrum; see details and ?soundgen:::harmHeight

shortestSyl

the smallest length of a voiced segment (ms) that constitutes a voiced syllable (shorter segments will be replaced by NA, as if unvoiced)

shortestPause

the smallest gap between voiced syllables (ms) that means they shouldn't be merged into one voiced syllable

interpolWin

control the behavior of interpolation algorithm when postprocessing pitch candidates. To turn off interpolation, set interpolWin = 0. See soundgen:::pathfinder for details.

interpolTol

control the behavior of interpolation algorithm when postprocessing pitch candidates. To turn off interpolation, set interpolWin = 0. See soundgen:::pathfinder for details.

interpolCert

control the behavior of interpolation algorithm when postprocessing pitch candidates. To turn off interpolation, set interpolWin = 0. See soundgen:::pathfinder for details.

pathfinding

method of finding the optimal path through pitch candidates: 'none' = best candidate per frame, 'fast' = simple heuristic, 'slow' = annealing. See soundgen:::pathfinder

annealPars

a list of control parameters for postprocessing of pitch contour with SANN algorithm of optim. This is only relevant if pathfinding = 'slow'

certWeight

(0 to 1) in pitch postprocessing, specifies how much we prioritize the certainty of pitch candidates vs. pitch jumps / the internal tension of the resulting pitch curve

snakeStep

optimized path through pitch candidates is further processed to minimize the elastic force acting on pitch contour. To disable, set snakeStep = 0

snakePlot

if TRUE, plots the snake

smooth

if smooth is a positive number, outliers of the variables in smoothVars are adjusted with median smoothing. smooth of 1 corresponds to a window of ~100 ms and tolerated deviation of ~4 semitones. To disable, set smooth = 0

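smoothVars

if smooth is a positive number, outliers of the variables in smoothVars are adjusted with median smoothing. smooth of 1 corresponds to a window of ~100 ms and tolerated deviation of ~4 semitones. To disable, set smooth = 0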

summary

if TRUE, returns only a summary of the measured acoustic variables (mean, median and SD). If FALSE, returns a list containing frame-by-frame values

summaryFun

a vector of names of functions used to summarize each acoustic characteristic

plot

if TRUE, produces a spectrogram with pitch contour overlaid

showLegend

if TRUE, adds a legend with pitch tracking methods

savePlots

if TRUE, saves plots as .png files

pitchPlot

a list of graphical parameters for displaying the final pitch contour. Set to list(type = 'n') to suppress

ylim

frequency range to plot, kHz (defaults to 0 to Nyquist frequency)

xlab

plotting parameters

ylab

plotting parameters

main

plotting parameters

width

parameters passed to png if the plot is saved

height

parameters passed to png if the plot is saved

units

parameters passed to png if the plot is saved

res

parameters passed to png if the plot is saved

...

other graphical parameters passed to spectrogram

Value

If summary is TRUE, returns a dataframe with one row per audio file. If summary is FALSE, returns a list of detailed descriptives.

Examples

# NOT RUN {
# download 260 sounds from Anikin & Persson (2017)
# http://cogsci.se/publications/anikin-persson_2017_nonlinguistic-vocs/260sounds_wav.zip
# unzip them into a folder, say '~/Downloads/temp'
myfolder = '~/Downloads/temp'  # 260 .wav files live here
s = analyzeFolder(myfolder, verbose = TRUE)  # ~ 10-20 minutes!
# write.csv(s, paste0(myfolder, '/temp.csv'))  # save a backup

# Check accuracy: import manually verified pitch values (our "key")
# pitchManual   # "ground truth" of mean pitch per sound
# pitchContour  # "ground truth" of complete pitch contours per sound
files_manual = paste0(names(pitchManual), '.wav')
idx = match(s$file, files_manual)  # in case the order is wrong
s$key = pitchManual[idx]

# Compare manually verified mean pitch with the output of analyzeFolder:
cor(s$key, s$pitch_median, use = 'pairwise.complete.obs')
plot(s$key, s$pitch_median, log = 'xy')
abline(a=0, b=1, col='red')

# Re-running analyzeFolder with manually corrected contours gives correct
# pitch-related descriptives like amplVoiced and harmonics (NB: you get it "for
# free" when running pitch_app)
s1 = analyzeFolder(myfolder, verbose = TRUE, pitchManual = pitchContour)
plot(s$harmonics_median, s1$harmonics_median)
abline(a=0, b=1, col='red')

# Save spectrograms with pitch contours plus an html file for easy access
s2 = analyzeFolder('~/Downloads/temp', savePlots = TRUE,
  showLegend = TRUE, pitchManual = pitchContour,
  width = 20, height = 12,
  units = 'cm', res = 300, ylim = c(0, 5))
# }
