soundgen (version 1.5.0)

analyzeFrame: Analyze fft frame

Description

Internal soundgen function.

Usage

analyzeFrame(frame, autoCorrelation = NULL, samplingRate = 44100,
  scaleCorrection = 1, trackPitch = TRUE, pitchMethods = c("autocor",
  "cep", "spec", "dom"), cutFreq = 6000, domThres = 0.1,
  domSmooth = 220, autocorThres = 0.75, autocorSmooth = NULL,
  cepThres = 0.45, cepSmooth = 3, cepZp = 2^13, specThres = 0.45,
  specPeak = 0.8, specSinglePeakCert = 0.6, specSmooth = 100,
  specHNRslope = 0.1, specMerge = 1, pitchFloor = 75,
  pitchCeiling = 3500, nCands = 1)

Arguments

frame

the real part of the spectrum of a frame, as returned by fft

autoCorrelation

pre-calculated autocorrelation of the input frame (passing it in is computationally more efficient than calculating it inside this function)
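As an illustration of what a pre-calculated autocorrelation can look like, the sketch below computes it for a synthetic frame with stats::acf and reads a pitch estimate off the strongest peak. The test signal, the 16 kHz sampling rate, and the narrowed 75 to 1000 Hz search range are assumptions made for this example only; how soundgen itself prepares the autocorrelation (windowing, normalization) is not shown here.

```r
# Hypothetical input: a 50-ms frame of a 200 Hz sine sampled at 16 kHz
samplingRate <- 16000
t <- seq(0, 0.05, by = 1 / samplingRate)
frameSignal <- sin(2 * pi * 200 * t)

# Search range for this example only (narrower than the function defaults)
pitchFloor <- 75
pitchCeiling <- 1000
minLag <- ceiling(samplingRate / pitchCeiling)  # shortest period of interest
maxLag <- floor(samplingRate / pitchFloor)      # longest period of interest

# stats::acf returns the autocorrelation at lags 0..lag.max,
# so the value for lag k is stored at index k + 1
ac <- as.numeric(stats::acf(frameSignal, lag.max = maxLag, plot = FALSE)$acf)

# Take the lag with the highest autocorrelation inside the search range
lags <- minLag:maxLag
bestLag <- lags[which.max(ac[lags + 1])]
f0 <- samplingRate / bestLag
```

With a narrow search range the global maximum is enough; over a wider range, a real pitch tracker would need proper peak detection, because very short lags are also highly correlated.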

samplingRate

sampling rate (Hz)

trackPitch

if TRUE, attempt to find F0 in this frame (set to FALSE if entropy is above a threshold specified in analyze)

pitchMethods

methods of pitch estimation to consider for determining pitch contour: 'autocor' = autocorrelation (~PRAAT), 'cep' = cepstral, 'spec' = spectral (~BaNa), 'dom' = lowest dominant frequency band ('' or NULL = no pitch analysis)

cutFreq

(>0 to Nyquist, Hz) repeat the calculation of spectral descriptives after discarding all info above cutFreq. Recommended if the original sampling rate varies across different analyzed audio files
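A minimal sketch of the idea behind cutFreq: recompute a spectral descriptive after dropping all bins above the cutoff. The bin layout, the random placeholder spectrum, and the helper spectralCentroid are hypothetical and only serve the illustration; they are not taken from soundgen.

```r
samplingRate <- 44100
cutFreq <- 6000
nBins <- 512  # hypothetical number of bins from 0 Hz to Nyquist

# Assumed layout: bins evenly spaced from 0 Hz to the Nyquist frequency
binFreqs <- seq(0, samplingRate / 2, length.out = nBins)

set.seed(1)
amp <- runif(nBins)  # placeholder magnitudes, not real audio

# Hypothetical helper: amplitude-weighted mean frequency
spectralCentroid <- function(freqs, amp) sum(freqs * amp) / sum(amp)

# Descriptive over the full spectrum
centroidFull <- spectralCentroid(binFreqs, amp)

# Same descriptive after discarding everything above cutFreq
keep <- binFreqs <= cutFreq
centroidCut <- spectralCentroid(binFreqs[keep], amp[keep])
```

Because the truncated version ignores everything above cutFreq, it stays comparable across files recorded at different sampling rates, as long as cutFreq is below the lowest Nyquist frequency among them.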

domThres

(0 to 1) to find the lowest dominant frequency band, we do short-term FFT and take the lowest frequency with amplitude at least domThres
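The sketch below shows one way to read this description: normalize the spectrum, find local peaks, and take the lowest-frequency peak whose amplitude reaches domThres. The synthetic spectrum, the normalization to the maximum, and the simple peak detection are assumptions for illustration; smoothing over domSmooth is omitted and soundgen's actual procedure may differ.

```r
# Hypothetical magnitude spectrum: weak peak at 300 Hz, strong peak at 900 Hz
freqs <- seq(0, 5000, by = 10)
amp <- 0.05 * exp(-((freqs - 300) / 30)^2) +
  1.00 * exp(-((freqs - 900) / 30)^2)

# Assumption: the threshold is relative to the strongest bin
ampNorm <- amp / max(amp)
domThres <- 0.1

# A bin is a local peak if the slope changes from rising to falling
isPeak <- c(FALSE, diff(sign(diff(ampNorm))) == -2, FALSE)

# Lowest-frequency peak that passes the threshold
dom <- freqs[which(isPeak & ampNorm >= domThres)[1]]
```

Here the 300 Hz peak is ignored because its normalized amplitude (about 0.05) is below domThres, so the lowest dominant band is found at 900 Hz.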

domSmooth

the width of smoothing interval (Hz) for finding dom

autocorThres

(0 to 1) voicing threshold for detecting pitch candidates with the autocorrelation method. The three methods have separate thresholds: autocorThres for autocorrelation, cepThres for cepstrum, and specThres for the BaNa algorithm (see Details). Note that HNR is calculated even for unvoiced frames.

autocorSmooth

the width of smoothing interval (in bins) for finding peaks in the autocorrelation function. Defaults to 7 for sampling rate 44100 and smaller odd numbers for lower values of sampling rate

cepThres

(0 to 1) voicing threshold for detecting pitch candidates with the cepstral method. The three methods have separate thresholds: autocorThres for autocorrelation, cepThres for cepstrum, and specThres for the BaNa algorithm (see Details). Note that HNR is calculated even for unvoiced frames.

cepSmooth

the width of smoothing interval (Hz) for finding peaks in the cepstrum

cepZp

zero-padding of the spectrum used for cepstral pitch detection (final length of spectrum after zero-padding in points, e.g. 2 ^ 13)

specThres

(0 to 1) voicing threshold for detecting pitch candidates with the spectral (BaNa) method. The three methods have separate thresholds: autocorThres for autocorrelation, cepThres for cepstrum, and specThres for the BaNa algorithm (see Details). Note that HNR is calculated even for unvoiced frames.

specPeak

when looking for putative harmonics in the spectrum, the threshold for peak detection is calculated as specPeak * (1 - HNR * specHNRslope)

specSinglePeakCert

(0 to 1) if F0 is calculated based on a single harmonic ratio (as opposed to several ratios converging on the same candidate), its certainty is taken to be specSinglePeakCert

specSmooth

the width of window for detecting peaks in the spectrum, Hz

specHNRslope

when looking for putative harmonics in the spectrum, the threshold for peak detection is calculated as specPeak * (1 - HNR * specHNRslope)
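A short worked example of the threshold formula specPeak * (1 - HNR * specHNRslope), using the default values of specPeak and specHNRslope. The HNR values are hypothetical, and the assumption that HNR is expressed in dB is not stated on this page.

```r
specPeak <- 0.8       # default from Usage
specHNRslope <- 0.1   # default from Usage
HNR <- c(0, 2, 5)     # hypothetical HNR values (assumed to be in dB)

# Threshold for peak detection, one value per HNR
peakThreshold <- specPeak * (1 - HNR * specHNRslope)
print(peakThreshold)
```

The threshold falls linearly as HNR rises, so weaker spectral peaks are accepted as putative harmonics in more tonal frames. With these defaults it would reach zero at HNR = 10.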

specMerge

pitch candidates within specMerge semitones are merged with boosted certainty
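To make the semitone criterion concrete, the sketch below measures the distance between neighboring candidates as 12 * log2(f2 / f1) and groups those closer than specMerge. The candidate values and the choice to average merged candidates are assumptions; how soundgen combines them and boosts their certainty is not shown.

```r
cands <- sort(c(220, 226, 440))  # hypothetical pitch candidates, Hz
specMerge <- 1                   # default from Usage, in semitones

# Distance in semitones between each candidate and the next higher one
gaps <- 12 * log2(cands[-1] / cands[-length(cands)])

# Start a new group whenever the gap exceeds specMerge
group <- cumsum(c(1, gaps > specMerge))

# Assumption: a merged candidate is the mean of its group
merged <- as.numeric(tapply(cands, group, mean))
```

In this example 220 Hz and 226 Hz are less than half a semitone apart and collapse into one candidate, while 440 Hz, almost an octave higher, stays separate.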

pitchFloor

absolute lower bound for pitch candidates (Hz)

pitchCeiling

absolute upper bound for pitch candidates (Hz)

nCands

maximum number of pitch candidates per method (except for dom, which returns at most one candidate per frame), normally 1 to 4

Value

Returns a list with two components: $pitchCands_frame contains pitch candidates for the frame, and $summaries contains other acoustic predictors like HNR, specSlope, etc.

Details

This function performs the heavy lifting of pitch tracking and acoustic analysis in general: it takes the spectrum of a single FFT frame as input and analyzes it.