whisper (version 0.1.0)

audio_to_mel: Convert Audio to Mel Spectrogram

Description

The main preprocessing function: converts an audio file or a vector of audio samples into the mel spectrogram format that Whisper expects as input.

Usage

audio_to_mel(file, n_mels = 80L, device = "auto", dtype = "auto")

Value

A torch tensor of shape (1, n_mels, 3000) for 30 s of audio.
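
The 3000 in the shape above is the number of spectrogram frames in a 30 s window. A minimal sketch of the arithmetic follows, assuming the framing used by the reference Whisper preprocessing (16 kHz audio, a hop of 160 samples per frame). The variable names are illustrative, and the constants are assumptions taken from the reference implementation, not values read from this package.

```r
# Assumed constants from the reference Whisper preprocessing
# (not read from this package)
sample_rate   <- 16000L  # samples per second
hop_length    <- 160L    # samples between successive frames (10 ms)
chunk_seconds <- 30L     # length of one Whisper input window

# Number of spectrogram frames in one window
n_frames <- (chunk_seconds * sample_rate) %/% hop_length
n_frames  # 3000
```

Under these assumptions each frame covers 10 ms of audio, so the last dimension of the tensor indexes time in 10 ms steps.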

Arguments

file

Path to an audio file, or a numeric vector of audio samples.

n_mels

Number of mel bins: 80 for most models, 128 for large-v3. Defaults to 80L.

device

torch device for the output tensor. Defaults to "auto".

dtype

torch dtype for the output tensor. Defaults to "auto".

Examples

library(whisper)

# Convert the sample audio file shipped with the package to a mel spectrogram
audio_file <- system.file("audio", "jfk.mp3", package = "whisper")
mel <- audio_to_mel(audio_file)

# Inspect the dimensions of the returned tensor
dim(mel)