Main preprocessing function that converts audio to the mel spectrogram format expected by Whisper.
audio_to_mel(file, n_mels = 80L, device = "auto", dtype = "auto")
Returns a torch tensor of shape (1, n_mels, 3000) for 30 seconds of audio.
file: Path to an audio file, or a numeric vector of audio samples
n_mels: Number of mel bins (80 for most models, 128 for large-v3)
device: torch device for the output tensor
dtype: torch dtype for the output tensor
# \donttest{
# Convert audio file to mel spectrogram
audio_file <- system.file("audio", "jfk.mp3", package = "whisper")
mel <- audio_to_mel(audio_file)
dim(mel)
# }
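The (1, n_mels, 3000) shape for 30 seconds of audio follows from the front-end settings of the reference Whisper implementation: audio sampled at 16 kHz and a hop of 160 samples between frames. The sketch below reproduces only that arithmetic in base R. The variable names are illustrative, and it assumes, without checking, that this package uses the same constants.

```r
# Constants from the reference Whisper front end (assumed to match this package)
sample_rate <- 16000L  # samples per second
hop_length  <- 160L    # samples between successive frames
chunk_secs  <- 30L     # fixed window length in seconds

n_samples <- sample_rate * chunk_secs   # samples in one 30 s window
n_frames  <- n_samples %/% hop_length   # mel frames per window

# Expected tensor shape for the default n_mels = 80
expected_shape <- c(1L, 80L, n_frames)
print(expected_shape)
```

Here n_frames evaluates to 3000. Comparing expected_shape against dim(mel) from the example is one possible sanity check on the preprocessing output.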