whisper (version 0.1.0)

whisper_encoder: Audio Encoder

Description

Full Whisper audio encoder: a convolutional stem, followed by positional encoding and a stack of transformer layers.

Usage

whisper_encoder(n_mels, n_ctx, n_state, n_head, n_layer)

Arguments

n_mels

Number of mel spectrogram bins

n_ctx

Maximum context length, in encoder positions (1500 for 30 s of audio)

n_state

Hidden (embedding) dimension of the transformer layers

n_head

Number of attention heads

n_layer

Number of transformer layers
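
Examples

The sketch below shows how the arguments relate to each other. It is not taken from the package's own examples. The sample rate, hop length, stride-2 halving, and the "tiny" sizes (80 mel bins, hidden dimension 384, 6 heads, 4 layers) come from the original Whisper model and are assumed to apply to this package as well. The requirement that n_state be divisible by n_head is the usual multi-head attention constraint and is likewise an assumption here. The final call assumes that whisper_encoder is exported and that its backend dependencies are installed; it is skipped when the package is not available.

```r
# Front-end constants of the original Whisper model (assumed to apply here)
sample_rate <- 16000L  # audio samples per second
hop_length  <- 160L    # samples between successive mel frames
chunk_secs  <- 30L     # length of one audio window, in seconds

# 30 s of audio yields 3000 mel frames; a stride-2 convolution in the
# conv stem halves that to the encoder context length
n_frames <- (sample_rate * chunk_secs) %/% hop_length  # 3000
n_ctx    <- n_frames %/% 2L                            # 1500

# Dimensions of the "tiny" model (assumption, see the text above)
cfg <- list(n_mels = 80L, n_ctx = n_ctx, n_state = 384L,
            n_head = 6L, n_layer = 4L)

# Each attention head receives an equal slice of the hidden dimension
stopifnot(cfg$n_state %% cfg$n_head == 0L)
head_dim <- cfg$n_state %/% cfg$n_head                 # 64

# Build the encoder only if the package is installed
if (requireNamespace("whisper", quietly = TRUE)) {
  encoder <- do.call(whisper::whisper_encoder, cfg)
}
```

Deriving n_ctx from the audio settings, rather than hard-coding 1500, keeps the value consistent if a different window length is used.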