
transformer (version 0.2.0)

multi_head: Multi-Headed Attention

Description

Computes multi-headed attention from queries, keys, and values, with an optional mask.

Usage

multi_head(Q, K, V, d_model, num_heads, mask = NULL)

Value

multi-headed attention outputs

Arguments

Q

queries

K

keys

V

values

d_model

dimensionality of the model

num_heads

number of heads

mask

optional mask; defaults to NULL
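
Examples

The following is a minimal base R sketch of what multi-headed attention generally computes. It is not the package's own implementation. The names softmax_rows, scaled_dot_attention and multi_head_sketch are hypothetical. The sketch assumes that Q, K and V are matrices with one row per position and d_model columns, that d_model is divisible by num_heads, and that the mask is additive (0 keeps a position, -Inf blocks it). It also omits the learned projection matrices used in a full Transformer layer.

```r
# Hypothetical helpers for illustration; not the transformer package's code.

softmax_rows <- function(x) {
  # Subtract each row's maximum before exponentiating, for numerical stability.
  e <- exp(x - apply(x, 1, max))
  e / rowSums(e)
}

scaled_dot_attention <- function(Q, K, V, mask = NULL) {
  # Similarity of every query with every key, scaled by sqrt(key width).
  scores <- (Q %*% t(K)) / sqrt(ncol(K))
  # Assumed additive mask: 0 keeps a position, -Inf blocks it.
  if (!is.null(mask)) scores <- scores + mask
  softmax_rows(scores) %*% V
}

multi_head_sketch <- function(Q, K, V, d_model, num_heads, mask = NULL) {
  stopifnot(d_model %% num_heads == 0, ncol(Q) == d_model)
  depth <- d_model / num_heads  # columns handled by each head
  heads <- lapply(seq_len(num_heads), function(h) {
    cols <- ((h - 1) * depth + 1):(h * depth)
    scaled_dot_attention(Q[, cols, drop = FALSE],
                         K[, cols, drop = FALSE],
                         V[, cols, drop = FALSE],
                         mask)
  })
  # Concatenate the per-head outputs column-wise.
  do.call(cbind, heads)
}

set.seed(1)
x <- matrix(rnorm(4 * 8), nrow = 4)  # 4 positions, d_model = 8

# Causal mask: position i may not attend to positions j > i.
causal <- matrix(0, 4, 4)
causal[upper.tri(causal)] <- -Inf

out <- multi_head_sketch(x, x, x, d_model = 8, num_heads = 2, mask = causal)
dim(out)  # 4 rows, 8 columns
```

With the causal mask, the first position can attend only to itself, so the first row of out equals the first row of x. The package's multi_head may slice, project or mask differently; check its source before relying on any of these assumptions.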