sits (version 1.1.0)

.torch_multi_head_attention: Torch module for calculating multi-head attention

Description

To calculate attention for a query, this function takes the dot product of the query with the keys and obtains scores/weights for the values. Each score/weight represents the relevance between the query and the corresponding key. The values are then reweighted by these scores/weights, and the reweighted values are summed to produce the output.

This implementation is based on the code made available by Vivien Sainte Fare Garnot at https://github.com/VSainteuf/lightweight-temporal-attention-pytorch.
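
As a minimal sketch of the mechanism described above, the single-query, single-head case can be written with the torch package for R as follows. All names and dimensions here are illustrative and not taken from the sits internals; the sqrt(d_k) scaling follows the scaled dot-product attention convention.

library(torch)

# Illustrative dimensions: 10 keys/values, key size 32, value size 64
n_keys <- 10
d_k    <- 32
d_in   <- 64

query  <- torch_randn(1, d_k)
keys   <- torch_randn(n_keys, d_k)
values <- torch_randn(n_keys, d_in)

# Dot product of the query with each key gives one score per key
# (scaled by sqrt(d_k), as in scaled dot-product attention)
scores <- torch_matmul(query, keys$t()) / sqrt(d_k)

# Softmax turns the scores into weights expressing each key's relevance
weights <- nnf_softmax(scores, dim = 2)

# Reweight the values and sum them: a 1 x d_in attention output
output <- torch_matmul(weights, values)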

Usage

.torch_multi_head_attention(n_heads, d_k, d_in)

Value

A torch module that computes multi-head attention.

Arguments

n_heads

Number of attention heads.

d_k

Dimension of key tensor.

d_in

Dimension of input values.
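
A hedged construction sketch: because .torch_multi_head_attention is an internal function (note the leading dot), it would be accessed with the ::: operator, and the argument values below are illustrative only.

library(torch)

# Build a multi-head attention module with 16 heads, key dimension 8,
# and input value dimension 256 (illustrative values, not defaults)
attn <- sits:::.torch_multi_head_attention(n_heads = 16, d_k = 8, d_in = 256)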

Author

Charlotte Pelletier, charlotte.pelletier@univ-ubs.fr

Gilberto Camara, gilberto.camara@inpe.br

Rolf Simoes, rolf.simoes@inpe.br

Felipe Souza, lipecaso@gmail.com

References

Vivien Sainte Fare Garnot and Loic Landrieu, "Lightweight Temporal Self-Attention for Classifying Satellite Image Time Series", https://arxiv.org/abs/2007.00586