Defines a torch module for temporal attention encoding.
To compute attention for a query, this module takes the dot product of the query with each key, producing a score (weight) that measures the relevance between the query and that key. The values are then reweighted by these scores, and the reweighted values are summed to produce the output.
This implementation is based on the code made available by Vivien Sainte Fare Garnot at https://github.com/VSainteuf/lightweight-temporal-attention-pytorch.
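The computation described above can be sketched in PyTorch as follows. This is a minimal illustration of scaled dot-product attention, assumed to be equivalent to the module defined here; it is not the package's own implementation.

```python
import torch
import torch.nn as nn


class ScaledDotProductAttention(nn.Module):
    """Minimal sketch of scaled dot-product attention (illustrative only)."""

    def __init__(self, temperature, attn_dropout=0.1):
        super().__init__()
        self.temperature = temperature  # scaling factor, e.g. sqrt(d_k)
        self.dropout = nn.Dropout(attn_dropout)

    def forward(self, q, k, v):
        # Dot product of the query with every key -> relevance scores.
        attn = torch.matmul(q, k.transpose(-2, -1)) / self.temperature
        # Normalize the scores into weights that sum to 1.
        attn = self.dropout(torch.softmax(attn, dim=-1))
        # Reweight the values with the scores and sum them.
        return torch.matmul(attn, v), attn
```

For queries of shape (batch, n_q, d_k), keys (batch, n_k, d_k), and values (batch, n_k, d_v), the output has shape (batch, n_q, d_v) and each row of the attention matrix sums to one.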
.torch_scaled_dot_product_attention(temperature, attn_dropout = 0.1)
A torch module computing scaled dot-product attention.
Scaling factor applied to the dot-product scores, typically the square root of the key dimension.
Dropout rate to be applied to the attention module.
Query tensor.
Key tensor.
Value tensor.
Charlotte Pelletier, charlotte.pelletier@univ-ubs.fr
Gilberto Camara, gilberto.camara@inpe.br
Rolf Simoes, rolf.simoes@inpe.br
Felipe Souza, lipecaso@gmail.com
Vivien Sainte Fare Garnot and Loic Landrieu, "Lightweight Temporal Self-Attention for Classifying Satellite Image Time Series", https://arxiv.org/abs/2007.00586