
ggmlR (version 0.6.1)

ggml_layer_lstm: Add an LSTM Layer

Description

Long Short-Term Memory (LSTM) recurrent layer. Implemented as an unrolled computation graph (backpropagation through time, BPTT), so ggml's automatic differentiation works without any custom C extensions.

Usage

ggml_layer_lstm(
  model,
  units,
  return_sequences = FALSE,
  activation = "tanh",
  recurrent_activation = "sigmoid",
  input_shape = NULL,
  name = NULL,
  trainable = TRUE
)

Value

The updated ggml_sequential_model, or a new ggml_tensor_node when applied to a tensor node (Functional API).

Arguments

model

A ggml_sequential_model or ggml_tensor_node.

units

Integer, number of hidden units.

return_sequences

Logical; if TRUE return all hidden states, otherwise return only the last hidden state.

activation

Activation for the cell gate (default "tanh").

recurrent_activation

Activation for the recurrent step (default "sigmoid").

input_shape

Input shape c(seq_len, input_size); required for the first layer only.

name

Optional layer name.

trainable

Logical; whether the layer's weights are updated during training.

Weight layout

  • W_gates [input_size, 4*units] — input kernel for all four gates (i, f, g, o) concatenated.

  • U_gates [units, 4*units] — recurrent kernel.

  • b_gates [4*units] — bias.
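With this fused layout, a single matrix multiply produces the pre-activations for all four gates at once, which are then sliced in the i, f, g, o order listed above. A minimal base-R sketch of one timestep under that layout (plain matrices rather than ggml tensors; the function and argument names here are illustrative, not the package API):

```r
# One LSTM timestep in base R, mirroring the fused W_gates/U_gates/b_gates layout.
# x_t: [input_size], h_prev/c_prev: [units]. Illustrative only -- not the package API.
lstm_step <- function(x_t, h_prev, c_prev, W_gates, U_gates, b_gates, units) {
  sigmoid <- function(z) 1 / (1 + exp(-z))
  z <- drop(x_t %*% W_gates + h_prev %*% U_gates) + b_gates  # [4*units] pre-activations
  i <- sigmoid(z[1:units])                      # input gate
  f <- sigmoid(z[(units + 1):(2 * units)])      # forget gate
  g <- tanh(z[(2 * units + 1):(3 * units)])     # cell candidate ("cell gate", tanh default)
  o <- sigmoid(z[(3 * units + 1):(4 * units)])  # output gate
  c_t <- f * c_prev + i * g   # new cell state
  h_t <- o * tanh(c_t)        # new hidden state
  list(h = h_t, c = c_t)
}
```

The sigmoid/tanh choices match the `recurrent_activation` and `activation` defaults.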

Input / output shapes

Input: [seq_len, input_size] per sample (R row-major), or a 3-D array [N, seq_len, input_size]. In the Functional API the input node shape should be c(seq_len, input_size).

Output (Sequential): [units] per sample when return_sequences = FALSE (default), or [seq_len, units] when return_sequences = TRUE.
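The two output shapes follow directly from the unrolled recurrence: the loop produces one hidden state per timestep, and the layer either stacks them all or keeps only the last. A self-contained base-R shape check with random weights (illustrative; not the package API):

```r
# Shape check for return_sequences, using random weights in base R (illustrative).
set.seed(1)
n_steps <- 10L; input_size <- 32L; units <- 64L
W <- matrix(rnorm(input_size * 4 * units), input_size)  # W_gates [input_size, 4*units]
U <- matrix(rnorm(units * 4 * units), units)            # U_gates [units, 4*units]
b <- rnorm(4 * units)                                   # b_gates [4*units]
x <- matrix(rnorm(n_steps * input_size), n_steps)       # one sample [seq_len, input_size]
sigmoid <- function(z) 1 / (1 + exp(-z))
h <- c_state <- numeric(units)
H <- matrix(0, n_steps, units)                          # all hidden states
for (t in 1:n_steps) {
  z <- drop(x[t, ] %*% W + h %*% U) + b
  i <- sigmoid(z[1:units]);          f <- sigmoid(z[units + 1:units])
  g <- tanh(z[2 * units + 1:units]); o <- sigmoid(z[3 * units + 1:units])
  c_state <- f * c_state + i * g
  h <- o * tanh(c_state)
  H[t, ] <- h
}
dim(H)     # seq_len x units -- what return_sequences = TRUE yields
length(h)  # units -- the default (last hidden state only)
```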

Examples

# 10 timesteps of 32 features -> 64-unit LSTM (last hidden state) -> 10-class softmax
model <- ggml_model_sequential() |>
  ggml_layer_lstm(64L, input_shape = c(10L, 32L)) |>
  ggml_layer_dense(10L, activation = "softmax")
