ggmlR (version 0.6.1)

ggml_layer_embedding: Add Embedding Layer

Description

Looks up dense vectors for integer token indices. The input must be an integer matrix of 0-based indices in [0, vocab_size - 1] (use ggml_input(shape, dtype = "int32") in Functional mode).
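Because R indexing is 1-based, token ids produced by common R idioms (e.g. factor codes) must be shifted down by one before they are fed to this layer. A minimal sketch (the token data below is illustrative):

```r
# R factor codes are 1-based, but this layer expects 0-based indices,
# so subtract 1L before feeding the ids to the model.
words  <- c("the", "cat", "sat", "the", "mat")
ids_1b <- as.integer(factor(words))   # 1-based codes
ids_0b <- ids_1b - 1L                 # 0-based, in [0, vocab_size - 1]
```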

Usage

ggml_layer_embedding(model, vocab_size, dim, name = NULL, trainable = TRUE)

Value

The model with the embedding layer appended (when `model` is a ggml_sequential_model), or a new ggml_tensor_node (when called on a tensor node in Functional mode).

Arguments

model

A ggml_sequential_model or ggml_tensor_node.

vocab_size

Number of distinct tokens (vocabulary size).

dim

Embedding dimension (vector length per token).

name

Optional layer name.

trainable

Logical; whether embedding weights are updated during training.

Axis order (ggml vs Keras)

ggml stores tensors in column-major order, so the output shape per sample is [dim, seq_len] (ggml convention) rather than [seq_len, dim] as in Keras. When you call ggml_layer_flatten() after embedding, the flattened vector is the same under either convention, but if you access raw output tensors directly, be aware of this transposition.
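The equivalence under flattening can be checked in base R: reading a [dim, seq_len] matrix column by column visits the same values, in the same order, as reading its [seq_len, dim] transpose row by row.

```r
# Two tokens, dim = 3; rows of `keras_style` are per-token embedding
# vectors, i.e. the Keras [seq_len, dim] layout.
keras_style <- matrix(c(1, 2, 3,
                        4, 5, 6), nrow = 2, byrow = TRUE)
# The ggml layout is the transpose: [dim, seq_len].
ggml_style <- t(keras_style)
# Keras flattens row-major (token by token) ...
keras_flat <- as.vector(t(keras_style))
# ... while ggml flattens column-major, which also walks token by token.
ggml_flat <- as.vector(ggml_style)
identical(keras_flat, ggml_flat)  # TRUE
```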

Index validation

Indices must be in [0, vocab_size - 1]. Out-of-range values cause undefined behaviour inside the ggml kernel (no bounds check is performed at the R level).
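Since no bounds check happens at the R level, a small guard before feeding indices to the model is cheap insurance. A sketch (`check_indices` is a hypothetical helper, not part of ggmlR):

```r
# Hypothetical guard: validate 0-based integer indices before they
# reach the ggml kernel, where out-of-range values are undefined.
check_indices <- function(x, vocab_size) {
  stopifnot(is.integer(x), all(x >= 0L), all(x < vocab_size))
  invisible(x)
}
tokens <- matrix(sample.int(1000L, 10L) - 1L, nrow = 1L)  # ids in [0, 999]
check_indices(tokens, vocab_size = 1000L)
```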

Examples

# \donttest{
inp <- ggml_input(shape = 10L, dtype = "int32")
out <- inp |>
  ggml_layer_embedding(vocab_size = 1000L, dim = 32L) |>
  ggml_layer_flatten() |>
  ggml_layer_dense(10L, activation = "softmax")
model <- ggml_model(inputs = inp, outputs = out)
# }