
ggmlR (version 0.6.1)

ggml_diag_mask_inf: Diagonal Mask with -Inf (Graph)

Description

Creates a graph node that sets the elements of a above the diagonal to -Inf. It is used for causal (autoregressive) attention masking.

Usage

ggml_diag_mask_inf(ctx, a, n_past)

Value

A tensor with the same shape as the input, with the elements above the diagonal set to -Inf.

Arguments

ctx

GGML context

a

Input tensor (typically attention scores)

n_past

Number of past tokens; shifts the diagonal by this amount. Use 0 for standard causal masking, where position i can attend only to positions <= i.

Details

In causal attention, each position should attend only to itself and to earlier positions. Setting the scores of future positions to -Inf guarantees that they receive zero attention weight after softmax, because exp(-Inf) = 0.
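The effect of the mask on softmax can be checked in base R without calling ggmlR at all. The sketch below is only a conceptual illustration: it treats rows as query positions and columns as key positions, does not try to reproduce ggml's memory layout, and uses a hypothetical helper row_softmax that is not part of the package.

```r
# Base-R illustration only; nothing here calls ggmlR.
scores <- matrix(1, nrow = 4, ncol = 4)  # rows = query positions, cols = key positions

# Mask entries above the diagonal (key index > query index).
masked <- scores
masked[upper.tri(masked)] <- -Inf

# Hypothetical helper: row-wise softmax with max-subtraction for stability.
# exp(-Inf) is 0, so masked keys end up with zero weight.
row_softmax <- function(m) {
  e <- exp(m - apply(m, 1, max))
  e / rowSums(e)
}

weights <- row_softmax(masked)
print(weights)
```

Each row of weights sums to 1, and every entry above the diagonal is exactly 0, so a query position never draws on a later key position.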

The n_past parameter supports KV-cache scenarios: when n_past tokens have already been processed and cached, the diagonal is shifted by n_past so that new tokens can still attend to every cached position.
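A small base-R sketch may help make the shift concrete. It assumes the masking rule is "mask key k for query q when k > n_past + q", which is a reading of the description above and not a statement taken from the package source. The function causal_mask_sketch is hypothetical and does not call ggmlR.

```r
# Hypothetical helper, base R only: mask key k for query q when k > n_past + q.
# Rows are query positions (new tokens); columns are key positions (cached + new).
causal_mask_sketch <- function(scores, n_past = 0) {
  q <- row(scores)  # 1-based query index of each entry
  k <- col(scores)  # 1-based key index of each entry
  scores[k > n_past + q] <- -Inf
  scores
}

# Two new tokens attending over 3 cached keys plus their own 2 keys.
scores <- matrix(1, nrow = 2, ncol = 5)
masked <- causal_mask_sketch(scores, n_past = 3)
print(masked)
```

With n_past = 3, the first new token can see the three cached keys and itself, but not the second new token; the second new token can see all five keys. With n_past = 0 on a square matrix, the rule reduces to the ordinary upper-triangular mask.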

Examples

# Allocate a GGML context with 16 MB of memory
ctx <- ggml_init(16 * 1024 * 1024)
# Create a 4 x 4 attention scores matrix and fill it with ones
scores <- ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 4, 4)
ggml_set_f32(scores, rep(1, 16))
# Apply the causal mask (n_past = 0: standard causal masking)
masked <- ggml_diag_mask_inf(ctx, scores, 0)
# Build the forward graph and compute it
graph <- ggml_build_forward_expand(ctx, masked)
ggml_graph_compute(ctx, graph)
# Release the context
ggml_free(ctx)
