nn_module_generator() is a generalized function that generates neural network
module expressions for various architectures. It provides a flexible framework for creating
custom neural network modules by parameterizing layer types, construction arguments, and
forward pass behavior.
While designed primarily for {torch} modules, it can work with custom layer implementations
from the current environment, including user-defined layers like RBF networks, custom
attention mechanisms, or other novel architectures.
This function serves as the foundation for specialized generators like ffnn_generator()
and rnn_generator(), but can be used directly to create custom architectures.
Usage:

nn_module_generator(
nn_name = "nnModule",
nn_layer = NULL,
out_nn_layer = NULL,
nn_layer_args = list(),
layer_arg_fn = NULL,
forward_extract = NULL,
before_output_transform = NULL,
after_output_transform = NULL,
last_layer_args = list(),
hd_neurons,
no_x,
no_y,
activations = NULL,
output_activation = NULL,
bias = TRUE,
eval = FALSE,
.env = parent.frame(),
...
)

Value:

If eval = FALSE (default): a language object (unevaluated expression) representing
a torch::nn_module definition. The expression can be evaluated with eval() to
create the module class, which can then be instantiated with eval(result)() to
create a model instance.
If eval = TRUE: An instantiated nn_module class constructor that can be called
directly to create model instances (e.g., result()).
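As a minimal sketch of the two return modes (assuming {torch} is installed and this package is loaded; the argument values are illustrative):

```r
library(torch)

# eval = FALSE (default): returns an unevaluated expression
expr <- nn_module_generator(hd_neurons = c(16, 8), no_x = 4, no_y = 1)
print(expr)           # inspect the generated nn_module definition
Module <- eval(expr)  # evaluate to obtain the module class
model  <- Module()    # instantiate a model

# eval = TRUE: returns the module class directly
Module2 <- nn_module_generator(hd_neurons = c(16, 8), no_x = 4, no_y = 1,
                               eval = TRUE)
model2 <- Module2()
```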
Arguments:

nn_name: Character string specifying the name of the generated neural network
module class. Default is "nnModule".
nn_layer: The type of neural network layer to use. Can be specified as:
- NULL (default): uses nn_linear() from {torch}
- Character string: e.g., "nn_linear", "nn_gru", "nn_lstm", "some_custom_layer"
- Named function: a function object that constructs the layer
- Anonymous function: e.g., \() nn_linear() or function() nn_linear()
The layer constructor is searched for first in the current environment, then in
parent environments, and finally falls back to the {torch} namespace. This allows
you to use custom layer implementations alongside standard torch layers.
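A sketch of the different ways to specify the layer constructor (assuming {torch}; `my_linear` is a hypothetical user-defined layer):

```r
library(torch)

# As a character string:
m1 <- nn_module_generator(nn_layer = "nn_linear",
                          hd_neurons = c(16), no_x = 4, no_y = 1)

# As a function object:
m2 <- nn_module_generator(nn_layer = torch::nn_linear,
                          hd_neurons = c(16), no_x = 4, no_y = 1)

# A custom layer defined in the current environment is found
# before the {torch} namespace is consulted:
my_linear <- function(in_features, out_features, bias = TRUE) {
  torch::nn_linear(in_features, out_features, bias = bias)
}
m3 <- nn_module_generator(nn_layer = "my_linear",
                          hd_neurons = c(16), no_x = 4, no_y = 1)
```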
out_nn_layer: Default NULL. If supplied, forces the specified layer type to be
used for the last (output) layer. Can be specified as:
- Character string: e.g., "nn_linear", "nn_gru", "nn_lstm", "some_custom_layer"
- Named function: a function object that constructs the layer
- Formula interface: e.g., ~torch::nn_linear, ~some_custom_layer
Internally, it is resolved in much the same way as the nn_layer parameter.
nn_layer_args: Named list of additional arguments passed to the layer constructor
specified by nn_layer. These arguments are applied to all layers. For
layer-specific arguments, use layer_arg_fn. Default is an empty list.
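A sketch of passing a shared constructor argument to every layer (assuming {torch}; it is assumed here that nn_layer_args is merged with the per-layer arguments from layer_arg_fn):

```r
library(torch)

# Apply batch_first = TRUE to every nn_gru layer via nn_layer_args:
m <- nn_module_generator(
  nn_layer      = "nn_gru",
  nn_layer_args = list(batch_first = TRUE),
  layer_arg_fn  = ~ list(input_size = .in, hidden_size = .out),
  hd_neurons = c(32), no_x = 8, no_y = 1
)
```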
layer_arg_fn: Optional function or formula that generates layer-specific
construction arguments. Can be specified as:
- Formula: ~ list(input_size = .in, hidden_size = .out), where .in, .out, .i, and
  .is_output are available
- Function: function(i, in_dim, out_dim, is_output), with the same meanings as
  the formula variables
The formula/function should return a named list of arguments to pass to the layer
constructor. Available variables in the formula context:
- .i or i: integer, the layer index (1-based)
- .in or in_dim: integer, input dimension for this layer
- .out or out_dim: integer, output dimension for this layer
- .is_output or is_output: logical, whether this is the final output layer
If NULL, defaults to FFNN-style arguments: list(in_dim, out_dim, bias = bias).
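A sketch of both forms for RNN-style layers, whose constructors take input_size/hidden_size rather than positional in/out arguments (assuming {torch}):

```r
library(torch)

# Formula form:
m <- nn_module_generator(
  nn_layer     = "nn_gru",
  layer_arg_fn = ~ list(input_size = .in, hidden_size = .out,
                        batch_first = TRUE),
  hd_neurons = c(32, 16), no_x = 8, no_y = 1
)

# Equivalent function form:
arg_fn <- function(i, in_dim, out_dim, is_output) {
  list(input_size = in_dim, hidden_size = out_dim, batch_first = TRUE)
}
m2 <- nn_module_generator(
  nn_layer     = "nn_gru",
  layer_arg_fn = arg_fn,
  hd_neurons = c(32, 16), no_x = 8, no_y = 1
)
```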
forward_extract: Optional formula or function that processes layer outputs in the
forward pass. Useful for layers that return complex structures (e.g., RNNs return
list(output, hidden)). Can be specified as:
- Formula: ~ .[[1]] or ~ .$output, where . represents the layer output
- Function: function(expr) that accepts and returns a language object
Common patterns:
- Extract the first element: ~ .[[1]]
- Extract a named element: ~ .$output
- Extract via a method: ~ .$get_output()
If NULL, layer outputs are used directly.
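A sketch of the typical RNN case (assuming {torch}): nn_gru's forward pass returns list(output, hidden_state), so only the output tensor should flow to the next layer.

```r
library(torch)

m <- nn_module_generator(
  nn_layer        = "nn_gru",
  layer_arg_fn    = ~ list(input_size = .in, hidden_size = .out),
  forward_extract = ~ .[[1]],   # keep output, drop hidden state
  hd_neurons = c(32), no_x = 8, no_y = 1
)
```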
before_output_transform: Optional formula or function that transforms the input to
the output layer. It is applied after the last hidden layer (and its activation)
but before the output layer. Can be specified as:
- Formula: ~ .[, .$size(2), ], where . represents the current tensor
- Function: function(expr) that accepts and returns a language object
Common patterns:
- Extract the last timestep: ~ .[, .$size(2), ]
- Flatten: ~ .$flatten(start_dim = 1)
- Global pooling: ~ .$mean(dim = 2)
- Extract a token: ~ .[, 1, ]
If NULL, no transformation is applied.
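A sequence-to-one sketch (assuming {torch}): GRU hidden layers followed by a linear output layer, keeping only the last timestep before the linear projection. Since layer_arg_fn is also applied to the output layer, it branches on .is_output to supply nn_linear's argument names.

```r
library(torch)

m <- nn_module_generator(
  nn_layer     = "nn_gru",
  out_nn_layer = "nn_linear",
  layer_arg_fn = ~ if (.is_output) list(in_features = .in, out_features = .out)
                   else list(input_size = .in, hidden_size = .out,
                             batch_first = TRUE),
  forward_extract         = ~ .[[1]],
  before_output_transform = ~ .[, .$size(2), ],  # last timestep only
  hd_neurons = c(32), no_x = 8, no_y = 1
)
```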
after_output_transform: Optional formula or function that transforms the output of
the output layer. It is applied after self$out(x) (the final layer) but before
the result is returned. Can be specified as:
- Formula: ~ .$mean(dim = 2), where . represents the output tensor
- Function: function(expr) that accepts and returns a language object
Common patterns:
- Global average pooling: ~ .$mean(dim = 2)
- Squeeze dimensions: ~ .$squeeze()
- Reshape output: ~ .$view(c(-1, 10))
- Extract specific outputs: ~ .[, , 1:5]
If NULL, no transformation is applied.
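A sketch of a post-output transform (assuming {torch}): squeeze the trailing singleton dimension of a single-output regression head.

```r
library(torch)

# With no_y = 1 the raw output has a trailing size-1 dimension;
# $squeeze() removes it before the result is returned:
m <- nn_module_generator(
  after_output_transform = ~ .$squeeze(),
  hd_neurons = c(16), no_x = 4, no_y = 1
)
```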
last_layer_args: Optional named list or formula specifying additional arguments for
the output layer only. These arguments are appended to the output layer constructor
after the arguments from layer_arg_fn. Can be specified as:
- Formula: ~ list(kernel_size = 2L, bias = FALSE)
- Named list: list(kernel_size = 2L, bias = FALSE)
This is useful when you need to override or add specific parameters for the final
layer without affecting hidden layers. For example, in CNNs you might want a
different kernel size for the output layer, and in RNNs you might want to disable
bias in the final linear projection. Arguments in last_layer_args override any
conflicting arguments from layer_arg_fn when .is_output = TRUE. Default is an
empty list.
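A minimal sketch of the override behavior (assuming {torch}): the default FFNN-style arguments set bias = TRUE everywhere, and last_layer_args overrides it for the output layer only.

```r
library(torch)

# Hidden layers keep bias = TRUE; the output layer's bias is overridden:
m <- nn_module_generator(
  last_layer_args = list(bias = FALSE),
  hd_neurons = c(16, 8), no_x = 4, no_y = 1
)
```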
hd_neurons: Integer vector specifying the number of neurons (hidden units) in each
hidden layer. Its length determines the number of hidden layers in the network.
Must contain at least one element.
no_x: Integer specifying the number of input features (input dimension).

no_y: Integer specifying the number of output features (output dimension).
activations: Activation function specifications for the hidden layers. Can be:
- NULL: no activation functions are applied
- Character vector: e.g., c("relu", "sigmoid", "tanh")
- activation_spec object: created with act_funs(), which allows specifying custom
  arguments; see the examples
If a single activation is provided, it is replicated across all hidden layers.
Otherwise, the length should match the number of hidden layers.
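A sketch of the character-vector forms (assuming {torch}; the act_funs() form is documented separately):

```r
library(torch)

# One activation recycled over all hidden layers:
m1 <- nn_module_generator(activations = "relu",
                          hd_neurons = c(32, 16, 8), no_x = 4, no_y = 1)

# One activation per hidden layer (length matches length(hd_neurons)):
m2 <- nn_module_generator(activations = c("relu", "tanh", "sigmoid"),
                          hd_neurons = c(32, 16, 8), no_x = 4, no_y = 1)
```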
output_activation: Optional activation function for the output layer. Same format
as activations, but should specify only a single activation. Common choices
include "softmax" for classification or "sigmoid" for binary outcomes. Default is
NULL (no output activation).
bias: Logical indicating whether to include bias terms in layers. Default is TRUE.
Note that this is passed to layer_arg_fn if provided, so custom layer argument
functions should handle this parameter appropriately.
eval: Logical indicating whether to evaluate the generated expression immediately.
If TRUE, returns an instantiated nn_module class that can be called directly
(e.g., model()). If FALSE (default), returns the unevaluated language expression,
which can be inspected or evaluated later with eval().
.env: The environment in which the generated expression is evaluated. Default is
parent.frame().
...: Additional arguments passed to layer constructors or reserved for future
extensions.
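A complete end-to-end sketch of the default FFNN-style workflow (assuming {torch} is installed; "sigmoid" is an illustrative output activation):

```r
library(torch)

# Generate, inspect, evaluate, and instantiate a small feed-forward network:
expr <- nn_module_generator(
  nn_name           = "mlpNet",
  hd_neurons        = c(32, 16),
  no_x              = 8,
  no_y              = 3,
  activations       = "relu",
  output_activation = "sigmoid"
)
print(expr)            # inspect the generated nn_module definition

MlpNet <- eval(expr)   # module class
model  <- MlpNet()     # model instance

x <- torch_randn(5, 8) # batch of 5 samples, 8 features
y <- model(x)
y$shape                # inspect the output shape
```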