Builds and compiles a keras neural network for a single smooth term in a
neuralGAM model.
The network can optionally be configured to output symmetric prediction intervals
(lower bound, upper bound, and mean prediction) using a custom quantile loss
(make_quantile_loss()), or a standard single-output point prediction using
any user-specified loss function.
When uncertainty_method is "aleatoric" or "both", the model outputs three units
corresponding to the lower bound, upper bound, and mean prediction, and is
compiled with make_quantile_loss(alpha, mean_loss, ...). Otherwise, the model
outputs a single unit (point prediction) and is compiled with the loss function supplied in loss.
build_feature_NN(
  num_units,
  learning_rate = 0.001,
  activation = "relu",
  kernel_initializer = "glorot_normal",
  kernel_regularizer = NULL,
  bias_regularizer = NULL,
  bias_initializer = "zeros",
  activity_regularizer = NULL,
  loss = "mse",
  name = NULL,
  alpha = 0.05,
  w_mean = 0.1,
  order_penalty_lambda = 0,
  uncertainty_method = "none",
  dropout_rate = 0.1,
  seed = NULL,
  ...
)

A compiled keras_model object ready for training.
num_units: Integer or vector of integers. Number of units in the hidden layer(s). If a vector is provided, multiple dense layers are added sequentially.
learning_rate: Numeric. Learning rate for the Adam optimizer.
activation: Character string or function. Activation function to use in hidden layers. If a character string, it must be valid for tf$keras$activations$get().
kernel_initializer: Keras initializer object or string. Kernel initializer for dense layers.
kernel_regularizer: Optional Keras regularizer for kernel weights.
bias_regularizer: Optional Keras regularizer for bias terms.
bias_initializer: Keras initializer object or string. Initializer for bias terms.
activity_regularizer: Optional Keras regularizer for layer activations.
loss: Loss function to use. When uncertainty_method is "aleatoric" or "both", this is the mean-head loss inside make_quantile_loss() and can be any Keras built-in loss name (e.g., "mse", "mae", "huber", "logcosh", ...) or a custom function. Otherwise, it is used directly in compile().
name: Optional character string. Name assigned to the model.
alpha: Numeric. Significance level for symmetric prediction intervals. Defaults to 0.05 (i.e., a 95% PI using quantiles alpha/2 and 1 - alpha/2).
w_mean: Non-negative numeric. Weight of the mean-head loss within the composite PI loss.
order_penalty_lambda: Non-negative numeric. Strength of a soft monotonicity penalty ReLU(lwr - upr) that discourages interval inversions.
uncertainty_method: Character string indicating the type of uncertainty to estimate in prediction intervals. Must be one of "none", "aleatoric", "epistemic", or "both".
dropout_rate: Numeric in (0, 1). Dropout rate used when uncertainty_method %in% c("epistemic", "both").
seed: Optional integer. Random seed for reproducibility.
...: Arguments passed on to neuralGAM:
formula: Model formula. Smooth terms must be wrapped in s(...). You can specify per-term NN settings, e.g.: y ~ s(x1, num_units = 1024) + s(x3, num_units = c(1024, 512)).
data: Data frame containing the variables.
family: Response distribution: "gaussian", "binomial", or "poisson".
kernel_initializer, bias_initializer: Initializers for weights and biases.
kernel_regularizer, bias_regularizer, activity_regularizer: Optional Keras regularizers.
forward_passes: Integer. Number of MC-dropout forward passes used when uncertainty_method %in% c("epistemic", "both").
validation_split: Optional fraction of the training data used for validation.
w_train: Optional training weights.
bf_threshold: Convergence criterion of the backfitting algorithm. Defaults to 0.001.
ls_threshold: Convergence criterion of the local scoring algorithm. Defaults to 0.1.
max_iter_backfitting: Integer. Maximum number of iterations of the backfitting algorithm. Defaults to 10.
max_iter_ls: Integer. Maximum number of iterations of the local scoring algorithm. Defaults to 10.
verbose: Verbosity: 0 = silent, 1 = progress messages.
Ines Ortega-Fernandez, Marta Sestelo
Prediction interval mode (uncertainty_method %in% c("aleatoric", "both")):
The output layer has 3 units:
- lwr: lower bound, \(\tau = \alpha/2\)
- upr: upper bound, \(\tau = 1 - \alpha/2\)
- y_hat: mean prediction
The loss function is make_quantile_loss(), which combines two pinball losses
(for the lower and upper quantiles) with the chosen mean-prediction loss and an
optional non-crossing penalty.
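The structure of this composite loss can be sketched in base R on numeric vectors (illustrative only: the helper names pinball and composite_pi_loss are hypothetical, and the packaged make_quantile_loss() operates on Keras tensors rather than plain vectors):

```r
# Pinball (quantile) loss for quantile level tau (Koenker & Bassett, 1978):
# penalizes observations falling on the "wrong" side of the predicted quantile q.
pinball <- function(y, q, tau) {
  mean(pmax(tau * (y - q), (tau - 1) * (y - q)))
}

# Illustrative composite PI loss: pinball terms for the lower and upper bounds,
# a weighted mean-head loss (MSE here, standing in for the user-chosen loss),
# and a soft non-crossing penalty ReLU(lwr - upr) scaled by order_penalty_lambda.
composite_pi_loss <- function(y, lwr, upr, y_hat,
                              alpha = 0.05, w_mean = 0.1,
                              order_penalty_lambda = 0) {
  pinball(y, lwr, alpha / 2) +
    pinball(y, upr, 1 - alpha / 2) +
    w_mean * mean((y - y_hat)^2) +
    order_penalty_lambda * mean(pmax(lwr - upr, 0))
}
```

The ReLU term is zero whenever lwr <= upr, so well-ordered intervals are not penalized.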
Point prediction mode (uncertainty_method %in% c("none", "epistemic")):
The output layer has 1 unit: point prediction only.
The loss function is the one passed in loss.
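A usage sketch of the two modes (argument values are illustrative; a configured Keras/TensorFlow backend is assumed):

```r
library(neuralGAM)  # assumes the package and its Keras backend are installed

# Point-prediction network: one output unit, compiled with the given loss.
nn_point <- build_feature_NN(
  num_units = c(1024, 512),
  learning_rate = 0.001,
  uncertainty_method = "none",
  loss = "mse",
  name = "s_x1"
)

# Prediction-interval network: three output units (lwr, upr, y_hat),
# compiled internally with make_quantile_loss(alpha, mean_loss, ...).
nn_pi <- build_feature_NN(
  num_units = 1024,
  alpha = 0.05,                # 95% PI via quantiles 0.025 and 0.975
  w_mean = 0.1,                # weight of the mean-head loss
  order_penalty_lambda = 1,    # discourage crossed bounds
  uncertainty_method = "aleatoric",
  name = "s_x1_pi"
)
```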
Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv:1412.6980.
Koenker, R., & Bassett Jr, G. (1978). Regression quantiles. Econometrica, 46(1), 33-50.