Learn R Programming

lgspline (version 0.2.0)

lgspline.fit: Low-Level Fitting for Lagrangian Smoothing Splines

Description

The core function for fitting Lagrangian smoothing splines with less user-friendliness.

Usage

lgspline.fit(predictors, y = NULL, standardize_response = TRUE,
             standardize_predictors_for_knots = TRUE,
             standardize_expansions_for_fitting = TRUE, family = gaussian(),
             glm_weight_function = function(mu, y, order_indices, family,
                                            dispersion, observation_weights,
                                            ...) {
               if(any(!is.null(observation_weights))){
                 family$variance(mu) * observation_weights
               } else {
                 family$variance(mu)
               }
             },
             shur_correction_function = function(X, y, B, dispersion, order_list,
                                                 K, family, observation_weights,
                                                 ...) {
               lapply(1:(K+1), function(k) 0)
             },
             need_dispersion_for_estimation = FALSE,
             dispersion_function = function(mu, y, order_indices, family,
                                            observation_weights, ...) { 1 },
             K = NULL, custom_knots = NULL, cluster_on_indicators = FALSE,
             make_partition_list = NULL, previously_tuned_penalties = NULL,
             smoothing_spline_penalty = NULL, opt = TRUE, use_custom_bfgs = TRUE,
             delta = NULL, tol = 10*sqrt(.Machine$double.eps),
             invsoftplus_initial_wiggle = c(-25, 20, -15, -10, -5),
             invsoftplus_initial_flat = c(-14, -7), wiggle_penalty = 2e-07,
             flat_ridge_penalty = 0.5, unique_penalty_per_partition = TRUE,
             unique_penalty_per_predictor = TRUE, meta_penalty = 1e-08,
             predictor_penalties = NULL, partition_penalties = NULL,
             include_quadratic_terms = TRUE, include_cubic_terms = TRUE,
             include_quartic_terms = FALSE, include_2way_interactions = TRUE,
             include_3way_interactions = TRUE,
             include_quadratic_interactions = FALSE,
             offset = c(), just_linear_with_interactions = NULL,
             just_linear_without_interactions = NULL,
             exclude_interactions_for = NULL,
             exclude_these_expansions = NULL, custom_basis_fxn = NULL,
             include_constrain_fitted = TRUE,
             include_constrain_first_deriv = TRUE,
             include_constrain_second_deriv = TRUE,
             include_constrain_interactions = TRUE, cl = NULL, chunk_size = NULL,
             parallel_eigen = TRUE, parallel_trace = FALSE, parallel_aga = FALSE,
             parallel_matmult = FALSE, parallel_unconstrained = FALSE,
             parallel_find_neighbors = FALSE, parallel_penalty = FALSE,
             parallel_make_constraint = FALSE,
             unconstrained_fit_fxn = unconstrained_fit_default,
             keep_weighted_Lambda = FALSE, iterate_tune = TRUE,
             iterate_final_fit = TRUE, blockfit = FALSE,
             qp_score_function = function(X, y, mu, order_list, dispersion,
                                          VhalfInv, observation_weights, ...) {
               if(!is.null(observation_weights)) {
                 crossprod(X, cbind((y - mu)*observation_weights))
               } else {
                 crossprod(X, cbind(y - mu))
               }
             },
             qp_observations = NULL, qp_Amat = NULL, qp_bvec = NULL, qp_meq = 0,
             qp_positive_derivative = FALSE, qp_negative_derivative = FALSE,
             qp_monotonic_increase = FALSE, qp_monotonic_decrease = FALSE,
             qp_range_upper = NULL, qp_range_lower = NULL, qp_Amat_fxn = NULL,
             qp_bvec_fxn = NULL, qp_meq_fxn = NULL, constraint_values = cbind(),
             constraint_vectors = cbind(), return_G = TRUE, return_Ghalf = TRUE,
             return_U = TRUE, estimate_dispersion = TRUE,
             unbias_dispersion = TRUE,
             return_varcovmat = TRUE, custom_penalty_mat = NULL,
             cluster_args = c(custom_centers = NA, nstart = 10),
             dummy_dividor = 1.2345672152894e-22,
             dummy_adder = 2.234567210529e-18, verbose = FALSE,
             verbose_tune = FALSE, expansions_only = FALSE,
             observation_weights = NULL, do_not_cluster_on_these = c(),
             neighbor_tolerance = 1 + 1e-16, no_intercept = FALSE,
             VhalfInv = NULL, Vhalf = NULL, include_warnings = TRUE, ...)

Value

A list containing the fitted model components, forming the core structure used internally by lgspline and its associated methods. This function is primarily intended for internal use or advanced users needing direct access to fitting components. The returned list contains numerous elements, typically including:

y

The original response vector provided.

ytilde

The fitted values on the original response scale.

X

A list, with each element the design matrix (\(\textbf{X}_k\)) for partition k.

A

The constraint matrix (\(\textbf{A}\)) encoding smoothness and any other linear equality constraints.

B

A list of the final fitted coefficient vectors (\(\boldsymbol{\beta}_k\)) for each partition k, on the original predictor/response scale.

B_raw

A list of fitted coefficient vectors on the internally standardized scale used during fitting.

K, p, q, P, N

Key dimensions: number of internal knots (K), basis functions per partition (p), original predictors (q), total coefficients (P), and sample size (N).

penalties

A list containing the final penalty components used (e.g., Lambda, L1, L2, L_predictor_list, L_partition_list). See compute_Lambda.

knot_scale_transf, knot_scale_inv_transf

Functions to transform predictors to/from the scale used for knot placement.

knots

Matrix or vector of knot locations on the original predictor scale (NULL if K=0 or q > 1).

partition_codes

Vector assigning each original observation to a partition.

partition_bounds

Internal representation of partition boundaries.

make_partition_list

List containing centers, knot midpoints, neighbor info, and assignment function from partitioning (NULL if K=0 or 1D). See make_partitions.

knot_expand_function, assign_partition

Internal functions for partitioning data. See knot_expand_list.

predict

The primary function embedded in the object for generating predictions on new data. See predict.lgspline.

family

The family object or custom list used.

estimate_dispersion, unbias_dispersion

Logical flags related to dispersion estimation settings.

sigmasq_tilde

The estimated (or fixed, if estimate_dispersion=FALSE) dispersion parameter \(\tilde{\sigma}^2\).

backtransform_coefficients, forwtransform_coefficients

Functions to convert coefficients between standardized and original scales.

mean_y, sd_y

Mean and standard deviation used for standardizing the response.

og_order, order_list

Information mapping original data order to partitioned order.

constraint_values, constraint_vectors

User-supplied additional linear equality constraints.

expansion_scales

Scaling factors applied to basis expansions during fitting (if standardize_expansions_for_fitting=TRUE).

take_derivative, take_interaction_2ndderivative, get_all_derivatives_insample

Functions related to computing derivatives of the fitted spline. See take_derivative, take_interaction_2ndderivative, make_derivative_matrix.

numerics, power1_cols, ..., nonspline_cols

Integer vectors storing column indices identifying different types of terms in the basis expansion.

return_varcovmat

Logical indicating if variance matrix calculation was requested.

raw_expansion_names

Original generated names for basis expansion columns (before potential renaming if input predictors had names).

std_X, unstd_X

Functions to standardize/unstandardize design matrices according to expansion_scales.

parallel_cluster_supplied

Logical indicating if a parallel cluster was used.

weights

The original observation weights provided (potentially reformatted).

VhalfInv

The fixed \(\mathbf{V}^{-1/2}\) matrix if supplied for GEEs.

quadprog_list

List containing components related to quadratic programming constraints, if used.

G, Ghalf, U

Matrices related to the variance-covariance structure (\(\mathbf{G}\), \(\mathbf{G}^{1/2}\), \(\mathbf{U}\)), returned if requested via corresponding arguments. See compute_G_eigen and get_U.

trace_XUGX

The trace term \(\text{trace}(\mathbf{X}\mathbf{U}\mathbf{G}\mathbf{X}^{T})\), used for effective degrees of freedom. See compute_trace_UGXX_wrapper.

varcovmat

The final variance-covariance matrix of the estimated coefficients, \(\sigma^2 \mathbf{U}\mathbf{G}\), returned if return_varcovmat = TRUE.

Note that the exact components returned depend heavily on the function arguments (e.g., values of return_G, return_varcovmat, etc.). If expansions_only = TRUE, a much smaller list is returned containing only pre-fitting components needed for inspection or setup (see lgspline).

Arguments

predictors

Default: NULL. Numeric matrix or data frame of predictor variables. Supports direct matrix input or formula interface when used with `data` argument. Must contain numeric predictors, with categorical variables pre-converted to numeric indicators.

y

Default: NULL. Numeric response variable vector representing the target/outcome/dependent variable etc. to be modeled.

standardize_response

Default: TRUE. Logical indicator controlling whether the response variable should be centered by mean and scaled by standard deviation before model fitting. When TRUE, tends to improve numerical stability. Only offered for identity link functions (when family$link == 'identity')

standardize_predictors_for_knots

Default: TRUE. Logical flag determining whether predictor variables should be standardized before knot placement. Ensures consistent knot selection across different predictor scales, and that no one predictor dominates in terms of influence on knot placement. For all expansions (x), standardization corresponds to dividing by the difference in 69 and 31st percentiles = x / (quantile(x, 0.69) - quantile(x, 0.31)).

standardize_expansions_for_fitting

Default: TRUE. Logical switch to standardize polynomial basis expansions during model fitting. Provides computational stability during penalty tuning without affecting statistical inference, as design matrices, variance-covariance matrices, and coefficient estimates are systematically backtransformed after fitting to account for the standardization. If TRUE, \(\textbf{U}\) and \(\textbf{G}\) will remain on the transformed scale, and B_raw as returned will correspond to the coefficients fitted on the expansion-standardized scale.

family

Default: gaussian(). Generalized linear model (GLM) distribution family specifying the error distribution and link function for model fitting. Defaults to Gaussian distribution with identity link. Supports custom family specifications, including user-defined link functions and optional custom tuning loss criteria. Minimally requires 1) family name (family) 2) link name (link) 3) linkfun (link function) 4) linkinv (link function inverse) 5) variance (mean variance relationship function, \(\text{Var}(Y|\mu)\)).

glm_weight_function

Default: function that returns family$variance(mu) * observation_weights if weights exist, family$variance(mu) otherwise. Codes the mean-variance relationship of a GLM or GLM-like model, the diagonal \(\textbf{W}\) matrix of \(\textbf{X}^T\textbf{W}\textbf{X}\) that appears in the information. This can be replaced with a user-specified function. It is used for updating \(\textbf{G} = (\textbf{X}^{T}\textbf{W}\textbf{X} + \textbf{L})^{-1}\) after obtaining constrained estimates of coefficients. This is not used for fitting unconstrained models, but for iterating between updates of \(\textbf{U}\), \(\textbf{G}\), and beta coefficients afterwards.

shur_correction_function

Default: function that returns list of zeros. Advanced function for computing Schur complements \(\textbf{S}\) to add to \(\textbf{G}\) to properly account for uncertainty in dispersion or other nuisance parameter estimation. The effective information becomes \(\textbf{G}^* = (\textbf{G}^{-1} + \textbf{S})^{-1}\).

need_dispersion_for_estimation

Default: FALSE. Logical indicator specifying whether a dispersion parameter is required for coefficient estimation. This is not needed for canonical regular exponential family models, but is often needed otherwise (such as fitting Weibull AFT models).

dispersion_function

Default: function that returns 1. Custom function for estimating the dispersion parameter. Unless need_dispersion_for_estimation is TRUE, this will not affect coefficient estimates.

K

Default: NULL. Integer specifying the number of knot locations for spline partitions. This can intuitively be considered the total number of partitions - 1.

custom_knots

Default: NULL. Optional matrix providing user-specified knot locations in 1-D.

cluster_on_indicators

Default: FALSE. Logical flag determining whether indicator variables should be used for clustering knot locations.

make_partition_list

Default: NULL. Optional list allowing direct specification of custom partition assignments. The intent is that the make_partition_list returned by one model can be supplied here to keep the same knot locations for another.

previously_tuned_penalties

Default: NULL. Optional list of pre-computed penalty components from previous model fits.

smoothing_spline_penalty

Default: NULL. Optional custom smoothing spline penalty matrix for fine-tuned complexity control.

opt

Default: TRUE. Logical switch controlling whether model penalties should be automatically optimized via generalized cross-validation. Turn this off if previously_tuned_penalties are supplied AND desired, otherwise, the previously_tuned_penalties will be ignored.

use_custom_bfgs

Default: TRUE. Logical indicator selecting between a native implementation of damped-BFGS optimization method with analytical gradients or base R's BFGS implementation with finite-difference approximation of gradients.

delta

Default: NULL. Numeric pseudocount used for stabilizing optimization in non-identity link function scenarios.

tol

Default: 10*sqrt(.Machine$double.eps). Numeric convergence tolerance controlling the precision of optimization algorithms.

invsoftplus_initial_wiggle

Default: c(-25, 20, -15, -10, -5). Numeric vector of initial grid points for wiggle penalty optimization, specified on the inverse-softplus (\(\text{softplus}(x) = \log(1+e^x)\)) scale.

invsoftplus_initial_flat

Default: c(-7, 0). Numeric vector of initial grid points for ridge penalty optimization, specified on the inverse-softplus (\(\text{softplus}(x) = \log(1+e^x)\)) scale.

wiggle_penalty

Default: 2e-7. Numeric penalty controlling the integrated squared second derivative, governing function smoothness. Applied to both smoothing spline penalty (alone) and is multiplied by flat_ridge_penalty for penalizing linear terms.

flat_ridge_penalty

Default: 0.5. Numeric flat ridge penalty for additional regularization on only intercepts and linear terms (won't affect interactions or quadratic/cubic/quartic terms by default). If custom_penalty_mat is supplied, the penalty will be for the custom penalty matrix instead. This penalty is multiplied with wiggle_penalty to obtain the total ridge penalty - hence, by default, the ridge penalization on linear terms is half of the magnitude of non-linear terms.

unique_penalty_per_partition

Default: TRUE. Logical flag allowing the magnitude of the smoothing spline penalty to differ across partition.

unique_penalty_per_predictor

Default: TRUE. Logical flag allowing the magnitude of the smoothing spline penalty to differ between predictors.

meta_penalty

Default: 1e-8. Numeric "meta-penalty" applied to predictor and partition penalties during tuning. The minimization of GCV is modified to be a penalized minimization problem, with penalty \(0.5 \times \text{meta\_penalty} \times (\sum \log(\text{penalty}))^2\), such that penalties are pulled towards 1 on the absolute scale and thus, their multiplicative effect towards 0.

predictor_penalties

Default: NULL. Optional list of custom penalties specified per predictor.

partition_penalties

Default: NULL. Optional list of custom penalties specified per partition.

include_quadratic_terms

Default: TRUE. Logical switch to include squared predictor terms in basis expansions.

include_cubic_terms

Default: TRUE. Logical switch to include cubic predictor terms in basis expansions.

include_quartic_terms

Default: NULL. Logical switch to include quartic predictor terms in basis expansions. This is highly recommended for fitting models with multiple predictors to avoid over-specified constraints. When NULL (by default), will internally set to FALSE if only one predictor present, and TRUE otherwise.

include_2way_interactions

Default: TRUE. Logical switch to include linear two-way interactions between predictors.

include_3way_interactions

Default: TRUE. Logical switch to include three-way interactions between predictors.

include_quadratic_interactions

Default: FALSE. Logical switch to include linear-quadratic interaction terms.

offset

Default: Empty vector. When non-missing, this is a vector of column indices/names to include as offsets. lgspline will automatically introduce constraints such that the coefficient for offset terms are 1.

just_linear_with_interactions

Default: NULL. Integer vector specifying columns to retain linear terms with interactions.

just_linear_without_interactions

Default: NULL. Integer vector specifying columns to retain only linear terms without interactions.

exclude_interactions_for

Default: NULL. Integer vector indicating columns to exclude from all interaction terms.

exclude_these_expansions

Default: NULL. Character vector specifying basis expansions to be excluded from the model. These must be named columns of the data, or in the form "_1_", "_2_", "_1_x_2_", "_2_^2" etc. where "1" and "2" indicate column indices of predictor matrix input.

custom_basis_fxn

Default: NULL. Optional user-defined function for generating custom basis expansions. See get_polynomial_expansions.

include_constrain_fitted

Default: TRUE. Logical switch to constrain fitted values at knot points.

include_constrain_first_deriv

Default: TRUE. Logical switch to constrain first derivatives at knot points.

include_constrain_second_deriv

Default: TRUE. Logical switch to constrain second derivatives at knot points.

include_constrain_interactions

Default: TRUE. Logical switch to constrain interaction terms at knot points.

cl

Default: NULL. Parallel processing cluster object for distributed computation (use parallel::makeCluster()).

chunk_size

Default: NULL. Integer specifying custom fixed chunk size for parallel processing.

parallel_eigen

Default: TRUE. Logical flag to enable parallel processing for eigenvalue decomposition computations.

parallel_trace

Default: FALSE. Logical flag to enable parallel processing for trace computation.

parallel_aga

Default: FALSE. Logical flag to enable parallel processing for specific matrix operations involving G and A.

parallel_matmult

Default: FALSE. Logical flag to enable parallel processing for block-diagonal matrix multiplication.

parallel_unconstrained

Default: TRUE. Logical flag to enable parallel processing for unconstrained maximum likelihood estimation.

parallel_find_neighbors

Default: FALSE. Logical flag to enable parallel processing for neighbor identification (which partitions are neighbors).

parallel_penalty

Default: FALSE. Logical flag to enable parallel processing for penalty matrix construction.

parallel_make_constraint

Default: FALSE. Logical flag to enable parallel processing for constraint matrix generation.

unconstrained_fit_fxn

Default: unconstrained_fit_default. Custom function for fitting unconstrained models per partition.

keep_weighted_Lambda

Default: FALSE. Logical flag to retain generalized linear model weights in penalty constraints using Tikhonov parameterization. It is advised to turn this to TRUE when fitting non-canonical GLMs. The default unconstrained_fit_default by default assumes canonical GLMs for setting up estimating equations; this is not valid with non-canonical GLMs. With keep_weighted_Lambda = TRUE, the Tikhonov parameterization binds \(\boldsymbol{\Lambda}^{1/2}\), the square-root penalty matrix, to the design matrix \(\textbf{X}_k\) for each partition k, and family$linkinv(0) to the response vector \(\textbf{y}_k\) for each partition before finding unconstrained estimates using base R's glm.fit function. The potential issue is that the weights of the information matrix will appear in the penalty, such that the effective penalty is \(\boldsymbol{\Lambda}_\text{eff} = \textbf{L}^{1/2}\textbf{W}\textbf{L}^{1/2}\) rather than just \(\textbf{L}^{1/2}\textbf{L}^{1/2}\). If FALSE, this approach will only be used to supply initial values to a native implementation of damped Newton-Rapshon for fitting GLM models (see damped_newton_r and unconstrained_fit_default). For Gamma with log-link, this is fortunately a non-issue since the mean-variance relationship is essentially stabilized, so keep_weighted_Lambda = TRUE is strongly advised.

iterate_tune

Default: TRUE. Logical switch to use iterative optimization during penalty tuning. If FALSE, \(\textbf{G}\) and \(\textbf{U}\) are constructed from unconstrained \(\boldsymbol{\beta}\) estimates when tuning.

iterate_final_fit

Default: TRUE. Logical switch to use iterative optimization for final model fitting. If FALSE, \(\textbf{G}\) and \(\textbf{U}\) are constructed from unconstrained \(\boldsymbol{\beta}\) estimates when fitting the final model after tuning.

blockfit

Default: FALSE. Logical switch to abandon per-partition fitting for non-spline effects without interactions, collapse all matrices into block-diagonal single-matrix form, and fit agnostic to partition. This would be more efficient for many non-spline effects without interactions and relatively few spline effects or non-spline effects with interactions. Ignored if length(just_linear_without_interactions) = 0 after processing formulas and input.

qp_score_function

Default: \(\textbf{X}^{T}(\textbf{y} - \text{E}[\textbf{y}])\), where \(\text{E}[\textbf{y}] = \boldsymbol{\mu}\). A function returning the score of the log-likelihood for optimization (excluding penalization/priors involving \(\boldsymbol{\Lambda}\)), which is needed for the formulation of quadratic programming problems, when blockfit = TRUE, and correlation-structure fitting for GLMs, all relying on solve.QP. Accepts arguments "X, y, mu, order_list, dispersion, VhalfInv, observation_weights, ..." in order. As shown in the examples below, a gamma log-link model requires \(\textbf{X}^{T}\textbf{W}(\textbf{y} - \text{E}[\textbf{y}])\) instead, with \(\textbf{W}\) a diagonal matrix of \(\text{E}[\textbf{y}]^2\) (Note: This example might be incorrect; check the specific score equation for Gamma log-link). This argument is not needed when fitting non-canonical GLMs without quadratic programming constraints or correlation structures, situations for which keep_weighted_Lambda=TRUE is sufficient.

qp_observations

Default: NULL. Numeric vector of observations to apply constraints to for monotonic and range quadratic programming constraints. Useful for saving computational resources.

qp_Amat

Default: NULL. Constraint matrix for quadratic programming formulation. The Amat argument of solve.QP.

qp_bvec

Default: NULL. Constraint vector for quadratic programming formulation. The bvec argument of solve.QP.

qp_meq

Default: 0. Number of equality constraints in quadratic programming setup. The meq argument of solve.QP.

qp_positive_derivative, qp_monotonic_increase

Default: FALSE. Logical flags to constrain the function to have positive first derivatives/be monotonically increasing using quadratic programming with respect to the order (ascending rows) of the input data set.

qp_negative_derivative, qp_monotonic_decrease

Default: FALSE. Logical flags to constrain the function to have negative first derivatives/be monotonically decreasing using quadratic programming with respect to the order (ascending rows) of the input data set.

qp_range_upper

Default: NULL. Numeric upper bound for constrained fitted values using quadratic programming.

qp_range_lower

Default: NULL. Numeric lower bound for constrained fitted values using quadratic programming.

qp_Amat_fxn

Default: NULL. Custom function for generating Amat matrix in quadratic programming.

qp_bvec_fxn

Default: NULL. Custom function for generating bvec vector in quadratic programming.

qp_meq_fxn

Default: NULL. Custom function for determining meq equality constraints in quadratic programming.

constraint_values

Default: cbind(). Matrix of constraint values for sum constraints. The constraint enforces \(\textbf{C}^T(\boldsymbol{\beta} - \textbf{c}) = \boldsymbol{0}\) in addition to smoothing constraints, where \(\textbf{C}\) = constraint_vectors and \(\textbf{c}\) = constraint_values.

constraint_vectors

Default: cbind(). Matrix of vectors for sum constraints. The constraint enforces \(\textbf{C}^T(\boldsymbol{\beta} - \textbf{c}) = \boldsymbol{0}\) in addition to smoothing constraints, where \(\textbf{C}\) = constraint_vectors and \(\textbf{c}\) = constraint_values.

return_G

Default: TRUE. Logical switch to return the unscaled variance-covariance matrix without smoothing constraints (\(\textbf{G}\)).

return_Ghalf

Default: TRUE. Logical switch to return the matrix square root of the unscaled variance-covariance matrix without smoothing constraints (\(\textbf{G}^{1/2}\)).

return_U

Default: TRUE. Logical switch to return the constraint projection matrix \(\textbf{U}\).

estimate_dispersion

Default: TRUE. Logical flag to estimate the dispersion parameter after fitting.

unbias_dispersion

Default NULL. Logical switch to multiply final dispersion estimates by \(N/(N-\text{trace}(\textbf{X}\textbf{U}\textbf{G}\textbf{X}^{T}))\), which in the case of Gaussian-distributed errors with identity link function, provides unbiased estimates of variance. When NULL (by default), gets set to TRUE for Gaussian + identity link and FALSE otherwise.

return_varcovmat

Default: TRUE. Logical switch to return the variance-covariance matrix of the estimated coefficients. This is needed for performing Wald inference.

custom_penalty_mat

Default: NULL. Optional \(p \times p\) custom penalty matrix for individual partitions to replace the default ridge penalty applied to linear-and-intercept terms only. This can be interpreted as proportional to the prior correlation matrix of coefficients for non-spline effects, and will appear in the penalty matrix for all partitions. It is recommended to first run the function using expansions_only = TRUE so you have an idea of where the expansions appear in each partition, what "p" is, and you can carefully customize your penalty matrix after.

cluster_args

Default: c(custom_centers = NA, nstart = 10). Named vector of arguments controlling clustering procedures. If the first argument is not NA, this will be treated as custom cluster centers and all other arguments ignored. Otherwise, default base R k-means clustering will be used with all other arguments supplied to kmeans (for example, by default, the "nstart" argument as provided). Custom centers must be a \(K \times q\) matrix with one column for each predictor in order of their appearance in input predictor/data, and one row for each center.

dummy_dividor

Default: 0.00000000000000000000012345672152894. Small numeric constant to prevent division by zero in computational routines.

dummy_adder

Default: 0.000000000000000002234567210529. Small numeric constant to prevent division by zero in computational routines.

verbose

Default: FALSE. Logical flag to print general progress messages during model fitting (does not include during tuning).

verbose_tune

Default: FALSE. Logical flag to print detailed progress messages during penalty tuning specifically.

expansions_only

Default: FALSE. Logical switch to return only basis expansions without full model fitting. Useful for setting up custom constraints and penalties.

observation_weights

Default: NULL. Numeric vector of observation-specific weights for generalized least squares estimation.

do_not_cluster_on_these

Default: c(). Vector specifying predictor columns to exclude from clustering procedures, in addition to the non-spline effects by default.

neighbor_tolerance

Default: 1 + 1e-8. Numeric tolerance for determining neighboring partitions using k-means clustering. Greater values means more partitions are likely to be considered neighbors. Intended for internal use only (modify at your own risk!).

no_intercept

Default: FALSE. Logical flag to remove intercept, constraining it to 0. The function automatically constructs constraint_vectors and constraint_values to achieve this. Calling formulas with a "0+" in it like y ~ 0 + . will set this option to TRUE.

VhalfInv

Default: NULL. Matrix representing a fixed, custom square-root-inverse covariance structure for the response variable of longitudinal and spatial modeling. Must be an \(N \times N\) matrix where N is number of observations. This matrix \(\textbf{V}^{-1/2}\) serves as a fixed transformation matrix for the response, equivalent to GLS with known covariance \(\textbf{V}\). This is known as "whitening" in some literature.

Vhalf

Default: NULL. Matrix representing a fixed, custom square-root covariance structure for the response variable of longitudinal and spatial modeling. Must be an \(N \times N\) matrix where N is number of observations. This matrix \(\textbf{V}^{1/2}\) is used when backtransforming coefficients for fitting GLMs with arbitrary correlation structure.

include_warnings

Default: TRUE. Logical switch to control display of warning messages during model fitting.

...

Additional arguments passed to the unconstrained model fitting function.