This function validates the input parameters required for initializing a catalytic Generalized Linear Model (GLM). It checks that the formula, family, data, and additional parameters are well-formed and compatible with one another before any further modeling proceeds.
validate_glm_initialization_input(
  formula,
  family,
  data,
  syn_size,
  custom_variance,
  gaussian_known_variance,
  x_degree
)
Returns nothing if all checks pass; otherwise, raises an error or warning.
formula: A formula object specifying the stats::glm model to be fitted. It must not contain random effects or survival terms.
family: A character or family object specifying the error distribution and link function. Valid values are "binomial" and "gaussian".
data: A data.frame containing the data to be used in the GLM.
syn_size: A positive integer specifying the sample size used for the synthetic data.
custom_variance: A positive numeric value for the custom variance used in the model (only applicable for Gaussian family).
gaussian_known_variance: A logical indicating whether the variance is known for the Gaussian family.
x_degree: A numeric vector specifying the degree of the predictors. Its length should match the number of predictors (excluding the response variable).
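As a rough illustration of inputs that fit the argument descriptions above, the following base R sketch builds a small data set and matching values. The use of mtcars, the chosen columns, and the degree of 1 for every predictor are assumptions made for the example; the validation function itself is not called here.

```r
# Illustrative inputs only; the data set and all values are assumptions.
data <- mtcars[, c("mpg", "wt", "hp")]   # response plus two predictors
formula <- mpg ~ wt + hp                 # no random-effect or survival terms
family <- "gaussian"                     # one of "binomial" or "gaussian"

syn_size <- 50L                          # positive integer
custom_variance <- 1.5                   # positive; Gaussian family only
gaussian_known_variance <- TRUE          # logical flag

# One degree per predictor, i.e. every column except the response.
n_predictors <- ncol(data) - 1
x_degree <- rep(1, n_predictors)
```

Here x_degree has length 2 because data holds one response column and two predictor columns.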
This function performs the following checks:
Ensures that syn_size, custom_variance, and x_degree are positive values.
Verifies that the provided formula is suitable for GLMs, ensuring no random effects or survival terms.
Checks that the provided data is a data.frame.
Confirms that the formula does not contain too many terms relative to the number of columns in data.
Ensures that the family is either "binomial" or "gaussian".
Validates that x_degree has the correct length relative to the number of predictors in data.
Warns if syn_size is too small relative to the number of columns in data.
Issues a warning if custom_variance or gaussian_known_variance is supplied with a family other than "gaussian".
If any of these conditions are not met, the function raises an error or warning to guide the user.
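The checks listed above can be sketched in base R roughly as follows. This is a hypothetical re-implementation for illustration, not the package's source: the function name sketch_validate_glm_input, the default values, the messages, and the exact thresholds (for example, comparing syn_size and the number of formula variables against ncol(data)) are all assumptions.

```r
# Hypothetical sketch of the documented checks; NOT the package's own code.
sketch_validate_glm_input <- function(formula, family, data,
                                      syn_size = NULL,
                                      custom_variance = NULL,
                                      gaussian_known_variance = FALSE,
                                      x_degree = NULL) {
  # Positivity of the numeric arguments that were supplied.
  numeric_args <- list(syn_size = syn_size,
                       custom_variance = custom_variance,
                       x_degree = x_degree)
  for (name in names(numeric_args)) {
    value <- numeric_args[[name]]
    if (!is.null(value) &&
        !(is.numeric(value) && !anyNA(value) && all(value > 0))) {
      stop("`", name, "` must contain only positive values.", call. = FALSE)
    }
  }

  # Formula must be a plain GLM formula: no `|` terms and no Surv() response.
  if (!inherits(formula, "formula")) {
    stop("`formula` must be a formula object.", call. = FALSE)
  }
  if (any(c("|", "||", "Surv") %in% all.names(formula))) {
    stop("`formula` contains random-effect or survival terms.", call. = FALSE)
  }

  if (!is.data.frame(data)) {
    stop("`data` must be a data.frame.", call. = FALSE)
  }

  # Assumed rule: the formula may not use more variables than data has columns.
  if (length(all.vars(formula)) > ncol(data)) {
    stop("`formula` has too many terms for `data`.", call. = FALSE)
  }

  # Accept a name, a family function such as gaussian, or a family object.
  if (is.function(family)) family <- family()
  family_name <- if (is.character(family)) family else family$family
  if (!isTRUE(family_name %in% c("binomial", "gaussian"))) {
    stop("`family` must be \"binomial\" or \"gaussian\".", call. = FALSE)
  }

  # One degree per predictor, assuming data holds the response plus predictors.
  if (!is.null(x_degree) && length(x_degree) != ncol(data) - 1) {
    stop("`x_degree` must have one entry per predictor.", call. = FALSE)
  }

  # Assumed threshold for a "too small" synthetic sample size.
  if (!is.null(syn_size) && syn_size < ncol(data)) {
    warning("`syn_size` is small relative to ncol(data).", call. = FALSE)
  }

  # Variance arguments only make sense for the Gaussian family.
  if (family_name != "gaussian") {
    if (!is.null(custom_variance)) {
      warning("`custom_variance` is ignored for this family.", call. = FALSE)
    }
    if (isTRUE(gaussian_known_variance)) {
      warning("`gaussian_known_variance` is ignored for this family.",
              call. = FALSE)
    }
  }

  invisible(NULL)
}
```

Under these assumptions, a call such as sketch_validate_glm_input(mpg ~ wt + hp, "gaussian", mtcars[, c("mpg", "wt", "hp")], syn_size = 50, x_degree = c(1, 1)) returns invisibly without any message, while passing family = "poisson" stops with an error and passing custom_variance together with family = "binomial" produces a warning.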