This function validates the input parameters required for initializing a catalytic Generalized Linear Model (GLM). It ensures the appropriate structure and compatibility of the formula, family, data, and additional parameters before proceeding with further modeling.
validate_glm_initialization_input(
formula,
family,
data,
syn_size,
custom_variance,
gaussian_known_variance,
x_degree
)

Returns nothing if all checks pass; otherwise, it raises an error or issues a warning.
formula: A formula object specifying the stats::glm model to be fitted. It must not contain random effects or survival terms.
family: A character string or family object specifying the error distribution and link function. Valid values are "binomial" and "gaussian".
data: A data.frame containing the data to be used in the GLM.
syn_size: A positive integer specifying the sample size used for the synthetic data.
custom_variance: A positive numeric value for the custom variance used in the model (only applicable for the Gaussian family).
gaussian_known_variance: A logical indicating whether the variance is known for the Gaussian family.
x_degree: A numeric vector specifying the degree of each predictor. Its length should match the number of predictors (excluding the response variable).
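To make the argument shapes concrete, the following sketch builds one plausible set of inputs from the built-in mtcars data set and checks them with base R only. The specific values (the formula, a syn_size of 100, degrees of 1) are assumptions chosen for illustration, not defaults of the package, and the checks are a simplification of the descriptions above.

```r
# Illustrative inputs only; all values are assumptions, not package defaults.
data     <- mtcars[, c("am", "wt", "hp")]  # response `am` plus two predictors
formula  <- am ~ wt + hp                   # plain stats::glm-style formula
family   <- "binomial"                     # "binomial" or "gaussian"
syn_size <- 100L                           # positive integer
x_degree <- rep(1, ncol(data) - 1)         # one degree per predictor

# Base-R checks mirroring the argument descriptions
stopifnot(
  inherits(formula, "formula"),
  is.data.frame(data),
  family %in% c("binomial", "gaussian"),
  syn_size > 0,
  all(x_degree > 0),
  length(x_degree) == ncol(data) - 1
)
```

Here x_degree has length 2 because data has three columns, one of which is the response.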
This function performs the following checks:
Ensures that syn_size, custom_variance, and x_degree are positive values.
Verifies that the provided formula is suitable for GLMs, ensuring no random effects or survival terms.
Checks that the provided data is a data.frame.
Confirms that the formula does not contain too many terms relative to the number of columns in data.
Ensures that the family is either "binomial" or "gaussian".
Validates that x_degree has the correct length relative to the number of predictors in data.
Warns if syn_size is too small relative to the number of columns in data.
Issues warnings if custom_variance or gaussian_known_variance are used with incompatible families.
If any of these conditions are not met, the function raises an error or warning to guide the user.
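The checks listed above can be approximated in base R as follows. This is a minimal sketch, not the package's implementation: the function name check_glm_inputs_sketch, the default argument values, the error messages, the text-based screen for random-effects and survival terms, and the thresholds used for "too many terms" and "too small" are all assumptions made for illustration.

```r
# Hypothetical re-implementation for illustration; NOT the package's code.
check_glm_inputs_sketch <- function(formula, family, data,
                                    syn_size = NULL,
                                    custom_variance = NULL,
                                    gaussian_known_variance = FALSE,
                                    x_degree = NULL) {
  # Positivity checks (skipped for arguments that were not supplied)
  positive_args <- list(syn_size = syn_size,
                        custom_variance = custom_variance,
                        x_degree = x_degree)
  for (nm in names(positive_args)) {
    val <- positive_args[[nm]]
    if (!is.null(val) && (!is.numeric(val) || any(val <= 0))) {
      stop(nm, " must contain only positive values.", call. = FALSE)
    }
  }

  # Formula must be a formula without random-effects bars or Surv() terms.
  # Assumption: a crude text screen; a real check may be stricter.
  if (!inherits(formula, "formula")) {
    stop("formula must be a formula object.", call. = FALSE)
  }
  formula_text <- paste(deparse(formula), collapse = " ")
  if (grepl("|", formula_text, fixed = TRUE) ||
      grepl("Surv(", formula_text, fixed = TRUE)) {
    stop("formula must not contain random effects or survival terms.",
         call. = FALSE)
  }

  if (!is.data.frame(data)) {
    stop("data must be a data.frame.", call. = FALSE)
  }

  # Assumed rule: the number of terms must be below the number of columns
  n_terms <- length(attr(stats::terms(formula, data = data), "term.labels"))
  if (n_terms >= ncol(data)) {
    stop("formula has too many terms for the columns in data.", call. = FALSE)
  }

  # Accept a family object or a single character string
  family_name <- if (inherits(family, "family")) family$family else family
  if (!is.character(family_name) || length(family_name) != 1 ||
      !family_name %in% c("binomial", "gaussian")) {
    stop("family must be \"binomial\" or \"gaussian\".", call. = FALSE)
  }

  # One degree per predictor (all columns except the response)
  if (!is.null(x_degree) && length(x_degree) != ncol(data) - 1) {
    stop("x_degree must have one entry per predictor.", call. = FALSE)
  }

  # Assumed threshold: warn when syn_size is below the column count
  if (!is.null(syn_size) && syn_size < ncol(data)) {
    warning("syn_size is small relative to the number of columns in data.",
            call. = FALSE)
  }

  # Gaussian-only options supplied with another family
  if (family_name != "gaussian" &&
      (!is.null(custom_variance) || isTRUE(gaussian_known_variance))) {
    warning("custom_variance and gaussian_known_variance apply only to ",
            "the gaussian family.", call. = FALSE)
  }

  invisible(NULL)
}

# Usage: passes silently for compatible inputs
d <- mtcars[, c("am", "wt", "hp")]
check_glm_inputs_sketch(am ~ wt + hp, "binomial", d,
                        syn_size = 100, x_degree = c(1, 1))
```

In this sketch, structural problems stop execution while questionable but workable settings only warn, which matches the error-or-warning behavior described above.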