Implements the SIC \(\epsilon\)-telescope method using either single parameter or multiparameter regression. Returns estimated coefficients, estimated standard errors and the value of the penalized likelihood function. Note that the function scales the predictors to have unit variance; however, the final estimates are converted back to their original scale.
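The scaling and back-conversion described above can be illustrated with a small base R sketch. It uses ordinary least squares via `lm` as a stand-in for the penalized fit, and the simulated data and variable names are hypothetical. The only point is that a slope fitted on a unit-variance predictor returns to the original scale when divided by that predictor's standard deviation.

```r
# Illustrative only: lm() stands in for the penalized fit; data are simulated.
set.seed(1)
n <- 200
x <- rnorm(n, mean = 5, sd = 3)
y <- 2 + 0.5 * x + rnorm(n)

# Scale the predictor to unit variance (no centering; this is an assumption).
sd_x <- sd(x)
x_scaled <- x / sd_x

fit_scaled <- lm(y ~ x_scaled)
fit_orig <- lm(y ~ x)

# Convert the slope back to the original scale of x.
slope_back <- unname(coef(fit_scaled)["x_scaled"]) / sd_x

# The back-converted slope agrees with the slope from the unscaled fit.
all.equal(slope_back, unname(coef(fit_orig)["x"]))
```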
smoothic(
  formula,
  data,
  family = "sgnd",
  model = "mpr",
  lambda = "log(n)",
  epsilon_1 = 10,
  epsilon_T = 1e-04,
  steps_T = 100,
  zero_tol = 1e-05,
  max_it = 10000,
  kappa,
  tau,
  max_it_vec,
  stepmax_nlm
)
A list with estimates and estimated standard errors.
coefficients - vector of coefficients.
see - vector of estimated standard errors.
model - the matched type of model which is called.
plike - value of the penalized likelihood function.
kappa - value of the estimated/fixed shape parameter kappa if family = "sgnd".
formula - An object of class "formula": a two-sided object
with the response on the left hand side and the model variables on the right hand side.
data - A data frame containing the variables in the model; the data frame should be unstandardized.
family - The family of the model. The default is family = "sgnd" for the
"Smooth Generalized Normal Distribution", where the shape parameter kappa is also
estimated. Classical regression with normally distributed errors is performed
when family = "normal". family = "laplace" corresponds to
a robust regression with errors from a Laplace-like distribution; in this case
the default value of tau = 0.15 is used to approximate the absolute value
in the Laplace density function.
model - The type of regression to be implemented, either model = "mpr"
for multiparameter regression (i.e., location and scale), or model = "spr" for single parameter
regression (i.e., location only). Defaults to model = "mpr".
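To make the distinction between location-only and location-and-scale regression concrete, the following base R sketch fits an unpenalized normal model in which both the mean and the log-variance depend on a covariate. It is not the code used by smoothic: there is no SIC penalty or \(\epsilon\)-telescope, and the simulated data, the function name `negloglik_mpr` and the parameter names are hypothetical.

```r
set.seed(1)
n <- 500
x <- rnorm(n)
# Simulated data: the mean and the log-variance both depend on x.
y <- 1 + 2 * x + rnorm(n, sd = sqrt(exp(0 + 1 * x)))

# Negative log-likelihood (up to a constant) for a normal model with
# location beta0 + beta1 * x and log-variance alpha0 + alpha1 * x.
negloglik_mpr <- function(par) {
  mu <- par[1] + par[2] * x
  log_var <- par[3] + par[4] * x
  0.5 * sum(log_var + (y - mu)^2 / exp(log_var))
}

# Start from the least squares fit with a constant variance.
ols <- lm(y ~ x)
start <- c(unname(coef(ols)), log(mean(resid(ols)^2)), 0)

fit_mpr <- nlm(negloglik_mpr, start)
est_mpr <- setNames(fit_mpr$estimate, c("beta0", "beta1", "alpha0", "alpha1"))
est_mpr
```

In this sketch a single parameter (location-only) fit would correspond to holding `alpha1` at zero, so that only the mean depends on `x`.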
lambda - Value of the penalty tuning parameter. Suggested values are
"log(n)" and "2" for the BIC and AIC respectively. Defaults to
lambda = "log(n)" for the BIC case. This is evaluated as an R expression, so it may
be a number or some function of n.
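Because the penalty value is supplied as a string and evaluated as an R expression, it can help to see what such an evaluation might look like. The helper `eval_lambda` below is hypothetical and is not part of the package; it only shows one plausible way a string such as "log(n)" could be turned into a number once the sample size n is known.

```r
# Hypothetical helper, not part of the package: evaluate a penalty
# expression given as a string, with n bound to the sample size.
eval_lambda <- function(lambda, n) {
  eval(parse(text = lambda), envir = list(n = n))
}

eval_lambda("log(n)", n = 100)   # BIC-type penalty
eval_lambda("2", n = 100)        # AIC-type penalty
eval_lambda("sqrt(n)", n = 100)  # some other function of n
```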
epsilon_1 - Starting value for the \(\epsilon\)-telescope. Defaults to 10.
epsilon_T - Final value for the \(\epsilon\)-telescope. Defaults to 1e-04.
steps_T - Number of steps in the \(\epsilon\)-telescope. Defaults to 100; must be greater than or equal to 10.
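The three telescope settings (starting value, final value and number of steps) define a decreasing sequence of \(\epsilon\) values. The sketch below is an illustration under two stated assumptions: that the sequence is geometrically spaced between the end points, and that the penalty uses a smooth approximation to the indicator of a nonzero coefficient of the form theta^2 / (theta^2 + epsilon^2). The names `eps_seq` and `smooth_l0` are hypothetical.

```r
epsilon_1 <- 10
epsilon_T <- 1e-04
steps_T <- 100

# Assumption: geometric (log-linear) spacing between the two end points.
eps_seq <- exp(seq(log(epsilon_1), log(epsilon_T), length.out = steps_T))

# Assumption: smooth approximation to the indicator of a nonzero coefficient.
smooth_l0 <- function(theta, epsilon) theta^2 / (theta^2 + epsilon^2)

theta <- 0.5
approx_start <- smooth_l0(theta, eps_seq[1])        # far from 1 for large epsilon
approx_end <- smooth_l0(theta, eps_seq[steps_T])    # close to 1 for small epsilon
c(approx_start, approx_end)
```

Starting with a large \(\epsilon\) gives a very smooth objective, and shrinking it step by step moves the penalty towards counting nonzero coefficients.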
zero_tol - Coefficients below this value are treated as being zero.
Defaults to 1e-05.
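As a small illustration of the zero tolerance, the sketch below thresholds a made-up coefficient vector. It assumes the comparison is made on absolute values, which the description does not state explicitly.

```r
zero_tol <- 1e-05

# Made-up estimates for illustration only.
est <- c(intercept = 1.2, x1 = 3e-07, x2 = -0.8, x3 = -4e-06)

# Assumption: coefficients are compared to zero_tol in absolute value.
est[abs(est) < zero_tol] <- 0
est
```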
max_it - Maximum number of iterations to be performed before the
optimization is terminated. Defaults to 1e+04.
kappa - Optional user-supplied positive kappa value (> 0.2 to avoid
computational issues) if family = "sgnd". If supplied, the shape parameter
kappa will be fixed to this value in the optimization. If not supplied, kappa is
estimated from the data.
tau - Optional user-supplied positive smoothing parameter value in the
"Smooth Generalized Normal Distribution" if family = "sgnd" or
family = "laplace". If not supplied then tau = 0.15. If family = "normal"
then tau = 0 is used. Smaller values of tau bring the approximation closer to the
absolute value function, but this can cause the optimization to become unstable. There can also be
issues with standard error calculation for smaller values of tau when using the Laplace distribution in
the robust regression setting.
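The effect of the smoothing parameter tau can be seen numerically. The sketch below assumes the absolute value is approximated by sqrt(x^2 + tau^2); this specific form is an assumption made for illustration, and `smooth_abs` is a hypothetical name rather than a package function.

```r
# Assumed form of the smooth approximation to |x|.
smooth_abs <- function(x, tau) sqrt(x^2 + tau^2)

x <- c(-2, -0.1, 0, 0.1, 2)

# Largest approximation error on this grid for two values of tau.
err_default <- max(abs(smooth_abs(x, tau = 0.15) - abs(x)))
err_small <- max(abs(smooth_abs(x, tau = 0.01) - abs(x)))
c(err_default, err_small)
```

Under this assumed form the error is largest at x = 0, where it equals tau, so a smaller tau gives a closer but sharper approximation near zero, in line with the stability caveat above.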
max_it_vec - Optional vector of length steps_T that contains the maximum number of
iterations to be performed in each \(\epsilon\)-telescope step. If not supplied, max_it
is the maximum number of iterations for the first 10 steps, and the maximum then
reduces to 10 iterations per step for the remainder of the telescope.
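One reading of the default iteration schedule described above is sketched here. It assumes the 10 steps that receive the full max_it budget are the first 10 steps of the telescope; the construction is illustrative and is not taken from the package source.

```r
steps_T <- 100
max_it <- 10000

# Assumed default: full budget for the first 10 steps, then 10 iterations per step.
max_it_vec <- c(rep(max_it, 10), rep(10, steps_T - 10))

length(max_it_vec)
head(max_it_vec, 12)
```

A user-supplied vector would need the same length as steps_T, with one iteration limit per telescope step.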
stepmax_nlm - Optional maximum allowable scaled step length (positive scalar) to be passed to
nlm. If not supplied, default values in
nlm are used.
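For readers unfamiliar with the step length setting, the following sketch calls base R's `nlm` directly with its `stepmax` argument on a toy quadratic. The objective `f_toy` and the starting values are hypothetical; how smoothic forwards the value internally is not shown here.

```r
# Toy objective with its minimum at (1, 2); for illustration only.
f_toy <- function(p) sum((p - c(1, 2))^2)

# stepmax caps the scaled length of each step taken by nlm.
fit_toy <- nlm(f_toy, p = c(0, 0), stepmax = 2)
fit_toy$estimate
```

A smaller cap can stop the optimizer from jumping into regions where the objective is numerically unstable, at the cost of needing more iterations.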
Meadhbh O'Neill
O'Neill, M. and Burke, K. (2023) Variable selection using a smooth information criterion for distributional regression models. <doi:10.1007/s11222-023-10204-8>
O'Neill, M. and Burke, K. (2022) Robust Distributional Regression with Automatic Variable Selection. <arXiv:2212.07317>
# Load the package, which provides smoothic() and the sniffer data
library(smoothic)

# Sniffer Data --------------------
# MPR Model ----
results <- smoothic(
  formula = y ~ .,
  data = sniffer,
  family = "normal",
  model = "mpr"
)
summary(results)