Fits dummy coefficients of ordinally scaled independent variables with a group lasso penalty on differences of adjacent dummy coefficients.
ordSelect(x, y, u = NULL, z = NULL, offset = rep(0,length(y)), lambda,
model = c("linear", "logit", "poisson", "cumulative"),
restriction = c("refcat", "effect"), penscale = sqrt, scalex = TRUE,
nonpenx = NULL, control = NULL, eps = 1e-3, ...)
An ordPen object, which is a list containing:

fitted: the matrix of fitted response values of the training data. Columns correspond to different lambda values.
coefficients: the matrix of fitted coefficients with respect to dummy-coded (ordinal or nominal) categorical input variables (including the reference category) as well as metric predictors. Columns correspond to different lambda values.
model: the type of the fitted model: "linear", "logit", "poisson", or "cumulative".
restriction: the type of restriction used for identifiability.
lambda: the used lambda values.
fraction: the used fraction values (NULL in case of ordSelect).
xlevels: a vector giving the number of levels of the ordinal predictors.
ulevels: a vector giving the number of levels of the nominal predictors (if any).
zcovars: the number of metric covariates (if any).
x: the matrix of ordinal predictors, with each column corresponding to one predictor and containing numeric values from {1,2,...}; for each covariate, category 1 is taken as the reference category with zero dummy coefficient.
y: the response vector.
u: a matrix (or data.frame) of additional categorical (nominal) predictors, with each column corresponding to one (additional) predictor and containing numeric values from {1,2,...}; corresponding dummy coefficients will not be penalized, and for each covariate category 1 is taken as the reference category. Currently not supported if model="cumulative".
z: a matrix (or data.frame) of additional metric predictors, with each column corresponding to one (additional) predictor; corresponding coefficients will not be penalized. Currently not supported if model="cumulative".
offset: vector of offset values.
lambda: vector of penalty parameters (in decreasing order). Optimization starts with the first component. See details below.
model: the model which is to be fitted. Possible choices are "linear" (default), "logit", "poisson" or "cumulative". See details below.
restriction: identifiability restriction for dummy coding. With "refcat", category 1 is taken as the reference category (default); with "effect", dummy coefficients sum up to 0 (known as effect coding).
penscale: rescaling function to adjust the value of the penalty parameter to the degrees of freedom of the parameter group. See the references below.
scalex: logical. Should the (split-coded) design matrix corresponding to x be scaled to have unit variance over columns before fitting? See details below.
nonpenx: vector of indices indicating columns of x whose regression coefficients are not penalized.
control: a list of control parameters; used only if model=="cumulative".
eps: a (small) constant to be added to the columnwise standard deviations when scaling the design matrix, to control the effect of very small standard deviations. See details below.
...: additional arguments.
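To make the role of penscale concrete, here is a small base-R sketch (not part of the package): the default penscale = sqrt weights each predictor's penalty by the square root of its group size, in the spirit of the group lasso of Yuan and Lin (2006). The level counts are taken from the simulated example below; treating "levels - 1" as the number of free dummy coefficients per group is an assumption of this sketch.

```r
# Illustrative sketch only: per-group penalty weights under penscale = sqrt.
# Level counts match the simulated example below; "levels - 1" free dummy
# coefficients per predictor is an assumption of this sketch.
penscale <- sqrt
xlevels <- c(8, 6, 7)          # number of levels of x1, x2, x3
groupsize <- xlevels - 1       # free dummy coefficients per predictor
weights <- penscale(groupsize) # weight multiplying lambda for each group
weights
```

With penscale = identity, all groups would be penalized equally regardless of how many dummy coefficients they contain.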
Jan Gertheiss, Aisouda Hoshiyar
The method assumes that categorical covariates (contained in x and u) take values 1,2,...,max, where max denotes the (columnwise) highest level observed in the data. If any level between 1 and max is not observed for an ordinal predictor, a corresponding (dummy) coefficient is fitted anyway. If any level > max is not observed but possible in principle, and a corresponding coefficient is to be fitted, the easiest way is to add a corresponding row to x (and u, z) with the corresponding y value being NA.
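As a sketch of the workaround just described, using toy data (the variable names and values here are illustrative, not from the package):

```r
# Toy sketch: an ordinal predictor can in principle take levels 1..5,
# but level 5 is never observed in the data. Append a row with y = NA
# so that a dummy coefficient for level 5 is fitted as well.
x <- cbind(x1 = c(1, 2, 3, 4, 2, 1))   # observed maximum is 4
y <- c(0.3, 1.1, 1.8, 2.5, 0.9, 0.2)
x <- rbind(x, 5)                        # add the unobserved top level
y <- c(y, NA)                           # response unknown for that row
```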
If a linear regression model is fitted, the response vector y may contain any numeric values; if a logit model is fitted, y has to be 0/1 coded; if a Poisson model is fitted, y has to contain count data. If a cumulative logit model is fitted, y takes values 1,2,...,max.
If scalex is TRUE, the (split-coded) design matrix constructed from x is scaled to have unit variance over columns. If a certain x-category, however, is observed only a few times, variances may become very small; scaling then has enormous effects on the result and may cause numerical problems. Hence a small constant eps can be added to each standard deviation when used for scaling.
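A minimal base-R sketch of this kind of scaling, assuming columns are divided by their standard deviation plus eps (the package's actual implementation may differ in detail):

```r
# Sketch: scale columns by (sd + eps) so that near-constant columns,
# e.g. split-coded dummies of rarely observed categories, are not
# blown up by division by a tiny standard deviation.
eps <- 1e-3
X <- cbind(a = c(0, 1, 0, 1, 0),
           b = c(0, 0, 0, 0, 1))       # column b: rarely observed category
sds <- apply(X, 2, sd)                 # columnwise standard deviations
X_scaled <- sweep(X, 2, sds + eps, "/")
```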
Gertheiss, J., S. Hogger, C. Oberhauser and G. Tutz (2011). Selection of ordinally scaled independent variables with applications to international classification of functioning core sets. Journal of the Royal Statistical Society C (Applied Statistics), 60, 377-395.
Hoshiyar, A., Gertheiss, L.H., and Gertheiss, J. (2023). Regularization and Model Selection for Item-on-Items Regression with Applications to Food Products' Survey Data. Preprint, available from https://arxiv.org/abs/2309.16373.
Meier, L., S. van de Geer and P. Buehlmann (2008). The group lasso for logistic regression. Journal of the Royal Statistical Society B, 70, 53-71.
Tutz, G. and J. Gertheiss (2014). Rating scales as predictors -- the old question of scale level and some answers. Psychometrika, 79, 357-376.
Tutz, G. and J. Gertheiss (2016). Regularized regression for categorical data. Statistical Modelling, 16, 161-200.
Yuan, M. and Y. Lin (2006). Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society B, 68, 49-67.
plot.ordPen, predict.ordPen, ICFCoreSetCWP
# smoothing and selection of ordinal covariates on a simulated dataset
set.seed(123)
# generate (ordinal) predictors
x1 <- sample(1:8,100,replace=TRUE)
x2 <- sample(1:6,100,replace=TRUE)
x3 <- sample(1:7,100,replace=TRUE)
# the response
y <- -1 + log(x1) + sin(3*(x2-1)/pi) + rnorm(100)
# x matrix
x <- cbind(x1,x2,x3)
# lambda values
lambda <- c(1000,500,200,100,50,30,20,10,1)
# smoothing and selection
osl <- ordSelect(x = x, y = y, lambda = lambda)
# results
round(osl$coef,digits=3)
plot(osl)
# If for a certain plot the x-axis should be annotated in a different way,
# this can (for example) be done as follows:
plot(osl, whx = 1, xlim = c(0,9), xaxt = "n")
axis(side = 1, at = c(1,8), labels = c("no agreement","total agreement"))