gvcm.cat(formula, data, family = gaussian, method = c("lqa", "AIC", "BIC"),
tuning = list(lambda=TRUE, specific=FALSE, phi=0.5, grouped.fused=0.5,
elastic=0.5, vs=0.5, spl=0.5), weights, offset, start, control,
model = FALSE, x = FALSE, y = FALSE, plot=FALSE, ...)
pest(x, y, indices, family = gaussian,
tuning = list(lambda=TRUE, specific=FALSE, phi=0.5, grouped.fused=0.5,
elastic=0.5, vs=0.5, spl=0.5), weights, offset, start = NULL,
control = cat_control(), plot=FALSE, ...)
abc(x, y, indices, family = gaussian, tuning = c("AIC", "BIC"),
weights, offset, start, control = cat_control(), plot=FALSE, ...)

formula: a symbolic description of the model to be fitted. See details.
family: a family object describing the error distribution and link function to be used in the model; this can be a character string naming a family function, a family function or the result of a call to a family function.
method: one of "lqa", "AIC" or "BIC"; method "lqa" induces penalized estimation; it employs a PIRLS-algorithm (see Fan and Li, 2001; Oelker and Tutz, 2013).
tuning: lambda is the scalar, overall penalty parameter; if lambda is a vector of values, these values are cross-validated; if lambda = TRUE, lambda is cross-validated; see cat_control.
start: starting values for method "lqa".
control: defaults to cat_control(); see cat_control.
model: for function gvcm.cat: a logical value indicating whether the employed model frame shall be returned or not.
x, y: for function gvcm.cat: logical values indicating whether the response vector and the model matrix used in the fitting process shall be returned or not; for functions pest and abc: y must be a response vector.
plot: if TRUE, estimates needed to plot coefficient paths are computed.
indices: for pest and abc only: the index argument to be used; see function index.

gvcm.cat returns an object of class gvcm.cat, which inherits from classes glm and lm. An object of class gvcm.cat contains, among others:
coefficients.reduced: the coefficient vector in its reduced version.
rank: degrees of freedom of the model; for method="lqa", estimated by the trace of the generalized hat matrix; for methods "AIC" and "BIC", estimated as by default in glm.fit.
family: the family object used.
null.deviance: the deviance of the null model, comparable with deviance; the null model includes a non-varying intercept only.
control: the control argument used.
na.action: information returned by model.frame on the special handling of NAs; currently always na.omit.
plot: if plot=TRUE, the first matrix contains estimates needed to plot coefficient paths; if lambda was cross-validated, the second matrix contains the cross-validation scores.
tuning: the tuning parameters used; if lambda was cross-validated, the optimal value is returned.
indices: the index argument used; see function index.
formula: the formula supplied.
terms: the terms object used.
method: the method used.
The object also holds a matrix that transforms x into its reduced version (e.g. needed for refitting) and a matrix that transforms the coefficients into their reduced version. In addition, there are components qr, R and effects relating to the final weighted linear fit.

The formula has the form response ~ 1 + terms, where response is the response vector and terms is a series of terms which specifies a linear predictor.
There are some special terms for regularized terms:
v(x, u, n="L1", bj=TRUE): varying coefficients enter the formula as v(x,u), where u denotes the categorical effect modifier and x the modified covariate. A varying intercept is denoted by v(1,u). Varying coefficients with categorical effect modifiers are penalized as described in Oelker et al. (2012). The argument bj and the element phi in argument tuning allow for the described weights.
p(u, n="L1"): ordinal/nominal covariates u given as p(u) are penalized as described in Gertheiss and Tutz (2010). For numeric covariates, p(u) indicates a pure Lasso penalty.
grouped(u, ...): penalizes a group of covariates with the grouped Lasso penalty of Yuan and Lin (2006); so far, this works for categorical covariates only.
sp(x, knots=20, n="L2"): implements a continuous covariate x non-parametrically as $f(x)$; $f(x)$ is represented by centered evaluations of basis functions (cubic B-splines with number of knots = knots); for n="L2", the curvature of $f(x)$ is penalized by a Ridge penalty; see Eilers and Marx (1996).
SCAD(u): penalizes a covariate u with the SCAD penalty by Fan and Li (2001); for categorical covariates u, differences of coefficients are penalized by a SCAD penalty, see Gertheiss and Tutz (2010).
elastic(u): penalizes a covariate u with the elastic net penalty by Zou and Hastie (2005); for categorical covariates u, differences of coefficients are penalized by the elastic net penalty, see Gertheiss and Tutz (2010).
If the formula contains no (varying) intercept, gvcm.cat assumes a constant intercept. There is no way to avoid an intercept.
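As a rough illustration of how special terms sit inside an ordinary R formula, the sketch below builds a formula with v() and p() terms and inspects it with base R's terms(). The variable z is hypothetical, and whether gvcm.cat parses its formulas this way internally is an assumption; the point is only that such a formula is a plain, unevaluated R object.

```r
# Formulas are unevaluated, so this runs without gvcm.cat being loaded.
# z is a hypothetical categorical covariate used for illustration only.
f <- y ~ v(1, u) + v(x1, u) + p(z) + x2

# Ask terms() to flag calls to v() and p() as special terms.
tt <- terms(f, specials = c("v", "p"))

# Positions of the special terms among the formula's variables
# (the response y counts as position 1).
spec <- attr(tt, "specials")
spec

# Term labels in the order they appear in the formula.
labels <- attr(tt, "term.labels")
labels
```

Here x2 is not wrapped in a special, so it is not flagged by terms().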
For specials p and v, there is the special argument n:
if n="L1" (the default), the absolute values in the penalty are kept, giving Lasso-type terms;
if n="L2", the absolute values in the penalty are replaced by quadratic, Ridge-type terms;
if n="L0", the absolute values in the penalty are replaced by an indicator for non-zero entries of the same terms.
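To make the three choices of n concrete, the following base R sketch evaluates an L1-, L2- and L0-type term on the differences of adjacent coefficients of a hypothetical four-level ordinal covariate. The coefficient values are invented for illustration, and the tuning parameter lambda and any weights are left out, so this is not the package's full penalty.

```r
# Hypothetical coefficients of an ordinal covariate with four levels;
# the first entry is the reference category, fixed at zero.
beta <- c(0.0, 0.5, 0.5, 1.2)

# Differences of adjacent coefficients: the terms being penalized here.
d <- diff(beta)

pen_L1 <- sum(abs(d))   # n = "L1": absolute values
pen_L2 <- sum(d^2)      # n = "L2": quadratic, Ridge-type terms
pen_L0 <- sum(d != 0)   # n = "L0": indicator for non-zero entries

c(L1 = pen_L1, L2 = pen_L2, L0 = pen_L0)
```

The L0-type term simply counts how many adjacent categories differ, which is why it equals 2 here: the second and third levels share a coefficient.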
For methods "AIC" and "BIC", the coefficients are not penalized but selected by a forward selection strategy whenever it makes sense;
for special v(x,u), the selection strategy is described in Oelker et al. (2012); the approach for the other specials corresponds to this idea.
For binomial families the response can also be a success/failure rate or a two-column matrix with the columns giving the numbers of successes and failures.
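The two response forms can be compared with stats::glm, which follows the same convention. The sketch below uses invented grouped data and assumes that, for a success rate, the number of trials is passed as weights, as glm requires; that gvcm.cat treats weights the same way is an assumption.

```r
# Hypothetical grouped data: successes and failures per dose level.
dose   <- c(1, 2, 3, 4, 5)
succ   <- c(2, 5, 9, 14, 18)
fail   <- c(18, 15, 11, 6, 2)
trials <- succ + fail

# (a) two-column matrix: numbers of successes and failures
fit_matrix <- glm(cbind(succ, fail) ~ dose, family = binomial())

# (b) success rate, with the number of trials passed as weights
fit_rate <- glm(succ / trials ~ dose, family = binomial(), weights = trials)

# Both forms describe the same likelihood, so the estimates agree.
all.equal(coef(fit_matrix), coef(fit_rate))
```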
Function pest computes penalized estimates, that is, it implements method "lqa" (PIRLS-algorithm).
Function abc implements the forward selection strategy employing AIC/BIC.
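For readers unfamiliar with forward selection, here is a generic sketch using stats::glm and AIC on simulated data. It is not the algorithm of abc, which selects coefficients and differences of coefficients as described in Oelker et al. (2012); it only shows the basic idea of adding, step by step, whichever term lowers the criterion most, and stopping when nothing helps. All variable names and data are made up.

```r
set.seed(1)
n_obs <- 200
x1 <- rnorm(n_obs); x2 <- rnorm(n_obs); x3 <- rnorm(n_obs)
y  <- 1 + 2 * x1 + rnorm(n_obs)        # only x1 truly matters
dat <- data.frame(y, x1, x2, x3)

candidates <- c("x1", "x2", "x3")
selected   <- character(0)
null_fit   <- glm(y ~ 1, data = dat, family = gaussian())
current    <- null_fit

repeat {
  remaining <- setdiff(candidates, selected)
  if (length(remaining) == 0) break
  # AIC of every model that adds one more candidate term
  aics <- sapply(remaining, function(v) {
    AIC(glm(reformulate(c(selected, v), response = "y"),
            data = dat, family = gaussian()))
  })
  if (min(aics) >= AIC(current)) break  # no candidate improves the AIC
  selected <- c(selected, names(which.min(aics)))
  current  <- glm(reformulate(selected, response = "y"),
                  data = dat, family = gaussian())
}
selected
```

Replacing AIC() with BIC() in both places gives the BIC variant of the same loop.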
Categorical effect modifiers and penalized categorical covariates are dummy coded as required by the penalty. If x in v(x,u) is binary, it is effect coded (first category refers to -1). Other covariates are coded as given by getOption.
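The difference between the two codings can be seen in base R. In the sketch below, treatment contrasts stand in for dummy coding, and the effect coding of a binary covariate is built by hand so that the first category maps to -1; the exact coding gvcm.cat builds for a given penalty may differ, so treat this as an assumption-laden illustration with invented data.

```r
# Dummy (treatment) coding: one 0/1 column per non-reference category.
u <- factor(c("a", "b", "c", "a"))
X_dummy <- model.matrix(~ u, contrasts.arg = list(u = "contr.treatment"))
X_dummy

# Effect coding of a binary covariate, first category coded as -1.
x <- factor(c("no", "yes", "yes", "no"))
x_effect <- ifelse(x == levels(x)[1], -1, 1)
x_effect   # -1  1  1 -1
```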
There is a summary function: summary.gvcm.cat

See also: index, cat_control, plot.gvcm.cat, predict.gvcm.cat, simulation

## example for function simulation()
library(gvcm.cat)
covariates <- list(x1=list("unif", c(0,2)),
                   x2=list("unif", c(0,2)),
                   x3=list("unif", c(0,2)),
                   u=list("multinom",c(0.3,0.4,0.3), "nominal")
                   )
true.f <- y ~ 1 + v(x1,u) + x2
true.coefs <- c(0.2, 0.3,.7,.7, -.5)
data <- simulation(400, covariates, NULL, true.f, true.coefs , binomial(), seed=456)
## example for function gvcm.cat()
f <- y ~ v(1,u) + v(x1,u) + v(x2,u)
m1 <- gvcm.cat(f, data, binomial(), plot=TRUE, control=cat_control(lambda.upper=19))
summary(m1)
## example for function predict.gvcm.cat
newdata <- simulation(200, covariates, NULL, true.f, true.coefs , binomial(), seed=789)
prediction <- predict.gvcm.cat(m1, newdata)
## example for function plot.gvcm.cat
plot(m1)
plot(m1, type="score")
plot(m1, type="coefs")