Fits a penalized generalized estimating equations (PGEE) model. The function has been partly rewritten with Rcpp and RcppEigen for better computational efficiency.
PGEE(
  formula,
  id,
  data,
  na.action = NULL,
  family = gaussian(link = "identity"),
  corstr = "independence",
  Mv = NULL,
  beta_int = NULL,
  R = NULL,
  scale.fix = TRUE,
  scale.value = 1,
  lambda,
  pindex = NULL,
  eps = 10^-6,
  maxiter = 30,
  tol = 10^-3,
  silent = TRUE
)
A PGEE object, which includes the fitted coefficients: the fitted single index coefficients, with unit norm and a non-negative first component.
formula: A formula expression of the form response ~ predictors.
id: A vector identifying subjects/clusters.
data: A data frame that stores the variables in formula together with the id variable.
na.action: A function to remove missing values from the data. Only na.omit is allowed here.
family: A family object: a list of functions and expressions defining the link and variance functions. Families supported in PGEE are binomial, gaussian, Gamma and poisson. Links that are not available in gee are not available here either. The default family is gaussian.
corstr: A character string specifying the working correlation structure. Structures supported in PGEE are "AR-1", "exchangeable", "fixed", "independence", "stat_M_dep", "non_stat_M_dep", and "unstructured". The default is "independence".
Mv: A numeric value, used when corstr is either "stat_M_dep" or "non_stat_M_dep". Otherwise, the default value is NULL.
beta_int: User-specified initial values for the regression parameters. The default value is NULL.
R: If corstr = "fixed" is specified, a square matrix containing the user-specified correlation, with dimension equal to the maximum cluster size. Otherwise, the default value is NULL.
scale.fix: A logical variable; if TRUE, the scale parameter is fixed at the value of scale.value. The default value is TRUE.
scale.value: If scale.fix = TRUE, the numeric value at which the scale parameter is fixed. The default value is 1.
lambda: A numerical value for the tuning parameter of the SCAD penalty function, which is estimated via cross-validation.
pindex: An index vector identifying the parameters that are not subject to penalization. The default value is NULL. However, in a model with an intercept, the intercept parameter should never be penalized.
eps: A numerical value for the epsilon used in the minorization-maximization algorithm. The default value is 10^-6.
maxiter: The maximum number of iterations used in the estimation algorithm. The default value is 30.
tol: The tolerance level used in the estimation algorithm. The default value is 10^-3.
silent: A logical variable; if FALSE, the regression parameter estimates at each iteration are printed. The default value is TRUE.
# generate data
set.seed(2021)
sim_data <- generate_data(
  nsub = 100, nobs = rep(10, 100), p = 100,
  beta0 = c(rep(1, 7), rep(0, 93)), rho = 0.6, corstr = "AR1",
  dis = "normal", ka = 1
)
PGEE_fit <- PGEE("y ~.-id-1", id = id, data = sim_data,
                 corstr = "exchangeable", lambda = 0.01)
PGEE_fit$coefficients
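
The optional arguments R and pindex can be prepared in base R before calling PGEE. The sketch below is illustrative only: the toy data, the cluster size of 4, the correlation 0.6, and the names R_fixed and toy are hypothetical, and it assumes that pindex refers to column positions of the design matrix.

```r
# Hypothetical setup: at most 4 observations per cluster, AR(1)-type decay.
max_cluster_size <- 4
rho <- 0.6

# Entry (i, j) is rho^|i - j|: a symmetric matrix with unit diagonal,
# of dimension equal to the maximum cluster size.
idx <- seq_len(max_cluster_size)
R_fixed <- rho^abs(outer(idx, idx, "-"))

# Sanity checks before supplying the matrix together with corstr = "fixed".
stopifnot(
  nrow(R_fixed) == max_cluster_size,
  isSymmetric(R_fixed),
  all(diag(R_fixed) == 1)
)

# Hypothetical toy data; variable names are illustrative only.
toy <- data.frame(
  y  = c(1.2, 0.7, 3.1, 2.4),
  x1 = c(0, 1, 0, 1),
  x2 = c(0.5, 1.5, 2.5, 3.5)
)

# Design matrix for a model with an intercept.
X <- model.matrix(y ~ x1 + x2, data = toy)

# Position of the intercept column, to be left unpenalized
# (assumes pindex indexes design-matrix columns).
pindex <- which(colnames(X) == "(Intercept)")
pindex
```

These objects would then be passed as R = R_fixed (with corstr = "fixed") and pindex = pindex in the call to PGEE.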