lpeer: Longitudinal Functional Models with Structured Penalties

Description

Implements longitudinal functional model with structured penalties (Kundu et al., 2012) with scalar outcome, single functional predictor, one or more scalar covariates and subject-specific random intercepts through mixed model equivalence.

Usage

lpeer(
  Y,
  subj,
  t,
  funcs,
  argvals = NULL,
  covariates = NULL,
  comm.pen = TRUE,
  pentype = "Ridge",
  L.user = NULL,
  f_t = NULL,
  Q = NULL,
  phia = 10^3,
  se = FALSE,
  ...
)

Arguments

vector of all outcomes over all visits or timepoints

subj

vector containing the subject number for each observation

vector containing the time information when the observation are taken

funcs

matrix containing observed functional predictors as rows. Rows with NA and Inf values will be deleted.

argvals

matrix (or vector) of indices of evaluations of \(X_i(t)\); i.e. a matrix with ith row \((t_{i1},.,t_{iJ})\)

covariates

matrix of scalar covariates.

comm.pen

logical value indicating whether common penalty for all the components of regression function. Default is TRUE.

pentype

type of penalty: either decomposition based penalty ('DECOMP') or ridge ('RIDGE') or second-order difference penalty ('D2') or any user defined penalty ('USER'). For decomposition based penalty user need to specify Q matrix in Q argument (see details). For user defined penalty user need to specify L matrix in L argument (see details). For Ridge and second-order difference penalty, specification for arguments L and Q will be ignored. Default is 'RIDGE'.

L.user

penalty matrix. Need to be specified with pentype='USER'. When comm.pen=TRUE, Number of columns need to be equal with number of columns of matrix specified to funcs. When comm.pen=FALSE, Number of columns need to be equal with the number of columns of matrix specified to funcs times the number of components of regression function. Each row represents a constraint on functional predictor. This argument will be ignored when value of pentype is other than 'USER'.

f_t

vector or matrix with number of rows equal to number of total observations and number of columns equal to d (see details). If matrix then each column pertains to single function of time and the value in the column represents the realization corresponding to time vector t. The column with intercept or multiple of intercept will be dropped. A NULL value refers to time-invariant regression function. Default value is NULL.

Q matrix to derive decomposition based penalty. Need to be specified with pentype='DECOMP'. When comm.pen=TRUE, number of columns must equal number of columns of matrix specified to funcs. When comm.pen=FALSE, Number of columns need to be equal with the number of columns of matrix specified to funcs times the number of components of regression function. Each row represents a basis function where functional predictor is expected lie according to prior belief. This argument will be ignored when value of pentype is other than 'DECOMP'.

phia

scalar value of a in decomposition based penalty. Needs to be specified with pentype='DECOMP'.

logical; calculate standard error when TRUE.

...

additional arguments passed to lme.

Value

A list containing:

fit

result of the call to lme

fitted.vals

predicted outcomes

BetaHat

parameter estimates for scalar covariates including intercept

se.Beta

standard error of parameter estimates for scalar covariates including intercept

Beta

parameter estimates with standard error for scalar covariates including intercept

GammaHat

estimates of components of regression functions. Each column represents one component function.

Se.Gamma

standard error associated with GammaHat

AIC

AIC value of fit (smaller is better)

BIC

BIC value of fit (smaller is better)

logLik

(restricted) log-likelihood at convergence

lambda

list of estimated smoothing parameters associated with each component function

conditional variance of Y treating only random intercept as random one.

unconditional variance of Y

number of subjects

number of Sampling points in functional predictor

TotalObs

total number of observations over all subjects

Sigma.u

estimated sd of random intercept.

sigma

estimated within-group error standard deviation.

Details

If there are any missing or infinite values in Y, subj, t, covariates, funcs and f_t, the corresponding row (or observation) will be dropped, and infinite values are not allowed for these arguments. Neither Q nor L may contain missing or infinite values. lpeer() fits the following model:

\(y_{i(t)}=X_{i(t)}^T \beta+\int {W_{i(t)}(s)\gamma(t,s) ds} +Z_{i(t)}u_i + \epsilon_{i(t)}\)

where \(\epsilon_{i(t)} ~ N(0,\sigma ^2)\) and \(u_i ~ N(0, \sigma_u^2)\). For all the observations, predictor function \(W_{i(t)}(s)\) is evaluated at K sampling points. Here, regression function \(\gamma (t,s)\) is represented in terms of (d+1) component functions \(\gamma_0(s)\),..., \(\gamma_d(s)\) as follows

\(\gamma (t,s)= \gamma_0(s)+f_1(t) \gamma_1(s) + f_d(t) \gamma_d(s)\)

Values of \(y_{i(t)} , X_{i(t)}\) and \(W_{i(t)}(s)\) are passed through argument Y, covariates and funcs, respectively. Number of elements or rows in Y, t, subj, covariates (if not NULL) and funcs need to be equal.

Values of \(f_1(t),...,f_d(t)\) are passed through f_t argument. The matrix passed through f_t argument should have d columns where each column represents one and only one of \(f_1(t),..., f_d(t)\).

The estimate of (d+1) component functions \(\gamma_0(s)\),..., \(\gamma_d(s)\) is obtained as penalized estimated. The following 3 types of penalties can be used for a component function:

i. Ridge: \(I_K\)

ii. Second-order difference: [\(d_{i,j}\)] with \(d_{i,i} = d_{i,i+2} = 1, d_{i,i+1} = -2\), otherwise \(d_{i,j} =0\)

iii. Decomposition based penalty: \(bP_Q+a(I-P_Q)\) where \(P_Q= Q^T (QQ^T)^{-1}Q\)

For Decomposition based penalty the user must specify pentype= 'DECOMP' and the associated Q matrix must be passed through the Q argument. Alternatively, one can directly specify the penalty matrix by setting pentype= 'USER' and using the L argument to supply the associated L matrix.

If Q (or L) matrix is similar for all the component functions then argument comm.pen should have value TRUE and in that case specified matrix to argument Q (or L) should have K columns. When Q (or L) matrix is different for all the component functions then argument comm.pen should have value FALSE and in that case specified matrix to argument Q (or L) should have K(d+1) columns. Here first K columns pertains to first component function, second K columns pertains to second component functions, and so on.

Default penalty is Ridge penalty for all the component functions and user needs to specify 'RIDGE'. For second-order difference penalty, user needs to specify 'D2'. When pentype is 'RIDGE' or 'D2' the value of comm.pen is always TRUE and comm.pen=FALSE will be ignored.

References

Kundu, M. G., Harezlak, J., and Randolph, T. W. (2012). Longitudinal functional models with structured penalties (arXiv:1211.4763 [stat.AP]).

Randolph, T. W., Harezlak, J, and Feng, Z. (2012). Structured penalties for functional linear models - partially empirical eigenvectors for regression. Electronic Journal of Statistics, 6, 323--353.

Examples

Run this code

# NOT RUN {
# }
# NOT RUN {
#------------------------------------------------------------------------
# Example 1: Estimation with Ridge penalty
#------------------------------------------------------------------------

##Load Data
data(DTI)

## Extract values for arguments for lpeer() from given data
cca = DTI$cca[which(DTI$case == 1),]
DTI = DTI[which(DTI$case == 1),]

##1.1 Fit the model with single component function
##    gamma(t,s)=gamm0(s)
t<- DTI$visit
fit.cca.lpeer1 = lpeer(Y=DTI$pasat, t=t, subj=DTI$ID, funcs = cca)
plot(fit.cca.lpeer1)

##1.2 Fit the model with two component function
##    gamma(t,s)=gamm0(s) + t*gamma1(s)
fit.cca.lpeer2 = lpeer(Y=DTI$pasat, t=t, subj=DTI$ID, funcs = cca,
                      f_t=t, se=TRUE)
plot(fit.cca.lpeer2)

#------------------------------------------------------------------------
# Example 2: Estimation with structured penalty (need structural
#            information about regression function or predictor function)
#------------------------------------------------------------------------

##Load Data
data(PEER.Sim)

## Extract values for arguments for lpeer() from given data
K<- 100
W<- PEER.Sim[,c(3:(K+2))]
Y<- PEER.Sim[,K+3]
t<- PEER.Sim[,2]
id<- PEER.Sim[,1]

##Load Q matrix containing structural information
data(Q)

##2.1 Fit the model with two component function
##    gamma(t,s)=gamm0(s) + t*gamma1(s)
Fit1<- lpeer(Y=Y, subj=id, t=t, covariates=cbind(t), funcs=W,
	    pentype='DECOMP', f_t=cbind(1,t), Q=Q, se=TRUE)

Fit1$Beta
plot(Fit1)

##2.2 Fit the model with three component function
##    gamma(t,s)=gamm0(s) + t*gamma1(s) + t^2*gamma1(s)
Fit2<- lpeer(Y=Y, subj=id, t=t, covariates=cbind(t), funcs=W,
		     pentype='DECOMP', f_t=cbind(1,t, t^2), Q=Q, se=TRUE)

Fit2$Beta
plot(Fit2)

##2.3 Fit the model with two component function with different penalties
##    gamma(t,s)=gamm0(s) + t*gamma1(s)
Q1<- cbind(Q, Q)
Fit3<- lpeer(Y=Y, subj=id, t=t, covariates=cbind(t), comm.pen=FALSE, funcs=W,
		     pentype='DECOMP', f_t=cbind(1,t), Q=Q1, se=TRUE)

##2.4 Fit the model with two component function with user defined penalties
##    gamma(t,s)=gamm0(s) + t*gamma1(s)
phia<- 10^3
P_Q <- t(Q)%*%solve(Q%*%t(Q))%*% Q
L<- phia*(diag(K)- P_Q) + 1*P_Q
Fit4<- lpeer(Y=Y, subj=id, t=t, covariates=cbind(t), funcs=W,
		     pentype='USER', f_t=cbind(1,t), L=L, se=TRUE)

L1<- adiag(L, L)
Fit5<- lpeer(Y=Y, subj=id, t=t, covariates=cbind(t), comm.pen=FALSE, funcs=W,
		     pentype='USER', f_t=cbind(1,t), L=L1, se=TRUE)
# }
# NOT RUN {
# }

Run the code above in your browser using DataLab