cold: Fit of parametric models via likelihood method

Description

Performs the fit of parametric models via likelihood method. Serial dependence and two random effects are allowed according to the stochastic model chosen. Missing values are automatically accounted for computing the likelihood function.

Usage

cold(formula, random, data, id="id", time="time",subSET, 
dependence ="ind", start=NULL, method="BFGS", integration="QUADPACK", 
M="6000", control=coldControl(), integrate=coldIntegrate(), 
cublim=coldcublim(), trace=FALSE)

Arguments

formula

a description of the model to be fitted of the form response~predictors.

random

the predictos that includes random effects of the form response~predictors.

data

a data frame containing the variables in the formula. NA values are allowed. If data is missing, an error message is produced. See "Details".

a string that matches the name of the id variable in data, i.e., the subject identification variable. By default, the program expects a variable named id to be present in the data.frame, otherwise the name of the variable playing the role of id must be declared by assigning id here.

time

a string that matches the name of the time variable in data. By default, the program expects a variable named time to be present in the data.frame, otherwise the name of the variable playing the role of time must be declared by assigning time here.

subSET

an optional expression indicating the subset of the rows of data that should be used in the fit. All observations are included by default.

dependence

expression stating which dependence structure should be used in the fit. The default is "ind". According to the stochastic model chosen serial dependence and random effects are allowed. There are six options: "ind" (independence), "AR1" (first order autoregressive), "indR" (independence with random intercept), "AR1R" (first order autoregressive with random intercept), "indR2" (independence with two random effects) or "AR1R2" (first order autoregressive with two random effects).

start

a vector of initial values for the nuisance parameters of the likelihood. The dimension of the vector is according to the structure of the dependence model.

method

The method used in the optimization process: "BFGS", "CG", "L-BFGS-B" and "SANN". The default is "BFGS". See optim for details.

integration

The integration code allows the user choose the integration method to solve the integrals: "QUADPACK" (fortran routines, only for random intercept models), "cubature" (uses cubature package to compute integrals when the dependence structure includes one or two random effects), "MC" (uses Monte Carlo methods to compute integrals when the dependence structure includes one or two random effects). The default is "QUADPACK".

Number of random points considered to evaluate the integral when the user choose Monte Carlo methods ("integration=MC"). The default is set to 6000.

control

a list of algorithmic constants for the optimizer optim. See R documentation of optim.control for details and possible control options. By default, cold sets the maximum number of iterations (maxit) equal to 100, the absolute convergence tolerance (abstol) and the relative convergence tolerance (rel.tol) equal to 1e-6 and uses the optim standard default values for the remaining options.

integrate

a list of algorithmic constants for the computation of a definite integral using a Fortran-77 subroutine. See "Details".

cublim

a list of algorithmic constants for the computation of a definite integral when the integration argument is set to cubature.

trace

logical flag: if TRUE, details of the nonlinear optimization are printed. By default the flag is set to FALSE.

Value

An object of class cold.

Background

Assume that each subject of a given set has been observed at number of successive time points. For each subject and for each time point, a count response variable, and a set of covariates are recorded.

Individual random effects, \(b_0\), can be incorporated in the form of a random intercept term of the linear predictor of the logarithmic regression, assuming a normal distribution of mean 0 and variance \(\sigma^2\), parameterized as \(\omega=\log(\sigma^2)\). The combination of serial Markov dependence with a random intercept corresponds here to the dependence structures indR, AR1R.

Two dimensional randoms effects can also be incorporated the linear predictor of the logarithmic regression. Consider a two-dimensional vector of random effects \(b=(b_0,b_1)\) where we assumed to be a random sample from the bivariate normal distribution, \(b\sim N(0,D)\) with \(var(b_0)=\sigma^2_{b_0}\), \(var(b_1)=\sigma^2_{b_1}\) and \(cov(b_0,b_1)=0\).

The combination of serial Markov dependence with two random effects corresponds here to the dependence structures indR2, AR1R2.

Original sources of the above formulation are given by Azzalini (1994), as for the AR1, and by Gon<U+00E7>alves (2002) and Gon<U+00E7>alves and Azzalini (2008) for the its extensions.

Details

data are contained in a data.frame. Each element of the data argument must be identifiable by a name. The simplest situation occurs when all subjects are observed at the same time points. If there are missing values in the response variable NA values must be inserted. The response variable represent the individual profiles of each subject, it is expected a variable in the data.frame that identifies the correspondence of each component of the response variable to the subject that it belongs, by default is named id variable. It is expected a variable named time to be present in the data.frame. If the time component has been given a different name, this should be declared. The time variable should identify the time points that each individual profile has been observed.

subSET is an optional expression indicating the subset of data that should be used in the fit. This is a logical statement of the type variable 1 == "a" & variable 2 > x which identifies the observations to be selected. All observations are included by default.

For the models with random intercept indR and AR1R, by default cold compute integrals based on a Fortran-77 subroutine package QUADPACK. For some data sets, when the dependence structure has a random intercept term, the user could have the need to do a specification of the integrate argument list changing the integration limits in the coldIntegrate function. The coldIntegrate is an auxiliary function for controlling cold fitting. There are more two options to fit models with a random intercept by setting integration="cubature" or integration="MC". For the models with two random effects indR2 and AR1R2, the user has two define the integration method by setting integration="cubature" or integration="MC". The second random effect is considered to be included in the time argument that plays the role of the time variable in the data.frame. For the two random effects models we have a random intercept and a random slope.

References

Azzalini, A. (1994). Logistic regression and other discrete data models for serially correlated observations. J. Ital. Stat. Society, 3 (2), 169-179. 10.1007/bf02589225.

Gon<U+00E7>alves, M. Helena (2002). Likelihood methods for discrete longitudinal data. PhD thesis, Faculty of Sciences, University of Lisbon.

Gon<U+00E7>alves, M. Helena, Cabral, M. Salom<U+00E9>, Ruiz de Villa, M. Carme, Escrich, Eduardo and Solanas, Montse. (2007). Likelihood approach for count data in longitudinal experiments. Computational statistics and Data Analysis, 51, 12, 6511-6520. 10.1016/j.csda.2007.03.002.

Gon<U+00E7>alves, M. Helena and Cabral, M. Salom<U+00E9>. (2021). cold: An R Package for the Analysis of Count Longitudinal Data. Journal of Statistical Software, 99, 3, 1--24. 10.18637/jss.v099.i03.

Examples

Run this code

# NOT RUN {
#####  data = seizure
str(seizure)

### AR1
seiz1M <- cold(y ~ lage + lbase + v4 + trt + trt:lbase, data = seizure, 
start = NULL, dependence = "AR1")

summary(seiz1M)
getAIC(seiz1M)
getLogLik(seiz1M)

### independence
seiz0M <- cold(y ~ lage + lbase + v4 + trt + trt:lbase, data = seizure, 
start = NULL, dependence = "ind")

summary(seiz0M)
getAIC(seiz0M)
getLogLik(seiz0M)


#####  data= datacold
str(datacold)

### AR1R with the default integration method
mod1R <- cold(z ~ Time * Treatment, random = ~ 1, data = datacold, 
time = "Time", id = "Subject", dependence = "AR1R")

summary (mod1R)
vareff(mod1R)
randeff(mod1R)

### AR1R with integration="cubature"
# }
# NOT RUN {
mod1R.c <- cold(z ~ Time*Treatment, random = ~ 1, data = datacold, 
time = "Time", id = "Subject", dependence = "AR1R", integration = "cubature")
summary (mod1R.c)
# }
# NOT RUN {
# }

Run the code above in your browser using DataLab