gcmr: Fitting Gaussian Copula Marginal Regression Models by Maximum (Simulated) Likelihood.

Description

Fits Gaussian copula marginal regression models by maximum (simulated) likelihood.

Usage

gcmr(formula, data, subset, offset, contrasts = NULL,
     marginal, cormat, start, fixed, options = gcmr.options())
gcmr.fit(x = rep(1, NROW(y)), y, z = NULL, offset = NULL,
         marginal, cormat, start, fixed, options = gcmr.options())

Arguments

formula

a symbolic description of the model to be fitted of type y ~ x or y ~ x | z, for details see below.

data

an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken fr

subset

an optional vector specifying a subset of observations to be used in the fitting process.

offset

optional numeric vector with an a priori known component to be included in the linear predictor for the mean. When appropriate, offset may also be a list of two offsets for the mean and precision equation, respectively.

contrasts

an optional list. See the contrasts.arg of model.matrix.default.

x

design matrix of dimension n * p.

y

vector of observations of length n.

z

optional design matrix for the dispersion/shape of dimension n * p.

marginal

an object of class marginal.gcmr specifying the marginal part of the model.

cormat

an object of class cormat.gcmr representing the correlation matrix of the errors.

start

optional numeric vector with starting values for the model parameters.

fixed

optional numeric vector of the same length as the total number of parameters. If supplied, only NA entries in fixed will be varied.

options

list of options passed to function gcmr.options.

Value

An object of class "gcmr" with the following components:
estimate

the vector of parameter estimates.

maximum

the maximum (simulated) likelihood.

hessian

(minus) the Hessian at the maximum likelihood estimate.

jac

the Jacobian at the maximum likelihood estimate.

y

the y vector used.

x

the model matrix used for the mean response.

z

the (optional) model matrix used for the dispersion/shape.

offset

the offset used.

n

the number of observations.

call

the matched call.

not.na

the vector of binary indicators of missing observations.

marginal

the marginal model used.

cormat

the correlation matrix used.

fixed

the numeric vector indicating which parameters are constants.

ibeta

the indices of marginal parameters.

igamma

the indices of dependence parameters.

options

the fitting options used, see gcmr.options.

Functions coefficients, logLik, vcov.gcmr, se and residuals.gcmr can be used to extract various useful features of the value returned by gcmr.
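
As a rough sketch of how these extractors might be used, the code below refits the Gamma model for ldeaths from the Examples section (a continuous response, so no simulation is involved) and queries the result. The object name fit is ours, and se is left out here only to keep the sketch short.

```r
library(gcmr)

# Covariates for the monthly lung-disease deaths series, as in the Examples.
sinTerm <- sin(2 * pi * time(ldeaths))
cosTerm <- cos(2 * pi * time(ldeaths))
trend <- scale(time(ldeaths))

fit <- gcmr(ldeaths ~ trend + sinTerm + cosTerm,
            marginal = Gamma.marg(link = "log"),
            cormat = arma.cormat(p = 1))

coefficients(fit)     # parameter estimates
logLik(fit)           # maximised log-likelihood
vcov(fit)             # estimated variance-covariance matrix
head(residuals(fit))  # first few residuals
```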

Details

Gaussian copula marginal regression models (Song, 2000; Masarotto and Varin, 2012) provide a flexible general framework for modelling dependent responses of any type. Gaussian copulas combine the simplicity of interpretation in marginal modelling with the flexibility in the specification of the dependence structure in multivariate normal models.
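
The construction can be illustrated in base R without the package. In the formulation of Masarotto and Varin (2012), each response is obtained by passing a correlated standard normal error through the marginal quantile function, y_i = F_i^{-1}(Phi(eps_i)). The sketch below assumes a Poisson marginal with log link and AR(1) errors; the sample size, correlation 0.6 and regression coefficients are illustrative values, not package defaults.

```r
set.seed(1)
n <- 200
rho <- 0.6  # assumed AR(1) correlation of the latent errors

# Correlation matrix of the errors: Omega[i, j] = rho^|i - j|.
Omega <- rho^abs(outer(seq_len(n), seq_len(n), "-"))

# Draw eps ~ N(0, Omega) using the Cholesky factor of Omega.
eps <- drop(crossprod(chol(Omega), rnorm(n)))

# Marginal model: Poisson mean with log link and one covariate.
x <- seq(-1, 1, length.out = n)
mu <- exp(0.5 + 0.8 * x)

# Gaussian copula step: uniform scores, then the marginal quantile function.
u <- pnorm(eps)
y <- qpois(u, lambda = mu)
```

Each y[i] is marginally Poisson with mean mu[i], while the dependence between observations is inherited entirely from Omega.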

This package contains R functions related to the paper Masarotto and Varin (2012). The main function is gcmr, which fits Gaussian copula marginal regression models. Inference is performed through a likelihood approach. Computation of the exact likelihood is possible only for continuous responses; otherwise the likelihood function is approximated by importance sampling. See Masarotto and Varin (2012) for details.
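
For a continuous response the exact log-likelihood can be written down directly: with eps_i = Phi^{-1}(F(y_i)), it equals the log multivariate normal density of eps under the error correlation matrix plus the sum over i of log f(y_i) - log phi(eps_i). The base R sketch below evaluates this for a Gamma marginal with AR(1) errors. The function name gc_loglik and its mean/shape parameterisation are hypothetical choices for illustration and are not the package's internal code.

```r
gc_loglik <- function(y, mu, shape, rho) {
  n <- length(y)
  # Normal scores of the probability integral transform.
  eps <- qnorm(pgamma(y, shape = shape, rate = shape / mu))
  # AR(1) correlation matrix of the errors.
  Omega <- rho^abs(outer(seq_len(n), seq_len(n), "-"))
  R <- chol(Omega)  # upper triangular, t(R) %*% R == Omega
  # Log multivariate normal density of eps with correlation matrix Omega.
  z <- backsolve(R, eps, transpose = TRUE)
  log_mvn <- -0.5 * n * log(2 * pi) - sum(log(diag(R))) - 0.5 * sum(z^2)
  # Marginal log-densities minus the univariate normal log-densities.
  log_marg <- dgamma(y, shape = shape, rate = shape / mu, log = TRUE)
  log_mvn + sum(log_marg - dnorm(eps, log = TRUE))
}

# Usage on simulated data (values are illustrative).
set.seed(2)
y_sim <- rgamma(50, shape = 2, rate = 2 / 3)
gc_loglik(y_sim, mu = 3, shape = 2, rho = 0.4)
```

With rho = 0 the copula terms cancel and the function reduces to the ordinary independent Gamma log-likelihood, which gives a quick sanity check.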

The standard formula y ~ x1 + x2 indicates that the mean response is modelled as a function of covariates x1 and x2 through an appropriate link function. The extended formula y ~ x1 + x2 | z1 + z2 indicates that the dispersion (or the shape) parameter of the marginal distribution is modelled as a function of covariates z1 and z2. Dispersion (or shape) parameters are always modelled on the logarithmic scale. The covariates for the mean and the dispersion may overlap.
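
To make the two-part formula concrete, the base R sketch below builds the two design matrices implied by y ~ x1 | z1 and maps the linear predictors to a mean and a dispersion. The data frame, the coefficient values and the choice of a log link for the mean are all assumptions made for illustration.

```r
d <- data.frame(x1 = c(0.1, 0.5, 0.9, 1.3), z1 = c(1, 0, 1, 0))

X <- model.matrix(~ x1, data = d)  # mean part of y ~ x1 | z1
Z <- model.matrix(~ z1, data = d)  # dispersion part of y ~ x1 | z1

beta <- c(0.2, 1.0)       # hypothetical mean coefficients
disp_coef <- c(-1.0, 0.5) # hypothetical dispersion coefficients

mu <- exp(drop(X %*% beta))        # log link assumed for the mean
phi <- exp(drop(Z %*% disp_coef))  # log scale keeps the dispersion positive
```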

For binomial marginals specified by binomial.marg, the response is given either as a factor, where the first level denotes failure and all other levels denote success, or as a two-column matrix whose columns give the numbers of successes and failures.
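
The two response forms can be sketched in base R as follows. The variable names and counts are made up for illustration; either outcome or resp would then appear on the left-hand side of the model formula.

```r
# Form 1: a factor; the first level ("no") is failure, any other level is success.
outcome <- factor(c("no", "yes", "yes", "no", "yes"), levels = c("no", "yes"))
success <- as.integer(outcome != levels(outcome)[1])

# Form 2: a two-column matrix of successes and failures per observation.
trials <- c(10, 12, 8)
successes <- c(3, 7, 8)
resp <- cbind(successes, failures = trials - successes)
```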

gcmr.fit is the workhorse function: it is not normally called directly but can be more efficient where the response vector and design matrix have already been calculated.

References

Masarotto, G. and Varin, C. (2012). Gaussian copula marginal regression. Electronic Journal of Statistics, 6, 1517–1549. http://projecteuclid.org/euclid.ejs/1346421603.

Song, P. X.-K. (2000). Multivariate dispersion models generated from Gaussian copula. Scandinavian Journal of Statistics, 27, 305–320.

Examples

library(gcmr)

## Warning: Likelihood approximated using only a limited number
## of Monte Carlo replications.

## Polio data.
## Marginal negative binomial model with ARMA(2,1) correlation matrix.
data(polio)
names(polio)
gcmr(y ~ ., data = polio, marginal = negbin.marg,
     cormat = arma.cormat(2, 1),
     options = list(seed = 71271, nrep = 100))

## Scotland lip cancer data.
## Marginal negative binomial model with Matern correlation matrix.
## spDists() is provided by the sp package.
data(scotland)
D.scotland <- sp::spDists(cbind(scotland$longitude, scotland$latitude),
                          longlat = TRUE)
gcmr(observed ~ offset(log(expected)) + AFF + I(latitude / 100),
     data = scotland, marginal = negbin.marg,
     cormat = matern.cormat(D.scotland),
     options = list(seed = 71271, nrep = 100))

## Monthly Deaths from Lung Diseases in the UK.
## Marginal Gamma model with ARMA(1,0) correlation matrix.
sinTerm <- sin(2 * pi * time(ldeaths))
cosTerm <- cos(2 * pi * time(ldeaths))
trend <- scale(time(ldeaths))
gcmr(ldeaths ~ trend + sinTerm + cosTerm,
     marginal = Gamma.marg(link = "log"),
     cormat = arma.cormat(p = 1))
## now with dispersion modelling
gcmr(ldeaths ~ trend + sinTerm + cosTerm | trend + sinTerm + cosTerm,
     marginal = Gamma.marg(link = "log"),
     cormat = arma.cormat(p = 1))

## Prater's Petrol Refinery Data (the petrol data set ships with MASS).
## Beta regression with exchangeable within-cluster correlation.
data(petrol, package = "MASS")
gcmr(I(Y / 100) ~ SG + VP + V10 + EP, data = petrol, marginal = beta.marg,
     cormat = cluster.cormat(id = as.numeric(petrol$No),
                             type = "exchangeable"))
## (results suggest no evidence of within-cluster correlation)