
glmmrBase (version 0.4.6)

Model: A GLMM Model

Description

A GLMM Model


Public fields

covariance

A Covariance object defining the random effects covariance.

mean

A MeanFunction object, defining the mean function for the model, including the data and covariate design matrix X.

family

One of the family functions used in R's glm function. See family for details.

weights

A vector indicating the weights for the observations.

trials

For binomial family models, the number of trials for each observation. The default is 1 (Bernoulli).

formula

The formula for the model. May be empty if separate formulae are specified for the mean and covariance components.

var_par

Scale parameter required for some distributions (Gaussian, Gamma, Beta).

mcmc_options

There are five options for MCMC sampling that are specified in this list (a brief example of setting these options follows the list):

  • warmup The number of warmup iterations. Note that if using the internal HMC sampler, this only applies to the first iteration of the MCML algorithm, as the values from the previous iteration are carried over.

  • samps The number of MCMC samples drawn in the MCML algorithm. For smaller tolerance values larger numbers of samples are required. For the internal HMC sampler, larger numbers of samples are generally required than if using Stan since the samples generally exhibit higher autocorrelation, especially for more complex covariance structures.

  • lambda (Only relevant for the internal HMC sampler) Value of the trajectory length of the leapfrog integrator in Hamiltonian Monte Carlo (equal to the number of steps times the step length). Larger values result in lower correlation between samples, but require larger numbers of steps and so are slower. Smaller values are likely required for non-linear GLMMs.

  • refresh How frequently to print to console MCMC progress if displaying verbose output.

  • maxsteps (Only relevant for the internal HMC sampler) Integer. The maximum number of steps of the leapfrog integrator.
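
For example, a brief sketch (not from the package documentation) of changing these options on an existing Model object, assuming `mod` has been created with Model$new() as in the examples below:

mod$mcmc_options$warmup  <- 500   # number of warmup iterations
mod$mcmc_options$samps   <- 1000  # number of MCMC samples per MCML iteration
mod$mcmc_options$refresh <- 100   # print progress to the console every 100 iterations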

Methods


Method use_attenuation()

Sets the model to use or not use "attenuation" when calculating the first-order approximation to the covariance matrix.

Usage

Model$use_attenuation(use)

Arguments

use

Logical indicating whether to use "attenuation".

Returns

None. Called for its side effects.
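
Example (a hedged sketch, not from the package documentation), assuming `des` is a Model object with parameters set, as in the sim_data() examples below:

des$use_attenuation(TRUE)   # use the attenuated linear predictor in the covariance approximation
M1 <- des$information_matrix()
des$use_attenuation(FALSE)  # revert to the unadjusted approximation
M2 <- des$information_matrix()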


Method fitted()

Return fitted values. Does not account for the random effects. For simulated values based on resampling random effects, see sim_data(). To predict the values at a new location see predict().

Usage

Model$fitted(type = "link", X, u)

Arguments

type

One of either "link" for values on the scale of the link function, or "response" for values on the scale of the response.

X

(Optional) Fixed effects matrix to generate fitted values

u

(Optional) Random effects values at which to generate fitted values

Returns

A Matrix class object containing the predicted values
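
Example (a hedged sketch, not from the package documentation), assuming `des` is a Model object with parameters set, as in the sim_data() examples below:

des$fitted()                   # fitted values on the link scale (random effects ignored unless u is supplied)
des$fitted(type = "response")  # fitted values on the response scale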


Method predict()

Generate predictions at new values

Generates predicted values using a new data set, which specifies covariate values and values for the variables that define the covariance function. The function returns a list with the linear predictor, the distribution of the new random effects conditional on the current estimates of the random effects, and some simulated values of the new random effects if requested.

Usage

Model$predict(newdata, offset = rep(0, nrow(newdata)), m = 0)

Arguments

newdata

A data frame specifying the new data at which to generate predictions

offset

Optional vector of offset values for the new data

m

Number of samples of the random effects to draw

Returns

A list with the linear predictor, parameters (mean and covariance matrices) for the conditional distribution of the random effects, and any random effect samples.
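
Example (a hedged sketch, not from the package documentation), assuming `des` and `df` are the Model object and data frame from the sim_data() examples below, with a subset of the original data standing in for new prediction points:

newdf <- df[df$t == 1, ]                       # hypothetical new data containing the same variables
pred  <- des$predict(newdata = newdf, m = 100) # also draw 100 samples of the new random effects
str(pred)                                      # linear predictor, conditional distribution, and samples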


Method new()

Create a new Model object

Usage

Model$new(
  formula,
  covariance,
  mean,
  data = NULL,
  family = NULL,
  var_par = NULL,
  offset = NULL,
  weights = NULL,
  trials = NULL,
  verbose = TRUE
)

Arguments

formula

An optional model formula containing fixed and random effect terms. If not specified, then separate formulae need to be provided to the covariance and mean arguments below.

covariance

Either a Covariance object, or an equivalent list of arguments that can be passed to Covariance to create a new object. At a minimum the list must specify a formula. If parameters are not included then they are initialised to 0.5.

mean

Either a MeanFunction object, or an equivalent list of arguments that can be passed to MeanFunction to create a new object. At a minimum the list must specify a formula. If parameters are not included then they are initialised to 0.

data

A data frame with the data required for the mean function and covariance objects. This argument can be ignored if data are provided to the covariance or mean arguments, either via Covariance and MeanFunction objects or as a member of the list of arguments to both covariance and mean.

family

A family object expressing the distribution and link function of the model, see family. This argument is optional if the family is provided via a MeanFunction object, or as a member of the list of arguments to mean. Currently accepts binomial, gaussian, Gamma, poisson, and Beta.

var_par

Scale parameter required for some distributions, including Gaussian. Default is NULL.

offset

A vector of offset values. Optional; can alternatively be provided as part of the mean argument.

weights

A vector of weights. Optional.

trials

For binomial family models, the number of trials for each observation. If it is not set, then it will default to 1 (a Bernoulli model).

verbose

Logical indicating whether to provide detailed output

Returns

A new Model class object

Examples

\dontshow{
setParallel(FALSE) # for the CRAN check
}
#create a data frame describing a cross-sectional parallel cluster
#randomised trial
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
mod <- Model$new(
  formula = ~ factor(t) + int - 1 + (1|gr(cl)) + (1|gr(cl,t)),
  data = df,
  family = stats::gaussian()
)

#here we will specify a cohort study and provide parameter values
df <- nelder(~ind(20) * t(6))
df$int <- 0
df[df$t > 3, 'int'] <- 1

des <- Model$new(
  covariance = list(
    formula = ~ (1|gr(ind)),
    parameters = c(0.05)),
  mean = list(
    formula = ~ int,
    parameters = c(1,0.5)),
  data = df,
  family = stats::poisson())

# or as
des <- Model$new(
  formula = ~ int + (1|gr(ind)),
  covariance = list(parameters = c(0.05)),
  mean = list(parameters = c(1,0.5)),
  data = df,
  family = stats::poisson()
  )

#an example of a spatial grid with two time points
df <- nelder(~ (x(10)*y(10))*t(2))
spt_design <- Model$new(covariance = list( formula = ~(1|ar0(t)*fexp(x,y))),
                         mean = list(formula = ~ 1),
                         data = df,
                         family = stats::gaussian())


Method print()

Print method for Model class

Usage

Model$print()

Arguments

...

ignored


Method n()

Returns the number of observations in the model

Usage

Model$n(...)

Arguments

...

ignored


Method subset_rows()

Subsets the design keeping specified observations only

Given a vector of row indices, the corresponding rows will be kept and the other rows will be removed from the mean function and covariance

Usage

Model$subset_rows(index)

Arguments

index

Integer or vector of integers listing the rows to keep

Returns

The function updates the object and nothing is returned


Method subset_cols()

Subsets the columns of the design

Removes the specified columns from the linked mean function object's X matrix.

Usage

Model$subset_cols(index)

Arguments

index

Integer or vector of integers specifying the indexes of the columns to keep

Returns

The function updates the object and nothing is returned
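
Example (a hedged sketch, not from the package documentation), assuming `des` and `df` are the Model object and data frame from the sim_data() examples below:

des$subset_rows(which(df$t <= 3))  # keep only observations from the first three time periods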


Method sim_data()

Generates a realisation of the design

Generates a single vector of outcome data based upon the specified GLMM design.

Usage

Model$sim_data(type = "y")

Arguments

type

Either 'y' to return just the outcome data, 'data' to return a data frame with the simulated outcome data alongside the model data, or 'all', which will return a list with simulated outcomes y, matrices X and Z, parameters beta, and the values of the simulated random effects.

Returns

Either a vector, a data frame, or a list

Examples

df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
\dontshow{
setParallel(FALSE) # for the CRAN check
}
des <- Model$new(
  covariance = list(
    formula = ~ (1|gr(cl)*ar0(t)),
    parameters = c(0.05,0.8)),
  mean = list(
    formula = ~ factor(t) + int - 1,
    parameters = c(rep(0,5),0.6)),
  data = df,
  family = stats::binomial()
)
ysim <- des$sim_data()


Method check()

Checks for any changes in linked objects and updates.

Checks for any changes in any object and updates all linked objects if any are detected. Generally called automatically and not required by the user.

Usage

Model$check(verbose = TRUE)

Arguments

verbose

Logical indicating whether to report if any updates are made, defaults to TRUE

Returns

Linked objects are updated but nothing is returned

Examples

\dontshow{
setParallel(FALSE) # for the CRAN check
}
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
des <- Model$new(
  covariance = list(
    formula = ~ (1|gr(cl)*ar0(t)),
    parameters = c(0.05,0.8)),
  mean = list(
    formula = ~ factor(t) + int - 1,
    parameters = c(rep(0,5),0.6)),
  data = df,
  family = stats::binomial()
)
des$check() #does nothing
des$covariance$parameters <- c(0.1,0.9)
des$check() #updates
des$mean$parameters <- c(rnorm(5),0.1)
des$check() #updates


Method update_parameters()

Updates the parameters of the mean function and/or the covariance function

Usage

Model$update_parameters(mean.pars = NULL, cov.pars = NULL, var.par = NULL)

Arguments

mean.pars

(Optional) Vector of new mean function parameters

cov.pars

(Optional) Vector of new covariance function(s) parameters

var.par

(Optional) A scalar value for var_par

verbose

Logical indicating whether to provide more detailed feedback

Examples

\dontshow{
setParallel(FALSE) # for the CRAN check
}
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
des <- Model$new(
  covariance = list(
    formula = ~ (1|gr(cl)*ar0(t))),
  mean = list(
    formula = ~ factor(t) + int - 1),
  data = df,
  family = stats::binomial()
)
des$update_parameters(cov.pars = c(0.1,0.9))


Method information_matrix()

Generates the information matrix of the GLS estimator

Usage

Model$information_matrix(include.re = FALSE)

Arguments

include.re

Logical indicating whether to return the information matrix including the random effects components (TRUE), or the GLS information matrix for beta only (FALSE).

Returns

A PxP matrix
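
Example (a hedged sketch, not from the package documentation), assuming `des` is the Model object with parameters set from the sim_data() examples above:

M <- des$information_matrix()  # GLS information matrix for beta
sqrt(diag(solve(M)))           # approximate standard errors of the fixed effect parameters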


Method sandwich()

Returns the robust sandwich variance-covariance matrix for the fixed effect parameters

Usage

Model$sandwich()

Returns

A PxP matrix


Method kenward_roger()

Returns the bias-corrected variance-covariance matrix for the fixed effect parameters.

Usage

Model$kenward_roger()

Returns

A PxP matrix
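
Example (a hedged sketch, not from the package documentation): both corrections are assumed here to use the estimates stored after a model fit, so fit the model first (see the MCML() and LA() examples below). `des` is the Model object from the sim_data() examples above:

ysim <- des$sim_data()
fit  <- des$LA(y = ysim)   # fit so that parameter estimates are stored in the object
des$sandwich()             # robust (Huber) variance-covariance matrix for beta
des$kenward_roger()        # Kenward-Roger bias-corrected variance-covariance matrix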


Method power()

Estimates the power of the design described by the model using the square root of the relevant element of the GLS variance matrix:

$$(X^T\Sigma^{-1}X)^{-1}$$

Note that this is equivalent to using the "design effect" for many models.

Usage

Model$power(alpha = 0.05, two.sided = TRUE, alternative = "pos")

Arguments

alpha

Numeric between zero and one indicating the type I error rate. Default of 0.05.

two.sided

Logical indicating whether to use a two-sided test

alternative

For a one-sided test, whether the alternative hypothesis is that the parameter is positive ("pos") or negative ("neg").

Returns

A data frame describing the parameters, their values, expected standard errors and estimated power.

Examples

\dontshow{
setParallel(FALSE) # for the CRAN check
}
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
des <- Model$new(
  covariance = list(
    formula = ~ (1|gr(cl)) + (1|gr(cl,t)),
    parameters = c(0.05,0.1)),
  mean = list(
    formula = ~ factor(t) + int - 1,
    parameters = c(rep(0,5),0.6)),
  data = df,
  family = stats::gaussian(),
  var_par = 1
)
des$power() #power of 0.90 for the int parameter


Method w_matrix()

Returns the diagonal of the matrix W used to calculate the covariance matrix approximation

Usage

Model$w_matrix()

Returns

A vector with values of the glm iterated weights


Method dh_deta()

Returns the derivative of the link function with respect to the linear predictor.

Usage

Model$dh_deta()

Returns

A vector


Method Sigma()

Returns the (approximate) covariance matrix of y

Returns the covariance matrix Sigma. For non-linear models this is an approximation. See Details.

Usage

Model$Sigma(inverse = FALSE)

Arguments

inverse

Logical indicating whether to provide the covariance matrix or its inverse

Returns

A matrix.
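
Example (a hedged sketch, not from the package documentation), assuming `des` is the Model object with parameters set from the sim_data() examples above:

S    <- des$Sigma()                # (approximate) covariance matrix of y
Sinv <- des$Sigma(inverse = TRUE)  # its inverse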


Method MCML()

Markov Chain Monte Carlo Maximum Likelihood model fitting

Usage

Model$MCML(
  y,
  method = "mcnr",
  sim.lik.step = FALSE,
  verbose = TRUE,
  tol = 0.01,
  max.iter = 30,
  se = "gls",
  sparse = FALSE,
  usestan = TRUE,
  se.theta = TRUE
)

Arguments

y

A numeric vector of outcome data

method

The MCML algorithm to use, either mcem or mcnr, see Details. Default is mcnr.

sim.lik.step

Logical. Either TRUE (conduct a simulated likelihood step at the end of the algorithm), or FALSE (does not do this step), defaults to FALSE.

verbose

Logical indicating whether to provide detailed output, defaults to TRUE.

tol

Numeric value, tolerance of the MCML algorithm, the maximum difference in parameter estimates between iterations at which to stop the algorithm.

max.iter

Integer. The maximum number of iterations of the MCML algorithm.

se

String. Type of standard error to return. Options are "gls" for GLS standard errors (the default), "robust" for Huber robust sandwich estimator, "kr" for Kenward-Roger bias corrected standard errors, "bw" to use GLS standard errors with a between-within correction to the degrees of freedom, "bwrobust" to use robust standard errors with between-within correction to the degrees of freedom.

sparse

Logical indicating whether to use sparse matrix methods

usestan

Logical whether to use Stan (through the package cmdstanr) for the MCMC sampling. If FALSE then the internal Hamiltonian Monte Carlo sampler will be used instead. We recommend Stan over the internal sampler as it generally produces a larger number of effective samples per unit time, especially for more complex covariance functions.

se.theta

Logical. Whether to calculate the standard errors for the covariance parameters. This step is a slow part of the calculation, so can be disabled if required in larger models. Has no effect for Kenward-Roger standard errors.

Returns

A mcml object

Examples

\dontrun{
#create example data with six clusters, five time periods, and five people per cluster-period
df <- nelder(~(cl(6)*t(5)) > ind(5))
# parallel trial design intervention indicator
df$int <- 0
df[df$cl > 3, 'int'] <- 1 
# specify parameter values in the call for the data simulation below
des <- Model$new(
  formula= ~ factor(t) + int - 1 +(1|gr(cl)*ar0(t)),
  covariance = list(parameters = c(0.05,0.7)),
  mean = list(parameters = c(rep(0,5),0.2)),
  data = df,
  family = gaussian(),
  var_par = 1
)
ysim <- des$sim_data() # simulate some data from the model
fit1 <- des$MCML(y = ysim,method="mcnr",usestan=FALSE) # don't use Stan
#fits the models using Stan
fit2 <- des$MCML(y = ysim, method="mcnr")
 #adds a simulated likelihood step after the MCEM algorithm
fit3 <- des$MCML(y = ysim, sim.lik.step = TRUE)

# we could use LA to find better starting values
fit4 <- des$LA(y=ysim)
# the fit parameter values are stored in the internal model class object
fit5 <- des$MCML(y = ysim, method="mcnr") # it should converge much more quickly
}


Method LA()

Maximum Likelihood model fitting with Laplace Approximation

Usage

Model$LA(
  y,
  start,
  method = "nr",
  verbose = FALSE,
  se = "gls",
  max.iter = 40,
  tol = 1e-04,
  se.theta = TRUE
)

Arguments

y

A numeric vector of outcome data

start

Optional. A numeric vector indicating starting values for the model parameters.

method

String. Either "nloptim" for non-linear optimisation, or "nr" for Newton-Raphson (default) algorithm

verbose

Logical indicating whether to provide detailed algorithm feedback (default is FALSE).

se

String. Type of standard error to return. Options are "gls" for GLS standard errors (the default), "robust" for Huber robust sandwich estimator, "kr" for Kenward-Roger bias corrected standard errors, "bw" to use GLS standard errors with a between-within correction to the degrees of freedom, "bwrobust" to use robust standard errors with between-within correction to the degrees of freedom.

max.iter

Maximum number of algorithm iterations, default 40.

tol

Maximum difference between successive iterations at which to terminate the algorithm

se.theta

Logical. Whether to calculate the standard errors for the covariance parameters. This step is a slow part of the calculation, so can be disabled if required in larger models. Has no effect for Kenward-Roger standard errors.

Returns

A mcml object

Examples

\dontshow{
setParallel(FALSE) # for the CRAN check
}
#create example data with six clusters, five time periods, and five people per cluster-period
df <- nelder(~(cl(6)*t(5)) > ind(5))
# parallel trial design intervention indicator
df$int <- 0
df[df$cl > 3, 'int'] <- 1 
# specify parameter values in the call for the data simulation below
des <- Model$new(
  formula = ~ factor(t) + int - 1 + (1|gr(cl)*ar0(t)),
  covariance = list( parameters = c(0.05,0.7)),
  mean = list(parameters = c(rep(0,5),-0.2)),
  data = df,
  family = stats::binomial()
)
ysim <- des$sim_data() # simulate some data from the model
fit1 <- des$LA(y = ysim)


Method sparse()

Set whether to use sparse matrix methods for model calculations and fitting. By default the model does not use sparse matrix methods.

Usage

Model$sparse(sparse = TRUE)

Arguments

sparse

Logical indicating whether to use sparse matrix methods

Returns

None, called for its side effects
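
Example (a hedged sketch, not from the package documentation), for any Model object such as `des` from the examples above:

des$sparse(TRUE)   # use sparse matrix methods in subsequent calculations
des$sparse(FALSE)  # return to dense matrix methods (the default)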


Method mcmc_sample()

Generate an MCMC sample of the random effects

Usage

Model$mcmc_sample(y, usestan = TRUE, verbose = TRUE)

Arguments

y

Numeric vector of outcome data

usestan

Logical whether to use Stan (through the package cmdstanr) for the MCMC sampling. If FALSE then the internal Hamiltonian Monte Carlo sampler will be used instead. We recommend Stan over the internal sampler as it generally produces a larger number of effective samples per unit time, especially for more complex covariance functions.

verbose

Logical indicating whether to provide detailed output to the console

Returns

A matrix of samples of the random effects
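
Example (a hedged sketch, not from the package documentation), assuming `des` is the Model object from the sim_data() examples above:

ysim <- des$sim_data()
usim <- des$mcmc_sample(y = ysim, usestan = FALSE, verbose = FALSE)  # internal HMC sampler
dim(usim)  # matrix of random effect samples; the orientation of samples is not assumed here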


Method gradient()

The gradient of the log-likelihood with respect to either the random effects or the model parameters. The random effects are on the N(0,I) scale, i.e. scaled by the Cholesky decomposition of the matrix D. To obtain the random effects from the last model fit, see member function $u

Usage

Model$gradient(y, u, beta = FALSE)

Arguments

y

Vector of outcome data

u

Vector of random effects scaled by the Cholesky decomposition of D

beta

Logical. Whether to return the gradient with respect to the random effects (FALSE) or the linear predictor parameters (TRUE).

Returns

A vector of the gradient
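
Example (a hedged sketch, not from the package documentation), assuming `des` is the Model object from the sim_data() examples above, and assuming for illustration that the samples returned by $u() are stored column-wise:

ysim <- des$sim_data()
fit  <- des$MCML(y = ysim, usestan = FALSE)  # fit so that random effect samples are stored
u1   <- des$u(scaled = FALSE)[, 1]           # one sample on the N(0,I) scale
des$gradient(y = ysim, u = u1)               # gradient with respect to the random effects
des$gradient(y = ysim, u = u1, beta = TRUE)  # gradient with respect to the linear predictor parameters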


Method partial_sigma()

The partial derivatives of the covariance matrix Sigma with respect to the covariance parameters. The function returns a list in order: Sigma, first order derivatives, second order derivatives. The second order derivatives are ordered as the lower-triangular matrix in column major order. Letting 'd(i)' mean the first-order partial derivative with respect to parameter i, and d2(i,j) mean the second order derivative with respect to parameters i and j, then if there were three covariance parameters the order of the output would be: (sigma, d(1), d(2), d(3), d2(1,1), d2(1,2), d2(1,3), d2(2,2), d2(2,3), d2(3,3)).

Usage

Model$partial_sigma()

Returns

A list of matrices, see description for contents of the list.
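
Example (a hedged sketch, not from the package documentation), assuming `des` is the Model object from the sim_data() examples above, which has two covariance parameters:

ps <- des$partial_sigma()
# With two covariance parameters the documented ordering is:
# Sigma, d(1), d(2), d2(1,1), d2(1,2), d2(2,2), that is, 1 + 2 + 3 = 6 matrices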


Method u()

Returns the sample of random effects from the last model fit

Usage

Model$u(scaled = TRUE)

Arguments

scaled

Logical indicating whether to return samples on the N(0,I) scale (scaled=FALSE) or N(0,D) scale (scaled=TRUE)

Returns

A matrix of random effect samples
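
Example (a hedged sketch, not from the package documentation): after a model fit, for instance fit <- des$MCML(y = ysim) as in the MCML() examples above, the stored samples can be retrieved:

des$u()                # samples on the N(0,D) scale
des$u(scaled = FALSE)  # samples on the N(0,I) scale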


Method log_likelihood()

The log likelihood for the GLMM. The random effects can be left unspecified. If no random effects are provided, and there was a previous model fit with the same data y then the random effects will be taken from that model. If there was no previous model fit then the random effects are assumed to be all zero.

Usage

Model$log_likelihood(y, u)

Arguments

y

A vector of outcome data

u

An optional matrix of random effect samples. This can be a single column.

Returns

The log-likelihood of the model parameters
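
Example (a hedged sketch, not from the package documentation), assuming `des` is the Model object from the sim_data() examples above:

ysim <- des$sim_data()
des$log_likelihood(y = ysim)  # random effects taken from a previous fit, or assumed zero if none (see above)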


Method clone()

The objects of this class are cloneable with this method.

Usage

Model$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

Details

A generalised linear mixed model and a range of associated functions

A detailed vignette for this package is available online: doi:10.48550/arXiv.2303.12657. Briefly, for the generalised linear mixed model

$$Y \sim F(\mu,\sigma)$$ $$\mu = h^{-1}(X\beta + Zu)$$ $$u \sim MVN(0,D)$$

where h is the link function. The class provides access to all of the matrices above and associated calculations and functions including model fitting, power analysis, and various relevant decompositions. The object is an R6 class and so can serve as a parent class for extended functionality.

Many calculations use the covariance matrix of the observations, such as the information matrix, which is used in power calculations and other functions. For non-Gaussian models, the class uses the first-order approximation proposed by Breslow and Clayton (1993) based on the marginal quasilikelihood:

$$\Sigma = W^{-1} + ZDZ^T$$

where W is a diagonal matrix with the GLM iterated weights for each observation equal to, for individual i, \(\left( \frac{\partial h^{-1}(\eta_i)}{\partial \eta_i}\right)^2 Var(y_i|u)\) (see Table 2.1 in McCullagh and Nelder (1989)). The modification proposed by Zeger et al. (1988) to the linear predictor to improve the accuracy of approximations based on the marginal quasilikelihood is also available, see use_attenuation().

See glmmrBase for a detailed guide on model specification.

The class also includes model fitting with Markov Chain Monte Carlo Maximum Likelihood implementing the algorithms described by McCulloch (1997), and fast model fitting using Laplace approximation. Functions for returning related values such as the log gradient, log probability, and other matrices are also available.

Attenuation For calculations such as the information matrix, the first-order approximation to the covariance matrix proposed by Breslow and Clayton (1993), described above, is used. The approximation is based on the marginal quasilikelihood. Zeger, Liang, and Albert (1988) suggest that a better approximation to the marginal mean is achieved by "attenuating" the linear predictor. Setting use equal to TRUE uses this adjustment for calculations using the covariance matrix for non-linear models.

Calls the respective print methods of the linked covariance and mean function objects.

The matrices X and Z both have n rows, where n is the number of observations in the model/design.

Using update_parameters() is the preferred way of updating the parameters of the mean or covariance objects as opposed to direct assignment, e.g. self$covariance$parameters <- c(...). The function calls check functions to automatically update linked matrices with the new parameters. If using direct assignment, call self$check() afterwards.

MCMCML Fits generalised linear mixed models using one of three algorithms: Markov Chain Newton Raphson (MCNR), Markov Chain Expectation Maximisation (MCEM), or Maximum Simulated Likelihood (MSL). All the algorithms are described by McCulloch (1997). For each iteration of the algorithm the unobserved random effect terms (\(\gamma\)) are simulated using Markov Chain Monte Carlo (MCMC) methods (we use Hamiltonian Monte Carlo through Stan), and these values are then conditioned on in the subsequent steps to estimate the covariance parameters and the mean function parameters (\(\beta\)). For all the algorithms, the covariance parameter estimates are updated using an expectation maximisation step. For the mean function parameters you can use either a Newton Raphson step (MCNR) or an expectation maximisation step (MCEM). A simulated likelihood step can be added at the end of either MCNR or MCEM, which uses an importance sampling technique to refine the parameter estimates.

The accuracy of the algorithm depends on the user-specified tolerance. For smaller tolerance values, larger numbers of MCMC samples are likely needed to sufficiently reduce Monte Carlo error.

Options for the MCMC sampler are set by changing the values in self$mcmc_options.

To provide weights for the model fitting, store them in self$weights. To set the number of trials for binomial models, set self$trials.
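
For example, a brief hedged sketch (not from the package documentation) of setting these fields directly on a Model object `des`:

des$weights <- rep(1, des$n())   # observation weights
des$trials  <- rep(10, des$n())  # number of trials per observation (binomial models only)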

Laplace approximation Fits generalised linear mixed models using a Laplace approximation to the log likelihood. For non-Gaussian models the covariance matrix is approximated using the first-order approximation based on the marginal quasilikelihood proposed by Breslow and Clayton (1993). The marginal mean in this approximation can be further adjusted following the proposal of Zeger et al. (1988); use the member function use_attenuation() in this class, see Model. To provide weights for the model fitting, store them in self$weights. To set the number of trials for binomial models, set self$trials.

References

Breslow, N. E., Clayton, D. G. (1993). Approximate Inference in Generalized Linear Mixed Models. Journal of the American Statistical Association, 88(421), 9–25. doi:10.1080/01621459.1993.10594284

McCullagh P, Nelder JA (1989). Generalized linear models, 2nd Edition. Routledge.

McCulloch CE (1997). “Maximum Likelihood Algorithms for Generalized Linear Mixed Models.” Journal of the American Statistical Association, 92(437), 162–170. doi:10.2307/2291460

Zeger, S. L., Liang, K.-Y., Albert, P. S. (1988). Models for Longitudinal Data: A Generalized Estimating Equation Approach. Biometrics, 44(4), 1049. doi:10.2307/2531734

See Also

nelder, MeanFunction, Covariance

Model, Covariance, MeanFunction


Examples


## ------------------------------------------------
## Method `Model$new`
## ------------------------------------------------

# \dontshow{
setParallel(FALSE) # for the CRAN check
# }
#create a data frame describing a cross-sectional parallel cluster
#randomised trial
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
mod <- Model$new(
  formula = ~ factor(t) + int - 1 + (1|gr(cl)) + (1|gr(cl,t)),
  data = df,
  family = stats::gaussian()
)

#here we will specify a cohort study and provide parameter values
df <- nelder(~ind(20) * t(6))
df$int <- 0
df[df$t > 3, 'int'] <- 1

des <- Model$new(
  covariance = list(
    formula = ~ (1|gr(ind)),
    parameters = c(0.05)),
  mean = list(
    formula = ~ int,
    parameters = c(1,0.5)),
  data = df,
  family = stats::poisson())

# or as
des <- Model$new(
  formula = ~ int + (1|gr(ind)),
  covariance = list(parameters = c(0.05)),
  mean = list(parameters = c(1,0.5)),
  data = df,
  family = stats::poisson()
  )


#an example of a spatial grid with two time points
df <- nelder(~ (x(10)*y(10))*t(2))
spt_design <- Model$new(covariance = list( formula = ~(1|ar0(t)*fexp(x,y))),
                         mean = list(formula = ~ 1),
                         data = df,
                         family = stats::gaussian())

## ------------------------------------------------
## Method `Model$sim_data`
## ------------------------------------------------

df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
# \dontshow{
setParallel(FALSE) # for the CRAN check
# }
des <- Model$new(
  covariance = list(
    formula = ~ (1|gr(cl)*ar0(t)),
    parameters = c(0.05,0.8)),
  mean = list(
    formula = ~ factor(t) + int - 1,
    parameters = c(rep(0,5),0.6)),
  data = df,
  family = stats::binomial()
)
ysim <- des$sim_data()

## ------------------------------------------------
## Method `Model$check`
## ------------------------------------------------

# \dontshow{
setParallel(FALSE) # for the CRAN check
# }
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
des <- Model$new(
  covariance = list(
    formula = ~ (1|gr(cl)*ar0(t)),
    parameters = c(0.05,0.8)),
  mean = list(
    formula = ~ factor(t) + int - 1,
    parameters = c(rep(0,5),0.6)),
  data = df,
  family = stats::binomial()
)
des$check() #does nothing
des$covariance$parameters <- c(0.1,0.9)
des$check() #updates
des$mean$parameters <- c(rnorm(5),0.1)
des$check() #updates

## ------------------------------------------------
## Method `Model$update_parameters`
## ------------------------------------------------

# \dontshow{
setParallel(FALSE) # for the CRAN check
# }
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
des <- Model$new(
  covariance = list(
    formula = ~ (1|gr(cl)*ar0(t))),
  mean = list(
    formula = ~ factor(t) + int - 1),
  data = df,
  family = stats::binomial()
)
des$update_parameters(cov.pars = c(0.1,0.9))

## ------------------------------------------------
## Method `Model$power`
## ------------------------------------------------

# \dontshow{
setParallel(FALSE) # for the CRAN check
# }
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
des <- Model$new(
  covariance = list(
    formula = ~ (1|gr(cl)) + (1|gr(cl,t)),
    parameters = c(0.05,0.1)),
  mean = list(
    formula = ~ factor(t) + int - 1,
    parameters = c(rep(0,5),0.6)),
  data = df,
  family = stats::gaussian(),
  var_par = 1
)
des$power() #power of 0.90 for the int parameter

## ------------------------------------------------
## Method `Model$MCML`
## ------------------------------------------------

if (FALSE) {
#create example data with six clusters, five time periods, and five people per cluster-period
df <- nelder(~(cl(6)*t(5)) > ind(5))
# parallel trial design intervention indicator
df$int <- 0
df[df$cl > 3, 'int'] <- 1 
# specify parameter values in the call for the data simulation below
des <- Model$new(
  formula= ~ factor(t) + int - 1 +(1|gr(cl)*ar0(t)),
  covariance = list(parameters = c(0.05,0.7)),
  mean = list(parameters = c(rep(0,5),0.2)),
  data = df,
  family = gaussian(),
  var_par = 1
)
ysim <- des$sim_data() # simulate some data from the model
fit1 <- des$MCML(y = ysim,method="mcnr",usestan=FALSE) # don't use Stan
#fits the models using Stan
fit2 <- des$MCML(y = ysim, method="mcnr")
 #adds a simulated likelihood step after the MCEM algorithm
fit3 <- des$MCML(y = ysim, sim.lik.step = TRUE)

 # we could use LA to find better starting values
fit4 <- des$LA(y=ysim)
# the fit parameter values are stored in the internal model class object
fit5 <- des$MCML(y = ysim, method="mcnr") # it should converge much more quickly
}

## ------------------------------------------------
## Method `Model$LA`
## ------------------------------------------------

# \dontshow{
setParallel(FALSE) # for the CRAN check
# }
#create example data with six clusters, five time periods, and five people per cluster-period
df <- nelder(~(cl(6)*t(5)) > ind(5))
# parallel trial design intervention indicator
df$int <- 0
df[df$cl > 3, 'int'] <- 1 
# specify parameter values in the call for the data simulation below
des <- Model$new(
  formula = ~ factor(t) + int - 1 + (1|gr(cl)*ar0(t)),
  covariance = list( parameters = c(0.05,0.7)),
  mean = list(parameters = c(rep(0,5),-0.2)),
  data = df,
  family = stats::binomial()
)
ysim <- des$sim_data() # simulate some data from the model
fit1 <- des$LA(y = ysim)
