Method use_attenuation()
Sets the model to use or not use "attenuation" when calculating the first-order approximation to
the covariance matrix.
Usage
Model$use_attenuation(use)
Arguments
use
Logical indicating whether to use "attenuation".
Returns
None. Used for effects.
Return fitted values. Does not account for the random effects. For simulated values based
on resampling random effects, see sim_data()
. To predict the values at a new location see
predict()
.
Usage
Model$fitted(type = "link", X, u)
Arguments
type
One of either "link
" for values on the scale of the link function, or "response
"
for values on the scale of the response
X
(Optional) Fixed effects matrix to generate fitted values
u
(Optional) Random effects values at which to generate fitted values
Returns
A Matrix class object containing the predicted values
Generate predictions at new values
Generates predicted values using a new data set to specify covariance
values and values for the variables that define the covariance function.
The function will return a list with the linear predictor, conditional
distribution of the new random effects term conditional on the current estimates
of the random effects, and some simulated values of the random effects if requested.
Usage
Model$predict(newdata, offset = rep(0, nrow(newdata)), m = 0)
Arguments
newdata
A data frame specifying the new data at which to generate predictions
offset
Optional vector of offset values for the new data
m
Number of samples of the random effects to draw
Returns
A list with the linear predictor, parameters (mean and covariance matrices) for
the conditional distribution of the random effects, and any random effect samples.
Method new()
Create a new Model object
Usage
Model$new(
formula,
covariance,
mean,
data = NULL,
family = NULL,
var_par = NULL,
offset = NULL,
weights = NULL,
trials = NULL,
verbose = TRUE
)
Arguments
formula
An optional model formula containing fixed and random effect terms. If not specified, then
seaparate formulae need to be provided to the covariance and mean arguments below.
covariance
Either a Covariance object, or an equivalent list of arguments
that can be passed to Covariance
to create a new object. At a minimum the list must specify a formula.
If parameters are not included then they are initialised to 0.5.
mean
Either a MeanFunction object, or an equivalent list of arguments
that can be passed to MeanFunction
to create a new object. At a minimum the list must specify a formula.
If parameters are not included then they are initialised to 0.
data
A data frame with the data required for the mean function and covariance objects. This argument
can be ignored if data are provided to the covariance or mean arguments either via Covariance
and MeanFunction
object, or as a member of the list of arguments to both covariance
and mean
.
family
A family object expressing the distribution and link function of the model, see family. This
argument is optional if the family is provided either via a MeanFunction
or MeanFunction
objects, or as members of the list of arguments to mean
. Current accepts binomial,
gaussian, Gamma, poisson, and Beta.
var_par
Scale parameter required for some distributions, including Gaussian. Default is NULL.
offset
A vector of offset values. Optional - could be provided to the argument to mean instead.
weights
A vector of weights. Optional..
trials
For binomial family models, the number of trials for each observation. If it is not set, then it will
default to 1 (a bernoulli model).
verbose
Logical indicating whether to provide detailed output
Returns
A new Model class object
Examples
\dontshow{
setParallel(FALSE) # for the CRAN check
}
#create a data frame describing a cross-sectional parallel cluster
#randomised trial
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
mod <- Model$new(
formula = ~ factor(t) + int - 1 + (1|gr(cl)) + (1|gr(cl,t)),
data = df,
family = stats::gaussian()
)#here we will specify a cohort study and provide parameter values
df <- nelder(~ind(20) * t(6))
df$int <- 0
df[df$t > 3, 'int'] <- 1
des <- Model$new(
covariance = list(
formula = ~ (1|gr(ind)),
parameters = c(0.05)),
mean = list(
formula = ~ int,
parameters = c(1,0.5)),
data = df,
family = stats::poisson())
# or as
des <- Model$new(
formula = ~ int + (1|gr(ind)),
covariance = list(parameters = c(0.05)),
mean = list(parameters = c(1,0.5)),
data = df,
family = stats::poisson()
)
#an example of a spatial grid with two time points
df <- nelder(~ (x(10)*y(10))*t(2))
spt_design <- Model$new(covariance = list( formula = ~(1|ar0(t)*fexp(x,y))),
mean = list(formula = ~ 1),
data = df,
family = stats::gaussian())
Print method for Model
class
Method n()
Returns the number of observations in the model
Method subset_rows()
Subsets the design keeping specified observations only
Given a vector of row indices, the corresponding rows will be kept and the
other rows will be removed from the mean function and covariance
Usage
Model$subset_rows(index)
Arguments
index
Integer or vector integers listing the rows to keep
Returns
The function updates the object and nothing is returned
Method subset_cols()
Subsets the columns of the design
Removes the specified columns from the linked mean function object's X matrix.
Usage
Model$subset_cols(index)
Arguments
index
Integer or vector of integers specifying the indexes of the columns to keep
Returns
The function updates the object and nothing is returned
Method sim_data()
Generates a realisation of the design
Generates a single vector of outcome data based upon the
specified GLMM design.
Usage
Model$sim_data(type = "y")
Arguments
type
Either 'y' to return just the outcome data, 'data'
to return a data frame with the simulated outcome data alongside the model data,
or 'all', which will return a list with simulated outcomes y, matrices X and Z,
parameters beta, and the values of the simulated random effects.
Returns
Either a vector, a data frame, or a list
Examples
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
\dontshow{
setParallel(FALSE) # for the CRAN check
}
des <- Model$new(
covariance = list(
formula = ~ (1|gr(cl)*ar0(t)),
parameters = c(0.05,0.8)),
mean = list(
formula = ~ factor(t) + int - 1,
parameters = c(rep(0,5),0.6)),
data = df,
family = stats::binomial()
)
ysim <- des$sim_data()
Method check()
Checks for any changes in linked objects and updates.
Checks for any changes in any object and updates all linked objects if
any are detected. Generally called automatically and not required by the user.
Usage
Model$check(verbose = TRUE)
Arguments
verbose
Logical indicating whether to report if any updates are made, defaults to TRUE
Returns
Linked objects are updated by nothing is returned
Examples
\dontshow{
setParallel(FALSE) # for the CRAN check
}
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
des <- Model$new(
covariance = list(
formula = ~ (1|gr(cl)*ar0(t)),
parameters = c(0.05,0.8)),
mean = list(
formula = ~ factor(t) + int - 1,
parameters = c(rep(0,5),0.6)),
data = df,
family = stats::binomial()
)
des$check() #does nothing
des$covariance$parameters <- c(0.1,0.9)
des$check() #updates
des$mean$parameters <- c(rnorm(5),0.1)
des$check() #updates
Method update_parameters()
Updates the parameters of the mean function and/or the covariance function
Usage
Model$update_parameters(mean.pars = NULL, cov.pars = NULL, var.par = NULL)
Arguments
mean.pars
(Optional) Vector of new mean function parameters
cov.pars
(Optional) Vector of new covariance function(s) parameters
var.par
(Optional) A scalar value for var_par
verbose
Logical indicating whether to provide more detailed feedback
Examples
\dontshow{
setParallel(FALSE) # for the CRAN check
}
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
des <- Model$new(
covariance = list(
formula = ~ (1|gr(cl)*ar0(t))),
mean = list(
formula = ~ factor(t) + int - 1),
data = df,
family = stats::binomial()
)
des$update_parameters(cov.pars = c(0.1,0.9))
Arguments
include.re
logical indicating whether to return the information matrix including the random effects components (TRUE),
or the GLS information matrix for beta only.
Method sandwich()
Returns the robust sandwich variance-covariance matrix for the fixed effect parameters
Method kenward_roger()
Returns the bias-corrected variance-covariance matrix for the fixed effect parameters.
Usage
Model$kenward_roger()
Estimates the power of the design described by the model using the square root
of the relevant element of the GLS variance matrix:
$$(X^T\Sigma^{-1}X)^{-1}$$
Note that this is equivalent to using the "design effect" for many
models.
Usage
Model$power(alpha = 0.05, two.sided = TRUE, alternative = "pos")
Arguments
alpha
Numeric between zero and one indicating the type I error rate.
Default of 0.05.
two.sided
Logical indicating whether to use a two sided test
alternative
For a one-sided test whether the alternative hypothesis is that the
parameter is positive "pos" or negative "neg"
Returns
A data frame describing the parameters, their values, expected standard
errors and estimated power.
Examples
\dontshow{
setParallel(FALSE) # for the CRAN check
}
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
des <- Model$new(
covariance = list(
formula = ~ (1|gr(cl)) + (1|gr(cl,t)),
parameters = c(0.05,0.1)),
mean = list(
formula = ~ factor(t) + int - 1,
parameters = c(rep(0,5),0.6)),
data = df,
family = stats::gaussian(),
var_par = 1
)
des$power() #power of 0.90 for the int parameter
Method w_matrix()
Returns the diagonal of the matrix W used to calculate the covariance matrix approximation
Returns
A vector with values of the glm iterated weights
Method dh_deta()
Returns the derivative of the link function with respect to the linear preditor
Method Sigma()
Returns the (approximate) covariance matrix of y
Returns the covariance matrix Sigma. For non-linear models this is an approximation. See Details.
Usage
Model$Sigma(inverse = FALSE)
Arguments
inverse
Logical indicating whether to provide the covariance matrix or its inverse
Method MCML()
Markov Chain Monte Carlo Maximum Likelihood model fitting
Usage
Model$MCML(
y,
method = "mcnr",
sim.lik.step = FALSE,
verbose = TRUE,
tol = 0.01,
max.iter = 30,
se = "gls",
sparse = FALSE,
usestan = TRUE,
se.theta = TRUE
)
Arguments
y
A numeric vector of outcome data
method
The MCML algorithm to use, either mcem
or mcnr
, see Details. Default is mcem
.
sim.lik.step
Logical. Either TRUE (conduct a simulated likelihood step at the end of the algorithm), or FALSE (does
not do this step), defaults to FALSE.
verbose
Logical indicating whether to provide detailed output, defaults to TRUE.
tol
Numeric value, tolerance of the MCML algorithm, the maximum difference in parameter estimates
between iterations at which to stop the algorithm.
max.iter
Integer. The maximum number of iterations of the MCML algorithm.
se
String. Type of standard error to return. Options are "gls" for GLS standard errors (the default),
"robust" for Huber robust sandwich estimator, "kr" for Kenward-Roger bias corrected standard errors, "bw" to use
GLS standard errors with a between-within correction to the degrees of freedom, "bwrobust" to use robust
standard errors with between-within correction to the degrees of freedom.
sparse
Logical indicating whether to use sparse matrix methods
usestan
Logical whether to use Stan (through the package cmdstanr
) for the MCMC sampling. If FALSE then
the internal Hamiltonian Monte Carlo sampler will be used instead. We recommend Stan over the internal sampler as
it generally produces a larger number of effective samplers per unit time, especially for more complex
covariance functions.
se.theta
Logical. Whether to calculate the standard errors for the covariance parameters. This step is a slow part
of the calculation, so can be disabled if required in larger models. Has no effect for Kenward-Roger standard errors.
Examples
\dontrun{
#create example data with six clusters, five time periods, and five people per cluster-period
df <- nelder(~(cl(6)*t(5)) > ind(5))
# parallel trial design intervention indicator
df$int <- 0
df[df$cl > 3, 'int'] <- 1
# specify parameter values in the call for the data simulation below
des <- Model$new(
formula= ~ factor(t) + int - 1 +(1|gr(cl)*ar0(t)),
covariance = list(parameters = c(0.05,0.7)),
mean = list(parameters = c(rep(0,5),0.2)),
data = df,
family = gaussian(),
var_par = 1
)
ysim <- des$sim_data() # simulate some data from the model
fit1 <- des$MCML(y = ysim,method="mcnr",usestan=FALSE) # don't use Stan
#fits the models using Stan
fit2 <- des$MCML(y = ysim, method="mcnr")
#adds a simulated likelihood step after the MCEM algorithm
fit3 <- des$MCML(y = ysim, sim.lik.step = TRUE) # we could use LA to find better starting values
fit4 <- des$LA(y=ysim)
# the fit parameter values are stored in the internal model class object
fit5 <- des$MCML(y = ysim, method="mcnr") # it should converge much more quickly
}
Method LA()
Maximum Likelihood model fitting with Laplace Approximation
Usage
Model$LA(
y,
start,
method = "nr",
verbose = FALSE,
se = "gls",
max.iter = 40,
tol = 1e-04,
se.theta = TRUE
)
Arguments
y
A numeric vector of outcome data
start
Optional. A numeric vector indicating starting values for the model parameters.
method
String. Either "nloptim" for non-linear optimisation, or "nr" for Newton-Raphson (default) algorithm
verbose
logical indicating whether to provide detailed algorithm feedback (default is TRUE).
se
String. Type of standard error to return. Options are "gls" for GLS standard errors (the default),
"robust" for Huber robust sandwich estimator, "kr" for Kenward-Roger bias corrected standard errors, "bw" to use
GLS standard errors with a between-within correction to the degrees of freedom, "bwrobust" to use robust
standard errors with between-within correction to the degrees of freedom.
max.iter
Maximum number of algorithm iterations, default 20.
tol
Maximum difference between successive iterations at which to terminate the algorithm
se.theta
Logical. Whether to calculate the standard errors for the covariance parameters. This step is a slow part
of the calculation, so can be disabled if required in larger models. Has no effect for Kenward-Roger standard errors.
Examples
\dontshow{
setParallel(FALSE) # for the CRAN check
}
#create example data with six clusters, five time periods, and five people per cluster-period
df <- nelder(~(cl(6)*t(5)) > ind(5))
# parallel trial design intervention indicator
df$int <- 0
df[df$cl > 3, 'int'] <- 1
# specify parameter values in the call for the data simulation below
des <- Model$new(
formula = ~ factor(t) + int - 1 + (1|gr(cl)*ar0(t)),
covariance = list( parameters = c(0.05,0.7)),
mean = list(parameters = c(rep(0,5),-0.2)),
data = df,
family = stats::binomial()
)
ysim <- des$sim_data() # simulate some data from the model
fit1 <- des$LA(y = ysim)
Method sparse()
Set whether to use sparse matrix methods for model calculations and fitting.
By default the model does not use sparse matrix methods.
Usage
Model$sparse(sparse = TRUE)
Arguments
sparse
Logical indicating whether to use sparse matrix methods
Returns
None, called for effects
Method mcmc_sample()
Generate an MCMC sample of the random effects
Usage
Model$mcmc_sample(y, usestan = TRUE, verbose = TRUE)
Arguments
y
Numeric vector of outcome data
usestan
Logical whether to use Stan (through the package cmdstanr
) for the MCMC sampling. If FALSE then
the internal Hamiltonian Monte Carlo sampler will be used instead. We recommend Stan over the internal sampler as
it generally produces a larger number of effective samplers per unit time, especially for more complex
covariance functions.
verbose
Logical indicating whether to provide detailed output to the console
Returns
A matrix of samples of the random effects
Method gradient()
The gradient of the log-likelihood with respect to either the random effects or
the model parameters. The random effects are on the N(0,I) scale, i.e. scaled by the
Cholesky decomposition of the matrix D. To obtain the random effects from the last
model fit, see member function $u
Usage
Model$gradient(y, u, beta = FALSE)
Arguments
y
Vector of outcome data
u
Vector of random effects scaled by the Cholesky decomposition of D
beta
Logical. Whether the log gradient for the random effects (FALSE) or for the linear predictor parameters (TRUE)
Returns
A vector of the gradient
Method partial_sigma()
The partial derivatives of the covariance matrix Sigma with respect to the covariance
parameters. The function returns a list in order: Sigma, first order derivatives, second
order derivatives. The second order derivatives are ordered as the lower-triangular matrix
in column major order. Letting 'd(i)' mean the first-order partial derivative with respect
to parameter i, and d2(i,j) mean the second order derivative with respect to parameters i
and j, then if there were three covariance parameters the order of the output would be:
(sigma, d(1), d(2), d(3), d2(1,1), d2(1,2), d2(1,3), d2(2,2), d2(2,3), d2(3,3)).
Usage
Model$partial_sigma()
Returns
A list of matrices, see description for contents of the list.
Method u()
Returns the sample of random effects from the last model fit
Usage
Model$u(scaled = TRUE)
Arguments
scaled
Logical indicating whether to return samples on the N(0,I) scale (scaled=FALSE
) or
N(0,D) scale (scaled=TRUE
)
Returns
A matrix of random effect samples
Method log_likelihood()
The log likelihood for the GLMM. The random effects can be left
unspecified. If no random effects are provided, and there was a previous model fit with the same data y
then the random effects will be taken from that model. If there was no
previous model fit then the random effects are assumed to be all zero.
Usage
Model$log_likelihood(y, u)
Arguments
y
A vector of outcome data
u
An optional matrix of random effect samples. This can be a single column.
Returns
The log-likelihood of the model parameters
Method clone()
The objects of this class are cloneable with this method.
Usage
Model$clone(deep = FALSE)
Arguments
deep
Whether to make a deep clone.