write.MCMC is an internal function which creates an MLwiN macro file to fit models using MCMC.
write.MCMC(
indata,
dtafile,
oldsyntax = FALSE,
resp,
levID,
expl,
rp,
D = "Normal",
nonlinear = c(0, 1),
categ = NULL,
notation = NULL,
nonfp = NULL,
clre = NULL,
Meth = 1,
merr = NULL,
carcentre = FALSE,
maxiter = 20,
convtol = 2,
seed = 1,
iterations = 5000,
burnin = 500,
scale = 5.8,
thinning = 1,
priorParam = "default",
refresh = 50,
fixM = 1,
residM = 1,
Lev1VarM = 1,
OtherVarM = 1,
adaption = 1,
priorcode = c(gamma = 1),
rate = 50,
tol = 10,
lclo = 0,
mcmcOptions,
fact = NULL,
xc = NULL,
mm = NULL,
car = NULL,
BUGO = NULL,
mem.init = "default",
optimat = FALSE,
modelfile,
initfile,
datafile,
macrofile,
IGLSfile,
MCMCfile,
chainfile,
MIfile,
resifile,
resi.store = FALSE,
resioptions,
resichains,
FACTchainfile,
resi.store.levs = NULL,
debugmode = FALSE,
startval = NULL,
dami = NULL,
namemap = sapply(colnames(indata), as.character),
saveworksheet = NULL,
rng.version = 10
)
Outputs a modified version of namemap containing newly generated short names.
A data.frame object containing the data to be modelled.
The file name of the dataset to be imported into MLwiN, which is in Stata format (i.e. with extension .dta).
Specified as FALSE
if new syntax has been used in
Formula
object, else specified as TRUE
(to enable
backcompatibility).
A character string (vector) of response variable(s).
A character string (vector) of the specified level ID(s). The
ID(s) should be sorted in the descending order of levels (e.g.
levID = c('level2', 'level1')
where 'level2'
is the higher
level).
A character string (vector) of explanatory (predictor) variable(s).
A character string (vector) of random part of random variable(s).
A character string/vector specifying the type of distribution to be modelled, which
can include 'Normal'
(the default), 'Binomial'
, 'Poisson'
,
'Unordered Multinomial'
, 'Ordered Multinomial'
,
'Multivariate Normal'
, or 'Mixed'
. In the case of the latter,
'Mixed'
precedes the response types which also need to be listed in
D
, e.g. c('Mixed', 'Normal', 'Binomial')
; these need to be
be listed in the same order to which they are referred to in the
Formula
object (see runMLwiN
, Formula.translate
,
Formula.translate.compat
. 'Mixed'
combinations can only consist of
'Normal'
and 'Binomial'
for MCMC estimation.
A character vector specifying linearisation method for IGLS
starting values for discrete
response models (see Chapter 9 of Rasbash et al 2012, and Goldstein 2011).
N = 0
specifies marginal quasi-likelihood
linearization (MQL), whilst N = 1
specifies penalised quasi-
likelihood linearization (PQL); M = 1
specifies first order
approximation, whilst M = 2
specifies second order approximation.
nonlinear = c(N = 0, M = 1)
by default. First order marginal
quasi-likelihood (MQL1) only option for single-level discrete response
models.
Specifies categorical variable(s) as a matrix. Each column
corresponds to a categorical variable; the first row specifies the name(s)
of variable(s); the second row specifies the name(s) of reference group(s),
NA
(s) if no reference group; the third row states the number of
categories for each variable.
Specifies the model subscript notation to be used in the
MLwiN equations window. 'class'
means no multiple subscripts, whereas
'level'
has multiple subscripts.
Removes the fixed part of random variable(s). NA
if no
variable is removed.
A matrix used to estimate some, but not all, of the variances and covariances for a set of coefficients at a particular level. Remove from the random part at level <first row> the covariance matrix element(s) defined by the pair(s) of rows <second row> <third row>. Each row corresponds to a removed entry of the covariance matrix.
Specifies the maximum likelihood estimation method to be used
when generating starting values via (R)IGLS.
If Meth = 0
estimation method is set to RIGLS. If Meth = 1
estimation method is set to IGLS (the default setting). If Meth
is
absent, alternate between IGLS and RIGLS.
A vector which sets-up measurement errors on predictor
variables. The first element N
defines the number of variables that
have measurement errors. Then, for each variable with measurement error, a
pair of inputs is required: the first of which is the explanatory variable
name as a character string, and the second of which is the variance of
the measurement error for this variable. See demo(MCMCGuide14)
for an
example.
If CAR model (i.e. if car
is non-NULL
),
carcentre = TRUE
mean-centres all random effects at that level.
When generating starting values via (R)IGLS, a numeric
value specifying the total number of iterations, from
the start, before IGLS estimation halts (if startval = NULL
).
When generating starting values via (R)IGLS, a numeric
value specifying the IGLS convergence criterion, as
specified in the tol
option within estoptions
, where
startval = NULL
) (see runMLwiN
). If value of
convtol
is m, estimation will be deemed to have converged when the
relative change in the estimate for any parameter from one iteration to the
next is less than 10(-m). Defaults to value of 2
for m if not
otherwise specified.
An integer specifying the random seed in MLwiN.
An integer specifying the number of iterations after burn-in.
An integer specifying length of the burn-in.
An integer specifying the scale factor of proposed variances; this number will be multiplied by the estimated parameter variance (from IGLS/RIGLS) to give the proposal distribution variance.
An integer specifying the frequency with which successive
values in the Markov chain are stored. By default thinning = 1
.
A vector specifying the informative priors used, as output
from prior2macro
.
An integer specifying how frequently the parameter estimates
are refreshed on the screen during iterations; only applies if
debugmode = TRUE
in estoptions
:
see runMLwiN
.
Specifies the fixed effect method: 1
for Gibbs Sampling,
2
for univariate MH Sampling and 3
for multivariate MH
Sampling.
Specifies the residual method: 1
for Gibbs Sampling,
2
for univariate MH Sampling and 3
for multivariate MH
Sampling.
Specifies the level 1 variance method: 1
for Gibbs
Sampling, 2
for univariate MH Sampling and 3
for multivariate
MH Sampling.
Specifies the variance method for other levels: 1
for Gibbs Sampling, 2
for univariate MH Sampling and 3
for
multivariate MH Sampling.
adaption = 1
indicates adaptation is to be used;
0
otherwise.
A vector indicating which default priors are to be used
for the variance parameters. It defaults to c(gamma = 1)
in which case
Gamma priors are used with MLwiN's defaults of Gamma a value (shape) = 0.001
and Gamma b value (scale) = 0.001, although alternative values for shape and
scale can be specified in subsequent elements of the vector,
e.g. c(gamma = 1, shape = 0.5, scale = 0.2)
). Alternatively
c(uniform = 1)
specifies Uniform priors on the variance scale. To allow
for back-compatibility with deprecated syntax used in versions of
R2MLwiN prior to 0.8-2, if priorcode
is instead specified as
an integer, then 1
indicates that Gamma priors are used, whereas
0
indicates that Uniform priors are used. See the section on 'Priors' in the
MLwiN help system for more details on the meaning of these priors.
An integer specifying the acceptance rate (as a percentage);
this command is ignored if adaption = 0
.
An integer specifying tolerance (as a percentage) for the acceptance rate.
This command toggles on/off the possible forms of complex level
1 variation when using MCMC. lclo = 0
expresses the level
1 variation as a function of the predictors, whereas lclo = 1
expresses the
log of the level 1 precision (1/variance) as a function of the predictors.
A list of other MCMC options used. See `Details' below.
A list of objects specified for factor analysis. See `Details' below.
Indicates whether model is cross-classified (TRUE
) or
nested (FALSE
). xc = NULL
by default (corresponding to
FALSE
), unless either mm
or car
are not null, in
which case xc = TRUE
.
Specifies the structure of a multiple membership model.
Can be a list of variable names, a list of vectors, or a matrix (e.g. see
df2matrix
). In the case of the former, each
element of the list corresponds to a level (classification) of the model,
in descending order. If a level is not a multiple membership classification,
then NA
is specified. Otherwise, lists need to be assigned to
mmvar
and weights
, with the former containing columns
specifying the classification units, and the latter containing columns
specifying the weights. Ignored if EstM = 0
, i.e. only applicable to models estimated via
MCMC. mm = NULL
by default. Supersedes deprecated xclass
.
E.g. (from demo(MCMCGuide16)
) for
logearn ~ 1 + age_40 + sex + parttime + (company|1) + (id|1)
, if
company
is a multiple membership classification with the variables
indicating the classifications in company
, company2
,
company3
, company4
and their weights in weight1
, weight2
,
weight3
and weight4
then
mm = list(list(mmvar = list('company', 'company2', 'company3', 'company4'),
weights = list('weight1', 'weight2', 'weight3', 'weight4')), NA)
with the NA
, listed last, corresponding to the level 1 identifier (id
).
A list specifying structure of a conditional autoregressive (CAR)
model. Each element of the list corresponds to a level (classification) of
the model, in descending order. If a level is not a spatial classification,
then NA
is specified. Otherwise, lists need to be assigned to
carvar
and weights
, with the former containing columns
specifying the spatial classification units, and the latter containing
columns specifying the weights. See demo(MCMCGuide17)
for examples.
car = NULL
by default.
If non-NULL
uses BUGS for MCMC estimation using files
specified in modelfile
, initfile
and datafile
.
A vector which sets and displays worksheet capacities for
the current MLwiN session according to the value(s) specified. By default,
the number of levels is nlev
+1; worksheet size in thousands of cells
is 6000; the number of columns is 2500; the number of explanatory variables
is num_vars
+10; the number of group labels is 20. nlev
is the
number of levels specified by levID
, and num_vars
is
approximately the number of explanatory variables calculated initially.
This option instructs MLwiN to limit the maximum matrix size
that can be allocated by the (R)IGLS algorithm. Specify optimat = TRUE
if MLwiN gives the following error message 'Overflow allocating smatrix'.
This error message arises if one more higher-level units is extremely large
(contains more than 800 lower-level units). In this situation runmlwin's
default behaviour is to instruct MLwiN to allocate a larger matrix size to
the (R)IGLS algorithm than is currently possible. Specifying
optimat = TRUE
caps the maximum matrix size at 800 lower-level units,
circumventing the MLwiN error message, and allowing most MLwiN
functionality.
A file name where the BUGS model will be saved in .txt format.
A file name where the BUGS initial values will be saved in .txt format.
A file name where the BUGS data will be saved in .txt format.
A file name where the MLwiN macro file will be saved.
A file name where the IGLS estimates will be saved.
A file name where the MCMC estimates will be saved.
A file name where the MCMC chains will be saved.
A file name where the missing values will be saved.
A file name where the residual estimates will be saved.
A logical value to indicate if residuals are to be stored
(TRUE
) or not (FALSE
).
A string vector to specify the various residual options.
The 'variance'
option calculates the posterior variances instead of
the posterior standard errors; the 'standardised'
option calculates standardised
residuals.
A file name where the residual chains will be saved.
A file name where the factor chains will be saved.
An integer vector indicating the levels at which the residual chains are to be stored.
A logical value determining whether MLwiN is run in the
background or not. The default value is FALSE
: i.e. MLwiN is run in
the background. If TRUE
the MLwiN GUI is opened, and then pauses after the model
has been set-up, allowing user to check starting values; pressing 'Resume macro'
will then fit the model. Once fit, pressing 'Resume macro' once more will save
the outputs to the workdir
ready to be read by R2MLwiN. Users can
instead opt to 'Abort macro' in which case the outputs are not saved to the
workdir
. This option currently
works for 32 bit version of MLwiN only (automatically switches unless
MLwiNPath
or options(MLwiNPath)
has been set directly to the executable).
A list of numeric vectors specifying the starting values
when using MCMC. FP.b
corresponds to the estimates for the fixed
part; FP.v
specifies the variance/covariance estimates for the fixed
part; RP.b
specifies the variance estimates for the random part;
RP.v
corresponds to the variance/covariance matrix of the variance
estimates for the random part.
This command outputs a complete (i.e. including non-missing
responses) response variable y. If dami = c(0, <iter1>, <iter2>,...)
then
the response variables returned will be the value of y at the iterations
quoted (as integers <iter1>, <iter2>
, etc.); these can be used for
multiple imputation. If dami = 1
the value of y will be the mean
estimate from the iterations produced. dami = 2
is as for dami = 1
but with the standard errors of the estimate additionally being stored.
A mapping of column names to DTA friendly shorter names
A list of file names (one for each chain) used to store the MLwiN worksheet after the model has been estimated.
An integer indicating which random number generator should be used by MLwiN.
Zhang, Z., Charlton, C.M.J., Parker, R.M.A., Leckie, G., and Browne, W.J. (2016) Centre for Multilevel Modelling, University of Bristol.
A list of other MCMC options as used in the argument
mcmcOptions
:
orth
: If orth = 1
, orthogonal fixed effect
vectors are used; zero otherwise.
hcen
: An integer specifying the
level where we use hierarchical centering.
smcm
: If smcm = 1
,
structured MCMC is used; zero otherwise.
smvn
: If smvn = 1
, the
structured MVN framework is used; zero otherwise.
paex
: A matrix of Nx2; in each row, if the second digit is 1
, parameter expansion
is used at level <the first digit>.
mcco
: This
command allows the user to have constrained settings for the lowest level
variance matrix in a multivariate Normal model. If value is 0
,
it estimates distinct variances for each residual error and distinct covariances
for each residual error pair. Four other
settings are currently available:
1 | fits stuctured errors with a common correlation paramater and a common variance parameter; |
2 | fits AR1 errors with a common variance parameter; |
3 | fits structured errors with a common correlation parameter and independent variance parameters; |
4 | fits AR1 errors with independent variance parameters. |
A list of objects specified for cross-classified and/or multiple membership
models, as used in the argument xclass
:
class
: An integer
(vector) of the specified class(es).
N1
: This defines a multiple
membership across N1
units at level class
. N1
>1 if
there is multiple membership.
weight
: If there is multiple
membership then the column number weight
, which is the length of the
dataset, will contain the first set of weights for the multiple membership.
Note that there should be N1
weight columns and they should be
sequential in the worksheet starting from weight
.
id
: If the
response is multivariate then the column number id
must be input and
this contains the first set of identifiers for the classification. Note that
for a p-variate model each lowest level unit contains p records and the
identifiers (sequence numbers) for each response variate need to be
extracted into id
and following columns. There should be N1
of
these identifier columns and they should be sequential starting from
id
in the multivariate case.
car
: car = TRUE
indicates
the spatial CAR model; FALSE
otherwise. car = FALSE
if ignored.
A list of objects specified for factor analysis, as used in the argument
fact
:
nfact
: Specifies the number of factors
lev.fact
: Specifies the level/classification for the random part of
the factor for each factor.
nfactcor
: Specifies the number of
correlated factors
factcor
: a vector specifying the correlated
factors: the first element corresponds to the first factor number, the
second to the second factor number, the third element corresponds to the
starting value for the covariance and the fourth element to whether this
covariance is constrained
(1
) or not (0
). If more than one pair of factors is correlated,
then repeat this sequence for each pair.
loading
: A matrix specifying the
starting values for the factor loadings and the starting value of the factor
variance. Each row corresponds to a factor.
constr
: A matrix
specifying indicators of whether the factor loadings and the factor variance
are constrained (1
) or not (0
).
Goldstein, H. (2011) Multilevel Statistical Models. 4th Edition. London: John Wiley and Sons.
Rasbash, J., Steele, F., Browne, W.J. and Goldstein, H. (2012) A User's Guide to MLwiN Version 2.26. Centre for Multilevel Modelling, University of Bristol.
write.IGLS