simsum: Analyses of simulation studies including Monte Carlo error

Description

simsum computes performance measures for simulation studies in which each simulated data set yields point estimates by one or more analysis methods. Bias, empirical standard error and precision relative to a reference method can be computed for each method. If, in addition, model-based standard errors are available then simsum can compute the average model-based standard error, the relative error in the model-based standard error, the coverage of nominal confidence intervals, and the power to reject a null hypothesis. Monte Carlo errors are available for all estimated quantities.
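As a rough illustration of what some of these measures are, the sketch below computes bias, empirical standard error and coverage by hand for a single method, together with their Monte Carlo standard errors. It uses base R only and the standard textbook formulas; the simulated vectors `est` and `se` are hypothetical, and this is not presented as the package's internal code.

```r
# Illustrative only: hand-computed performance measures for one method.
set.seed(1)
n_sim <- 1000
true  <- 0.5
est <- rnorm(n_sim, mean = true, sd = 0.1)  # hypothetical point estimates
se  <- rep(0.1, n_sim)                      # hypothetical model-based SEs

# Bias and its Monte Carlo standard error
bias      <- mean(est) - true
bias_mcse <- sd(est) / sqrt(n_sim)

# Empirical standard error and its Monte Carlo standard error
empse      <- sd(est)
empse_mcse <- empse / sqrt(2 * (n_sim - 1))

# Coverage of nominal 95% confidence intervals (normal quantiles assumed)
crit    <- qnorm(1 - (1 - 0.95) / 2)
covered <- (est - crit * se <= true) & (true <= est + crit * se)
cover      <- mean(covered)
cover_mcse <- sqrt(cover * (1 - cover) / n_sim)

round(c(bias = bias, empse = empse, cover = cover), 3)
```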

Usage

simsum(data, estvarname, true, se, methodvar = NULL, ref = NULL,
  df = NULL, dropbig = FALSE, max = 10, semax = 100, level = 0.95,
  by = NULL, mcse = TRUE, sanitise = TRUE, na.rm = TRUE,
  na.pair = TRUE, x = FALSE)

Arguments

data

A data.frame in which variable names are interpreted. It must be in tidy format, i.e. each variable forms a column and each observation forms a row.

estvarname

The name of the variable containing the point estimates.

true

The true value of the parameter. This is used in calculations of bias and coverage.

se

The name of the variable containing the standard errors of the point estimates.

methodvar

The name of the variable containing the methods to compare. For instance, methods could be the models compared within a simulation study. Can be NULL.

ref

Specifies the reference method against which relative precision will be calculated. Only useful if methodvar is specified.

df

If specified, a t distribution with df degrees of freedom is used when calculating coverage and power.
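To see what switching to a t distribution changes, the sketch below compares the two critical values for a 95% interval. It assumes that normal quantiles are used when df is not given; the value `df <- 10` is hypothetical.

```r
# Illustrative only: critical values behind coverage and power.
level <- 0.95
df    <- 10  # hypothetical degrees of freedom

crit_normal <- qnorm(1 - (1 - level) / 2)        # assumed default
crit_t      <- qt(1 - (1 - level) / 2, df = df)  # used when df is specified

c(normal = crit_normal, t = crit_t)
```

The t-based critical value is larger, so intervals are wider and coverage computed with df specified will be at least as high as with normal quantiles.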

dropbig

Specifies that point estimates or standard errors beyond the maximum acceptable values should be dropped.

max

Specifies the maximum acceptable absolute value of the point estimates, standardised to mean 0 and SD 1. Defaults to 10.

semax

Specifies the maximum acceptable value of the standard error, as a multiple of the mean standard error. Defaults to 100.
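The dropbig, max and semax rules described above can be sketched in base R as follows. This follows the wording of this page (standardise the estimates to mean 0 and SD 1, compare standard errors with a multiple of their mean); the data are hypothetical, and the package may apply the rule differently in detail, for example within each method or data-generating mechanism.

```r
# Illustrative only: flag implausibly large estimates and standard errors.
set.seed(1)
est <- c(rnorm(500, mean = 0.5, sd = 0.1), 50)  # last estimate is an outlier
se  <- rep(0.1, 501)
se[1] <- 100                                     # first standard error is an outlier

max_z <- 10    # plays the role of the `max` argument
semax <- 100

z       <- (est - mean(est)) / sd(est)           # standardised estimates
too_big <- abs(z) > max_z | se > semax * mean(se)

which(too_big)                                   # rows that would be dropped
kept <- data.frame(est, se)[!too_big, ]
```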

level

Specifies the confidence level for coverage and power. Defaults to 0.95.

by

A vector of variable names identifying the factors by which performance measures are computed. Factors listed here are the (potentially several) data-generating mechanisms used to simulate data under different scenarios (e.g. sample size, true distribution of a variable, etc.). Can be NULL.

mcse

Reports Monte Carlo standard errors for all performance measures. Defaults to TRUE.

sanitise

Sanitise column names passed to simsum by removing all dot characters (.), which could cause problems. Defaults to TRUE.

na.rm

A logical value indicating whether missing values (NA) should be removed before the computation proceeds. Defaults to TRUE.

na.pair

Removes estimates that have a missing standard error (and vice versa). Defaults to TRUE.
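A minimal base R sketch of the pairing rule, on hypothetical vectors: whenever either the estimate or its standard error is missing, both are treated as missing.

```r
# Illustrative only: pair up missing estimates and standard errors.
est <- c(0.51, NA, 0.47, 0.55)
se  <- c(0.10, 0.12, NA, 0.09)

incomplete <- is.na(est) | is.na(se)
est[incomplete] <- NA
se[incomplete]  <- NA

data.frame(est, se)
```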

x

Set to TRUE to include the data argument (as used to compute the summary statistics, i.e. after applying dropbig, na.rm, na.pair) as a slot. Defaults to FALSE.

Value

An object of class simsum.

Details

The following names are not allowed for estvarname, se, methodvar, by: stat, est, mcse, lower, upper.

Calling the function with x = TRUE is required to produce zip plots (e.g. via the zip() method). The downside is that the size of the returned object increases considerably, which is why x is set to FALSE by default.

Note that the data slot returned when x = TRUE reflects the values of the arguments dropbig, na.rm, and na.pair; all rows with missing values are removed via a call to stats::na.omit().
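The restriction on names for estvarname, se, methodvar and by can be checked before calling simsum. The helper below, `check_names()`, is hypothetical and not part of the package; it only shows one way to catch a clash with the reserved names early.

```r
# Illustrative only: `check_names()` is a hypothetical helper, not a package function.
reserved <- c("stat", "est", "mcse", "lower", "upper")

check_names <- function(...) {
  used <- c(...)
  bad  <- intersect(used, reserved)
  if (length(bad) > 0) {
    stop("Reserved column name(s): ", paste(bad, collapse = ", "))
  }
  invisible(TRUE)
}

# Passes silently: none of these names is reserved
check_names(estvarname = "b", se = "se", methodvar = "method")
```

Renaming an offending column (for instance from est to b) before the call avoids the clash.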

References

White, I.R. 2010. simsum: Analyses of simulation studies including Monte Carlo error. The Stata Journal 10(3): 369-385. http://www.stata-journal.com/article.html?article=st0200

Morris, T.P., White, I.R. and Crowther, M.J. 2017. Using simulation studies to evaluate statistical methods. arXiv:1712.03198

Examples

# NOT RUN {
data("MIsim")
s <- simsum(data = MIsim, estvarname = "b", true = 0.5, se = "se", methodvar = "method", ref = "CC")
# If `ref` is not specified, the reference method is inferred
s <- simsum(data = MIsim, estvarname = "b", true = 0.5, se = "se", methodvar = "method")
# }