MIWilson

The goal of MIWilson is to implement the Wilson confidence interval for binomial proportions given multiple imputations of missing data.

Installation

You can install the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("hungf8342/MIWilson")

Basic Usage: MIDs Argument

The functions that calculate confidence intervals (mi_wilson and mi_wald) take a mids object (produced by the mice package), the name of the binary response variable (which must be coded 0/1), a confidence level (default 0.95), and an option to print summaries (default TRUE).

As an example, consider the hypertension variable (hyp) in the nhanes toy dataset, which records whether an individual has hypertension. It is originally coded 1 (no) and 2 (yes), so we first recode it to 0/1. Calculating Wilson and Wald intervals for hyp is straightforward afterwards.

library(MIWilson)

## setting up correct response variable values and mice object
nhanes = mice::nhanes %>%
  dplyr::mutate(hyp = hyp-1)
imp = mice::mice(nhanes)
#> 
#>  iter imp variable
#>   1   1  bmi  hyp  chl
#>   1   2  bmi  hyp  chl
#>   1   3  bmi  hyp  chl
#>   1   4  bmi  hyp  chl
#>   1   5  bmi  hyp  chl
#>   2   1  bmi  hyp  chl
#>   2   2  bmi  hyp  chl
#>   2   3  bmi  hyp  chl
#>   2   4  bmi  hyp  chl
#>   2   5  bmi  hyp  chl
#>   3   1  bmi  hyp  chl
#>   3   2  bmi  hyp  chl
#>   3   3  bmi  hyp  chl
#>   3   4  bmi  hyp  chl
#>   3   5  bmi  hyp  chl
#>   4   1  bmi  hyp  chl
#>   4   2  bmi  hyp  chl
#>   4   3  bmi  hyp  chl
#>   4   4  bmi  hyp  chl
#>   4   5  bmi  hyp  chl
#>   5   1  bmi  hyp  chl
#>   5   2  bmi  hyp  chl
#>   5   3  bmi  hyp  chl
#>   5   4  bmi  hyp  chl
#>   5   5  bmi  hyp  chl

## MI-Wilson and MI-Wald 99% CIs of the proportion of patients with hypertension 
mi_wilson(imp,"hyp", 0.99)
#> [1] "Qbar:  0.208"
#> [1] "Rm:  0.205078125"
#> [1] "dof:  138.118458049887"
#> [1] 0.0733112 0.4657681
mi_wald(imp, "hyp", 0.99)
#> [1] "Qbar:  0.208"
#> [1] "Tm:  0.0078976"
#> [1] "dof:  138.118458049887"
#> [1] -0.001165153  0.417165153
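
The summaries printed by mi_wald are enough to rebuild the Wald interval by hand, which helps show what each number is for. The sketch below is an illustration rather than the package's code: wald_from_summaries and its prob argument are names introduced here, and it assumes the interval is Qbar plus or minus a t quantile (on dof degrees of freedom) times the square root of Tm. Passing the confidence level itself (0.99) as the quantile probability reproduces the interval printed above. A two-sided interval would conventionally use 1 - (1 - level)/2 instead, and this README does not say which mapping the package intends.

```r
# Hypothetical helper (not part of MIWilson): rebuild a Wald-type interval
# from the pooled estimate, its total variance, and the degrees of freedom.
# `prob` is the probability passed to the t quantile function (an assumption).
wald_from_summaries <- function(qbar, tm, dof, prob) {
  half_width <- qt(prob, df = dof) * sqrt(tm)
  c(qbar - half_width, qbar + half_width)
}

# Values copied from the mi_wald summaries printed above
wald_from_summaries(qbar = 0.208, tm = 0.0078976,
                    dof = 138.118458049887, prob = 0.99)
```

The lower limit falls below zero here. Wald intervals for proportions are not constrained to lie in [0, 1], whereas the Wilson interval is.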

Helper functions other than Qhats do not take mids objects directly; they work from the output of Qhats (the proportion of the response in each imputed dataset).

## Qhats calculates the mean proportion of hyp for each imputed dataset
qhats = Qhats(imp,"hyp")

## Keeping track of number of imputed datasets (m) and number of observations in a dataset
m = imp$m
nrow = imp$data %>% nrow()

## Qbar (mean of Qhats) and Ubar (average response variance over imputed datasets)
Qbar(qhats)
#> [1] 0.208
Ubar(qhats, m, nrow)
#> [1] 0.0065536
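
The remaining summaries (Bm, Rm, Tm and dof) can all be derived from the per-imputation proportions and the sample size. The base R sketch below shows one way to do that under the usual multiple-imputation combining rules. pool_proportions is a hypothetical helper written for this illustration, not a function from MIWilson, and its formulas are assumptions that have only been checked against the summaries printed in this README.

```r
# Hypothetical helper (not part of MIWilson): pooled quantities computed from
# per-imputation proportions `qhats` and the common sample size `n`.
pool_proportions <- function(qhats, n) {
  m    <- length(qhats)                    # number of imputed datasets
  qbar <- mean(qhats)                      # pooled point estimate
  ubar <- mean(qhats * (1 - qhats) / n)    # average within-imputation variance
  bm   <- var(qhats)                       # sum((qhats - qbar)^2) / (m - 1)
  r_m  <- (1 + 1 / m) * bm / ubar          # relative increase in variance
  tm   <- ubar + (1 + 1 / m) * bm          # total variance of qbar
  dof  <- (m - 1) * (1 + 1 / r_m)^2        # degrees of freedom
  list(qbar = qbar, ubar = ubar, bm = bm, r_m = r_m, tm = tm, dof = dof)
}

# Same inputs as the c(0.2, 0.23, 0.22), n = 10 example in the P-hats section;
# qbar, r_m and dof agree with the "Qbar", "Rm" and "dof" printed there.
pooled <- pool_proportions(c(0.2, 0.23, 0.22), n = 10)
pooled$r_m
pooled$dof
```

The sketch does not handle identical proportions, where bm is zero and the degrees of freedom are infinite.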

Basic Usage: P-hats Argument

If users supply their own imputed datasets (rather than relying on the mice package), MIWilson lets them pass the corresponding vector of observed binomial proportions instead of a mids object. mi_wilson_phat and mi_wald_phat take a vector of observed binomial proportions (one for each of the m imputed datasets), the total number of observations (which must be the same across datasets), a confidence level (default 0.95), and an option to print summaries (default TRUE).

mi_wilson_phat(phats = c(0.2,0.23,0.22), n = 10)
#> [1] "Qbar:  0.216666666666667"
#> [1] "Rm:  0.0183474215320097"
#> [1] "dof:  6161.29288265307"
#> [1] 0.07684304 0.47892199
mi_wilson_phat(c(0,0,0),n=10)
#> Warning in mi_wilson_phat(c(0, 0, 0), n = 10): Imputed binomial proportions are
#> identical; degrees of freedom set to infinity.
#> [1] "Qbar:  0"
#> [1] "Rm:  0"
#> [1] "dof:  Inf"
#> [1] 1.387779e-17 2.129420e-01
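
To make the calculation behind these intervals concrete, here is a minimal base R sketch of a Wilson-type interval in which the usual z^2/n term is inflated by (1 + Rm) and the normal quantile is replaced by a t quantile. mi_wilson_sketch and its prob argument are names introduced here, and the formula is an assumption rather than the package source. With prob = 0.95 it reproduces both intervals printed above; a two-sided 95% interval would conventionally use prob = 0.975, and this README does not say which mapping the package intends.

```r
# Hypothetical sketch (not the MIWilson source): Wilson-type interval from a
# vector of per-imputation proportions `phats` and a common sample size `n`.
# `prob` is the probability passed to the t quantile function (an assumption).
mi_wilson_sketch <- function(phats, n, prob) {
  m    <- length(phats)
  qbar <- mean(phats)
  ubar <- mean(phats * (1 - phats) / n)
  bm   <- var(phats)
  if (bm == 0) {
    # Identical proportions: no between-imputation variance
    r_m <- 0
    dof <- Inf
  } else {
    r_m <- (1 + 1 / m) * bm / ubar
    dof <- (m - 1) * (1 + 1 / r_m)^2
  }
  tq <- qt(prob, df = dof)
  k  <- tq^2 / n * (1 + r_m)               # inflated z^2 / n term
  centre     <- (2 * qbar + k) / (2 * (1 + k))
  half_width <- sqrt(k * (4 * qbar * (1 - qbar) + k)) / (2 * (1 + k))
  c(centre - half_width, centre + half_width)
}

mi_wilson_sketch(c(0.2, 0.23, 0.22), n = 10, prob = 0.95)
mi_wilson_sketch(c(0, 0, 0), n = 10, prob = 0.95)
```

When Rm is zero and the degrees of freedom are infinite, the expression reduces to the ordinary single-dataset Wilson interval.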

Install

To install the released version:

install.packages('MIWilson')

Version: 1.0.0
License: MIT + file LICENSE
Maintainer: Frances Hung
Last Published: August 23rd, 2021
Functions in MIWilson (1.0.0)

Uhats

Calculate Uhats (variance for each imputed dataset)

Tm

Estimate variance of proportion point estimate \(\bar{Q}_m\)

mi_wald

Calculates the specified Wald CI of a binomial proportion variable, given imputed data sets.

Qbar

Calculate Qbar (average response over MICE datasets)

Bm

Calculate between-imputation variance of the response mean $$\frac{\sum_{l=1}^{m} (\hat{Q}_l-\bar{Q}_m)^2}{m-1}$$

Ubar

Calculate Ubar (average response variance over MICE datasets)

dof

Calculate degrees of freedom used in calculating confidence intervals of t-distributed proportion point estimate \(\bar{Q}_m\)

Qhats

Calculate Qhats (means of response for each imputed dataset)

mi_wald_phat

Calculates the MI-Wald interval if given a vector of observed binomial proportions (one for each imputed data frame)

reexports

Objects exported from other packages

Rm

Helper function for getting Rm, a key component for calculating degrees of freedom and the Wilson CI directly

mi_wilson_phat

Calculates the MI-Wilson interval if given a vector of observed binomial proportions (one for each imputed data frame)

mi_wilson

Calculates the specified Wilson CI of a binomial proportion variable, given imputed data sets.