Imputes (fills gaps in) missing standard deviations (SD) using simple imputation methods following Bracken (1992) and Rubin and Schenker's (1991) "hot deck" approach.
impute_SD(
aDataFrame,
columnSDnames,
columnXnames,
method = "Bracken1992",
range = 3,
M = 1
)
A data frame containing columns with missing SDs (coded as NA) and their complete means (used only for the nearest-neighbor method).
Label of the column(s) with missing SD. Can be a string or list of strings.
Label of the column(s) with means (X) for each SD. Can be a string or list of strings. Must be complete with no missing data.
The method used to impute the missing SDs. The default is "Bracken1992", which applies Bracken's (1992) approach of imputing SD using the coefficient of variation from all complete cases. Other options are "HotDeck", which applies Rubin and Schenker's (1991) resampling approach to fill gaps of missing SD by drawing from the SDs with complete information, and "HotDeck_NN", which resamples only from complete cases whose means are similar to the means of the cases with missing SDs.
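The first two methods can be illustrated with a minimal base-R sketch. This is not the package's implementation: the helpers `impute_cv` and `impute_hotdeck` and the `toy` data are hypothetical, and the pooled ratio sum(SD) / sum(mean) over complete cases is only one plausible reading of "the coefficient of variation from all complete cases".

```r
# Hypothetical toy data: means (X) are complete, two SDs are missing.
toy <- data.frame(
  X  = c(10, 20, 30, 40, 50),
  SD = c(1, 2, NA, 4, NA)
)

# Hypothetical helper (an assumption, not the package's code): impute each
# missing SD as its mean times a pooled SD-to-mean ratio from complete cases.
impute_cv <- function(sd, x) {
  complete <- !is.na(sd)
  cv <- sum(sd[complete]) / sum(x[complete])
  sd[!complete] <- x[!complete] * cv
  sd
}

# Hypothetical helper: simple hot deck. Each missing SD is replaced by an
# observed SD drawn at random, with replacement, from the complete cases.
impute_hotdeck <- function(sd) {
  complete <- !is.na(sd)
  donors <- sd[complete]
  picks <- sample.int(length(donors), sum(!complete), replace = TRUE)
  sd[!complete] <- donors[picks]
  sd
}

cv_filled <- impute_cv(toy$SD, toy$X)

set.seed(1)  # hot deck imputation is random; fix the seed for repeatability
hd_filled <- impute_hotdeck(toy$SD)
```

In this sketch the ratio-based method is deterministic, whereas the hot deck result changes from run to run unless a seed is set.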
A positive number giving the range of neighbours to sample from when imputing SDs. Used in combination with "HotDeck_NN". The default is 3, which indicates that the 3 means most similar in rank order to the mean with the missing SD will be resampled.
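The rank-based neighbour selection described above might look like the following base-R sketch. The helper `impute_hotdeck_nn` is hypothetical, and its details (ties broken by order of appearance, donors restricted to complete cases) are assumptions rather than documented behaviour of the package.

```r
# Hypothetical helper: for each missing SD, resample from the SDs of the
# `range` complete cases whose means are closest in rank order.
impute_hotdeck_nn <- function(sd, x, range = 3) {
  complete <- !is.na(sd)
  rank_x <- rank(x, ties.method = "first")
  donor_idx <- which(complete)
  k <- min(range, length(donor_idx))
  for (i in which(!complete)) {
    # Rank distance between this incomplete case and every complete case.
    rank_dist <- abs(rank_x[donor_idx] - rank_x[i])
    nearest <- donor_idx[order(rank_dist)[seq_len(k)]]
    # Draw one donor at random from the nearest neighbours.
    sd[i] <- sd[nearest[sample.int(length(nearest), 1)]]
  }
  sd
}

# Hypothetical toy data with one missing SD.
toy <- data.frame(
  X  = c(10, 20, 30, 40, 50, 60),
  SD = c(1, 2, NA, 4, 5, 6)
)

set.seed(42)
nn_filled <- impute_hotdeck_nn(toy$SD, toy$X, range = 2)
```

With `range = 2`, the missing SD for the mean of 30 can only be drawn from the cases with means 20 and 40, its two closest neighbours in rank.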
The number of imputed datasets to return. Currently only works for the "HotDeck" method.
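Producing several imputed datasets amounts to repeating the random hot deck draw M times. The sketch below is an illustration under stated assumptions: the helper `impute_hotdeck_m` is hypothetical, and returning the datasets as a list of data frames is a guess, since the form in which multiple datasets are returned is not described here.

```r
# Hypothetical helper: repeat a simple hot deck draw M times and collect
# the completed data frames in a list (the list return type is an assumption).
impute_hotdeck_m <- function(df, sd_col, M = 1) {
  lapply(seq_len(M), function(m) {
    sd <- df[[sd_col]]
    missing <- is.na(sd)
    donors <- sd[!missing]
    picks <- sample.int(length(donors), sum(missing), replace = TRUE)
    sd[missing] <- donors[picks]
    df[[sd_col]] <- sd
    df
  })
}

# Hypothetical toy data with two missing SDs.
toy <- data.frame(X = c(10, 20, 30, 40), SD = c(1, NA, 3, NA))

set.seed(7)
imputed_sets <- impute_hotdeck_m(toy, "SD", M = 5)
```

Each element of `imputed_sets` is a complete copy of `toy`; differences between the copies reflect the uncertainty introduced by resampling.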
An imputed (complete) dataset.
Bracken, M.B. 1992. Statistical methods for analysis of effects of treatment in overviews of randomized trials. Effective care of the newborn infant (eds J.C. Sinclair and M.B. Bracken), pp. 13-20. Oxford University Press, Oxford.
Rubin, D.B. and Schenker, N. 1991. Multiple imputation in health-care databases: an overview and some applications. Statistics in Medicine 10: 585-598.