impute.NN_HD(DATA = NULL, distance = "man", weights = "range", attributes = "sim", comp = "rw_dist", donor_limit = Inf, optimal_donor = "no", list_donors_recipients = NULL, diagnose = NULL)
data.frame, then factors and strings will be recoded using model.matrix or will be coerced by data.matrix.
DATA.
If the diagnose option is used correctly, a list containing the following components will be created in the workspace:
argument: distance can be defined as:
argument: weights can be defined as:
argument: comp can be defined as:
argument: donor_limit is a single number interpreted by its range:
argument: optimal_donor is a single string interpreted by its value:
argument: diagnose should be:
NULL, no diagnostics will be returned.
Should be a character string of the desired variable name which will be created in .GlobalEnv. The following character strings will however default to NULL with a warning:
"if", "else", "repeat", "while", "function", "for", "in", "next", "break", "TRUE", "FALSE", "NULL", "Inf", "NaN", "NA", "NA_integer_", "NA_real_", "NA_complex_", "NA_character_", "c", "q", "s", "t", "C", "D", "F", "I", "T"
NULL with a warning.
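The name check described for diagnose can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper `safe_diagnose_name` is hypothetical, the reserved strings are copied from the list above, and treating every other invalid input as NULL with a warning is an assumption.

```r
# Hypothetical helper, not part of HotDeckImputation: mimics the documented
# behaviour that reserved strings fall back to NULL with a warning.
reserved <- c("if", "else", "repeat", "while", "function", "for", "in",
              "next", "break", "TRUE", "FALSE", "NULL", "Inf", "NaN", "NA",
              "NA_integer_", "NA_real_", "NA_complex_", "NA_character_",
              "c", "q", "s", "t", "C", "D", "F", "I", "T")

safe_diagnose_name <- function(name) {
  # NULL means "no diagnostics requested"
  if (is.null(name)) return(NULL)
  # Accept only a single, non-missing, non-reserved character string
  ok <- is.character(name) && length(name) == 1L && !is.na(name) &&
    !(name %in% reserved)
  if (!ok) {
    warning("invalid diagnose name; defaulting to NULL")
    return(NULL)
  }
  name
}

safe_diagnose_name("diagnostics")             # accepted unchanged
suppressWarnings(safe_diagnose_name("TRUE"))  # reserved string, so NULL
```

A name that passes such a check could then be handed to `assign(name, value, envir = .GlobalEnv)`, which is the base R way to create a variable in the global environment from a character string.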
Bankhofer, U. and Joenssen, D.W. (2014) On Limiting Donor Usage for Imputation of Missing Data via Hot Deck Methods. In: M. Spiliopoulou, L. Schmidt-Thieme, and R. Jannings (Eds.): Data Analysis, Machine Learning and Knowledge Discovery. Studies in Classification, Data Analysis and Knowledge Organization, 3--11. Berlin/Heidelberg: Springer.
Domschke, W. (1995) Logistik: Transport. Munich: Oldenbourg. [in German]
Ford, B. (1983) An Overview of Hot Deck Procedures. In: W. Madow, H. Nisselson and I. Olkin (Eds.): Incomplete Data in Sample Surveys. New York: Academic Press, 185--207.
Joenssen, D.W. (2015) Donor Limited Hot Deck Imputation: A Constrained Optimization Problem. In: B. Lausen, S. Krolak-Schwerdt, and M. Böhmer (Eds.): Data Science, Learning by Latent Structures, and Knowledge Discovery. Studies in Classification, Data Analysis and Knowledge Organization, 319--328. Berlin/Heidelberg: Springer.
Joenssen, D.W. (2015) Hot-Deck-Verfahren zur Imputation fehlender Daten -- Auswirkungen des Donor-Limits. Ilmenau: Ilmedia. [in German, Dissertation]
Joenssen, D.W. and Bankhofer, U. (2012) Donor Limited Hot Deck Imputation: Effects on Parameter Estimation. Journal of Theoretical and Applied Computer Science. 6, 58--70.
Kalton, G. and Kasprzyk, D. (1986) The Treatment of Missing Survey Data. Survey Methodology. 12, 1--16.
Sande, I. (1983) Hot-Deck Imputation Procedures. In: W. Madow, H. Nisselson and I. Olkin (Eds.): Incomplete Data in Sample Surveys. New York: Academic Press, 339--349.
impute.mean, match.d_r_vam, reweight.data
# Load the package that provides impute.NN_HD
library(HotDeckImputation)

# Set the random seed to an arbitrary number
set.seed(421)
# Generate a random 10 x 4 integer matrix
Y <- matrix(sample(x = 1:100, size = 10 * 4), nrow = 10)
# Remove 5 values, ensuring one complete covariate and 5 donors
Y[-c(1:5), -1][sample(1:15, size = 5)] <- NA
# Impute using various (arbitrarily chosen) settings
impute.NN_HD(DATA = Y, distance = "man", weights = "var")
impute.NN_HD(DATA = Y, distance = 2, weights = rep(.5, 4), donor_limit = 2,
             optimal_donor = "mmin")
impute.NN_HD(DATA = Y, distance = "eukl", weights = .25, comp = "mean",
             donor_limit = 1, optimal_donor = "odd")
# Recover some diagnostics
impute.NN_HD(DATA = Y, distance = "eukl", weights = .25, comp = "mean",
             donor_limit = 1, optimal_donor = "odd", diagnose = "diagnostics")
# Look at the diagnostics
diagnostics
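For readers who want to see the basic idea behind a nearest-neighbour hot deck without the package, the following base R sketch may help. It is a deliberately simplified illustration, not the algorithm used by impute.NN_HD: the function `nn_hot_deck` and the matrix `toy` are hypothetical, donors are assumed to be the complete rows, distances are unweighted Manhattan distances over the recipient's observed variables, and no donor limit or optimal donor matching is applied.

```r
# Simplified nearest-neighbour hot deck; NOT the package's algorithm.
# Assumptions: donors are complete rows, unweighted Manhattan distance over
# the recipient's observed columns, unlimited donor reuse.
nn_hot_deck <- function(X) {
  incomplete <- !stats::complete.cases(X)
  donors <- which(!incomplete)
  stopifnot(length(donors) > 0)
  for (i in which(incomplete)) {
    obs <- !is.na(X[i, ])
    # Distance from recipient i to every donor on the observed columns
    d <- rowSums(abs(sweep(X[donors, obs, drop = FALSE], 2, X[i, obs])))
    best <- donors[which.min(d)]
    # Copy the donor's values into the recipient's missing cells
    X[i, !obs] <- X[best, !obs]
  }
  X
}

# Toy data: row 4 is the only recipient; row 2 is its closest donor
toy <- matrix(c(1, 2, 10,
                2, 3, 20,
                9, 9, 90,
                2, 4, NA), ncol = 3, byrow = TRUE)
nn_hot_deck(toy)
```

In this toy case the recipient's distances to the three donors are 3, 1 and 12, so the missing value is filled with the second donor's value. The package's weights, comp, donor_limit and optimal_donor arguments refine exactly the steps that this sketch leaves naive.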