impute.NN_HD(DATA = NULL, distance = "man", weights = "range", attributes = "sim",
             comp = "rw_dist", donor_limit = Inf, optimal_donor = "no",
             list_donors_recipients = NULL, diagnose = NULL)
data.frame, then factors and strings will be recoded using model.matrix or will be coerced by data.matrix.
DATA.
If the diagnose option is used correctly, a list containing the following components will be created in the workspace:
argument: distance can be defined as:
argument: weights can be defined as:
argument: comp can be defined as:
argument: donor_limit is a single number interpreted by its range:
argument: optimal_donor is a single string interpreted by its value:
argument: diagnose should be:
NULL, no diagnostics will be returned.
A character string giving the name of the variable that will be created in .GlobalEnv. The following character strings will, however, default to NULL with a warning:
"if", "else", "repeat", "while", "function", "for", "in", "next", "break", "TRUE", "FALSE", "NULL", "Inf", "NaN", "NA", "NA_integer_", "NA_real_", "NA_complex_", "NA_character_", "c", "q", "s", "t", "C", "D", "F", "I", "T"
NULL with a warning.
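The naming rule for diagnose can be illustrated with a small base R sketch. The helper check_diagnose_name below is hypothetical and is not part of the package; it only mirrors the documented behaviour that reserved strings fall back to NULL with a warning. Treating non-string input the same way is an assumption made for this sketch.

```r
# Hypothetical helper (not part of HotDeckImputation): a sketch of the
# documented rule for 'diagnose' values.
reserved_names <- c("if", "else", "repeat", "while", "function", "for", "in",
                    "next", "break", "TRUE", "FALSE", "NULL", "Inf", "NaN",
                    "NA", "NA_integer_", "NA_real_", "NA_complex_",
                    "NA_character_", "c", "q", "s", "t", "C", "D", "F", "I", "T")

check_diagnose_name <- function(name) {
  # NULL means "no diagnostics"; pass it through silently
  if (is.null(name)) return(NULL)
  # Assumption: anything that is not a single, non-reserved string is rejected
  ok <- is.character(name) && length(name) == 1L && !(name %in% reserved_names)
  if (!ok) {
    warning("unusable 'diagnose' value; falling back to NULL")
    return(NULL)
  }
  name
}

check_diagnose_name("diagnostics")             # accepted as a variable name
suppressWarnings(check_diagnose_name("TRUE"))  # reserved, so NULL is returned
```

Checking a name this way before calling impute.NN_HD avoids silently losing the diagnostics when a reserved string is chosen.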
Bankhofer, U. and Joenssen, D.W. (2014) On Limiting Donor Usage for Imputation of Missing Data via Hot Deck Methods. In: M. Spiliopoulou, L. Schmidt-Thieme, and R. Janning (Eds.): Data Analysis, Machine Learning and Knowledge Discovery. Studies in Classification, Data Analysis and Knowledge Organization, 3--11. Berlin/Heidelberg: Springer.
Domschke, W. (1995) Logistik: Transport. Munich: Oldenbourg. [in German]
Ford, B. (1983) An Overview of Hot Deck Procedures. In: W. Madow, H. Nisselson and I. Olkin (Eds.): Incomplete Data in Sample Surveys. New York: Academic Press, 185--207.
Joenssen, D.W. (2015) Donor Limited Hot Deck Imputation: A Constrained Optimization Problem. In: B. Lausen, S. Krolak-Schwerdt, and M. Böhmer (Eds.): Data Science, Learning by Latent Structures, and Knowledge Discovery. Studies in Classification, Data Analysis and Knowledge Organization, 319--328. Berlin/Heidelberg: Springer.
Joenssen, D.W. (2015) Hot-Deck-Verfahren zur Imputation fehlender Daten -- Auswirkungen des Donor-Limits. Ilmenau: Ilmedia. [in German, Dissertation]
Joenssen, D.W. and Bankhofer, U. (2012) Donor Limited Hot Deck Imputation: Effects on Parameter Estimation. Journal of Theoretical and Applied Computer Science. 6, 58--70.
Kalton, G. and Kasprzyk, D. (1986) The Treatment of Missing Survey Data. Survey Methodology. 12, 1--16.
Sande, I. (1983) Hot-Deck Imputation Procedures. In: W. Madow, H. Nisselson and I. Olkin (Eds.): Incomplete Data in Sample Surveys. New York: Academic Press, 339--349.
impute.mean, match.d_r_vam, reweight.data
#Load the package that provides impute.NN_HD
library(HotDeckImputation)
#Set the random seed to an arbitrary number
set.seed(421)
#Generate random integer matrix size 10x4
Y<-matrix(sample(x=1:100,size=10*4),nrow=10)
#remove 5 values, ensuring one complete covariate and 5 donors
Y[-c(1:5),-1][sample(1:15,size=5)]<-NA
#Impute using various different (arbitrarily chosen) settings
impute.NN_HD(DATA=Y,distance="man",weights="var")
impute.NN_HD(DATA=Y,distance=2,weights=rep(.5,4),donor_limit=2,optimal_donor="mmin")
impute.NN_HD(DATA=Y,distance="eukl",weights=.25,comp="mean",donor_limit=1,
             optimal_donor="odd")
#Recover some diagnostics
impute.NN_HD(DATA=Y,distance="eukl",weights=.25,comp="mean",donor_limit=1,
             optimal_donor="odd",diagnose = "diagnostics")
#Look at the diagnostics
diagnostics
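The missingness pattern created in the examples above can be verified with base R alone. The following sketch regenerates the same matrix and checks that exactly five values are missing, that the first column is complete, and that at least five complete rows remain available as donors. It does not call the package and makes no claim about how impute.NN_HD selects donors.

```r
# Recreate the example data (same seed and construction as in the examples)
set.seed(421)
Y <- matrix(sample(x = 1:100, size = 10 * 4), nrow = 10)
Y[-c(1:5), -1][sample(1:15, size = 5)] <- NA

# Count missing values overall and per column
n_missing      <- sum(is.na(Y))
missing_by_col <- colSums(is.na(Y))

# Rows without any NA are potential donors; rows with NA are recipients
is_donor     <- complete.cases(Y)
n_donors     <- sum(is_donor)
n_recipients <- sum(!is_donor)

n_missing
missing_by_col
n_donors
n_recipients
```

Because the five NA values are drawn only from rows 6 to 10 and columns 2 to 4, rows 1 to 5 always remain complete, while the number of recipients depends on how the NA values fall across rows.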