
NADIA (version 0.4.2)

autotune_VIM_Irmi: Perform imputation using the VIM package and its irmi function

Description

This function uses IRMI (iterative robust model-based imputation) to impute missing data.

Usage

autotune_VIM_Irmi(
  df,
  col_type = NULL,
  percent_of_missing = NULL,
  eps = 5,
  maxit = 100,
  step = FALSE,
  robust = FALSE,
  init.method = "kNN",
  force = FALSE,
  col_0_1 = FALSE,
  out_file = NULL
)

Value

Returns a single data.frame with imputed values.

Arguments

df

data.frame. Data frame to impute, with column names and without the target column.

col_type

character vector. Vector containing column type names.

percent_of_missing

numeric vector. Vector containing the percentage of missing data in each column, for example c(0,1,0,0,11.3,..)

eps

threshold for convergence

maxit

maximum number of iterations

step

stepwise model selection is applied when the parameter is set to TRUE

robust

if TRUE, robust regression methods will be applied (it's impossible to set step=TRUE and robust=TRUE at the same time)

init.method

Method for initialization of missing values (kNN or median)

force

if TRUE, the algorithm tries to find a solution in any case, possibly by using different robust methods automatically (should be set to FALSE for simulation)

col_0_1

Decides whether to add an extra column indicating where imputation was done. 0 - value was in the dataset, 1 - value was imputed. Default FALSE. (Works only when returning one dataset.)

out_file

Output log file location. If the file already exists, log messages will be appended to it. If NULL, no log will be produced.
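
The col_type and percent_of_missing arguments can be derived from the data frame itself. The following is a minimal base-R sketch, assuming that col_type expects the class name of each column (as in the example on this page) and that percent_of_missing is expressed on a 0-100 scale. The helper name describe_columns is hypothetical and is not part of NADIA.

```r
# Hypothetical helper (not part of NADIA): derives the two descriptive
# arguments from a data frame using base R only.
describe_columns <- function(df) {
  list(
    # First class of each column, e.g. "factor", "integer", "numeric"
    col_type = unname(vapply(df, function(x) class(x)[1], character(1))),
    # Share of NA values per column, expressed in percent (0-100)
    percent_of_missing = unname(100 * colMeans(is.na(df)))
  )
}

toy <- data.frame(
  a = factor(c("red", NA, "blue", "red")),
  b = 1:4,
  d = c(1.5, NA, NA, 4.0)
)
info <- describe_columns(toy)
info$col_type            # "factor" "integer" "numeric"
info$percent_of_missing  # 25 0 50
```

Note that an ordered factor would be reported as "ordered" by this sketch; whether autotune_VIM_Irmi accepts that label is not stated here, so such columns may need to be relabelled by hand.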

Author

Alexander Kowarik, Matthias Templ (2016) doi:10.18637/jss.v074.i07

Details

Run time varies with the size and structure of the data. In some cases, when the selected parameters do not work, the function tries to run with the defaults. The most important parameter for both quality and reliability is eps.
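
The fallback to defaults described above can be pictured with a small control-flow sketch. This is not NADIA's internal code: with_default_fallback and toy_impute are hypothetical names, and the stand-in imputer only mimics the rule that step and robust cannot both be TRUE.

```r
# Hypothetical wrapper (not NADIA's internal code): try the requested
# settings first and fall back to the function's defaults on error.
with_default_fallback <- function(impute_fun, df, ...) {
  tryCatch(
    impute_fun(df, ...),
    error = function(e) {
      message("Selected parameters failed (", conditionMessage(e),
              "); retrying with defaults.")
      impute_fun(df)
    }
  )
}

# Stand-in imputer used only to demonstrate the control flow:
# it rejects step = TRUE together with robust = TRUE.
toy_impute <- function(df, step = FALSE, robust = FALSE) {
  if (step && robust) stop("step and robust cannot both be TRUE")
  df[is.na(df)] <- 0  # placeholder fill, not a real imputation method
  df
}

toy <- data.frame(x = c(1, NA, 3))
res <- with_default_fallback(toy_impute, toy, step = TRUE, robust = TRUE)
sum(is.na(res)) == 0
```

The first call fails because of the conflicting settings, so the wrapper reruns the stand-in with its defaults and returns a data frame without missing values.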

References

Alexander Kowarik, Matthias Templ (2016). Imputation with the R Package VIM. Journal of Statistical Software, 74(7), 1-16. doi:10.18637/jss.v074.i07

Examples

{
  library(NADIA)

  raw_data <- data.frame(
    a = as.factor(sample(c("red", "yellow", "blue", NA), 1000, replace = TRUE)),
    b = as.integer(1:1000),
    c = as.factor(sample(c("YES", "NO", NA), 1000, replace = TRUE)),
    d = runif(1000, 1, 10),
    e = as.factor(sample(c("YES", "NO"), 1000, replace = TRUE)),
    f = as.factor(sample(c("male", "female", "trans", "other", NA), 1000, replace = TRUE)))

  # Preparing col_type
  col_type <- c("factor", "integer", "factor", "numeric", "factor", "factor")

  # Percentage of missing values in each column
  percent_of_missing <- unname(100 * colMeans(is.na(raw_data)))


  imp_data <- autotune_VIM_Irmi(raw_data, col_type, percent_of_missing)

  # Check that all missing values were imputed
  sum(is.na(imp_data)) == 0
  # TRUE
}
