random_param_mice_search: Random search for the best imputation method and the best correlation threshold (or fraction of features) used to build the prediction matrix.

Description

This function performs a random search and returns the parameter values with the best (lowest) mean MIF (missing information fraction). It is mainly used inside autotune_mice but can also be used on its own.

Usage

random_param_mice_search(
  low_corr = 0,
  up_corr = 1,
  methods_random = c("pmm"),
  df,
  formula,
  no_numeric,
  iter,
  random.seed = 123,
  correlation = TRUE
)

Value

A list with the best correlation (or fraction) as the first element, the best method as the second, and the results of every iteration as the third.

Arguments

low_corr: double between 0 and 1, default 0. Lower boundary of the correlation range.
up_corr: double between 0 and 1, default 1. Upper boundary of the correlation range. Both boundaries work the same way for a fraction of features.
methods_random: set of methods to choose from. Default 'pmm'.
df: input data frame.
formula: first element returned by the formula_creating() function, for example formula_creating(...)[1].
no_numeric: second element returned by the formula_creating() function.
iter: number of iterations of the random search.
random.seed: random seed.
correlation: if TRUE, a correlation threshold is used; if FALSE, a fraction of features. Default TRUE.
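
The difference between the two modes of the correlation argument can be illustrated with a small base R sketch. It is not the package's code: the helper names select_by_threshold and select_by_fraction, the example correlation values, and the rounding rule for the fraction are all assumptions made for illustration. The only point taken from this page is that features are ranked by correlation in both modes.

```r
# Hypothetical absolute correlations between four predictors and a target.
abs_cor <- c(x1 = 0.9, x2 = 0.1, x3 = 0.5, x4 = 0.3)

# Threshold mode (correlation = TRUE): keep predictors at or above the threshold.
select_by_threshold <- function(abs_cor, threshold) {
  names(abs_cor)[abs_cor >= threshold]
}

# Fraction mode (correlation = FALSE): keep the top share of predictors,
# still ranked by correlation. Rounding up is an assumption of this sketch.
select_by_fraction <- function(abs_cor, fraction) {
  k <- max(1, ceiling(fraction * length(abs_cor)))
  names(sort(abs_cor, decreasing = TRUE))[seq_len(k)]
}

by_threshold <- select_by_threshold(abs_cor, 0.4)
by_fraction <- select_by_fraction(abs_cor, 0.5)
```

In this toy case both calls keep x1 and x3, which shows why low_corr and up_corr can bound either quantity: each is a number between 0 and 1 that controls how many predictors survive.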

Details

The function uses the random search technique to find the best parameters for mice imputation. Each iteration is evaluated with logistic regression or linear regression (depending on the available features). The model is built using the formula from the formula_creating() function. MIF (missing information fraction) is used as the metric, and the parameter combination with the lowest (best) MIF is chosen. Even if correlation is set to FALSE, correlation is still used to select the best features, so the main problem of calculating correlation between categorical columns still applies.
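
The search loop described above can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: random_search_sketch and toy_score are hypothetical names, and toy_score is a made-up stand-in for the real evaluation, which imputes with mice and computes the MIF from a fitted regression model. Only the overall shape (draw candidates, score them, keep the lowest score, return a three-element list) follows this page.

```r
# Hypothetical stand-in for the MIF evaluation. Lower is better.
toy_score <- function(threshold, method) {
  penalty <- if (method == "pmm") 0 else 0.05
  (threshold - 0.4)^2 + penalty
}

random_search_sketch <- function(low = 0, up = 1, methods = c("pmm"),
                                 iter = 10, seed = 123,
                                 score_fun = toy_score) {
  set.seed(seed)
  # Draw one candidate threshold and one method per iteration.
  thresholds <- runif(iter, min = low, max = up)
  chosen <- sample(methods, iter, replace = TRUE)
  # Score every candidate pair.
  scores <- mapply(score_fun, thresholds, chosen)
  results <- data.frame(threshold = thresholds, method = chosen,
                        score = scores, stringsAsFactors = FALSE)
  # Keep the combination with the lowest score.
  best <- which.min(scores)
  list(results$threshold[best], results$method[best], results)
}

out <- random_search_sketch(low = 0.1, up = 0.9,
                            methods = c("pmm", "cart"), iter = 20)
```

Here out[[1]] is the best threshold, out[[2]] the best method, and out[[3]] the per-iteration results, mirroring the layout described in the Value section. Fixing the seed makes the drawn candidates reproducible, which is the role random.seed plays in the real function.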