fill_NA: `fill_NA` function for imputation purposes.

Description

Regular imputations to fill in missing data. Non-missing independent variables are used to approximate the missing observations of a dependent variable. The quantitative models are implemented with Rcpp and the C++ library Armadillo.
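
The idea in the description can be sketched in base R, without miceFast. This is only an illustration of regression-based imputation in general, assuming a toy data set (`df`, `y`, `x1` are made-up names); it is not the package's internal code, which runs in C++.

```r
# Toy data (hypothetical): y has two missing values, the predictor is complete.
df <- data.frame(
  y  = c(2.1, 3.9, NA, 8.2, NA, 12.1),
  x1 = c(1, 2, 3, 4, 5, 6)
)

# Rows where the dependent variable is observed.
obs <- !is.na(df$y)

# Fit the model only on the observed rows.
fit <- lm(y ~ x1, data = df[obs, ])

# Predict only where y is missing; observed values are kept unchanged.
y_imp <- df$y
y_imp[!obs] <- predict(fit, newdata = df[!obs, , drop = FALSE])

y_imp
```

The result has the same length as the input column, which is what makes this pattern convenient inside `mutate()` or `:=`.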

Usage

fill_NA(x, model, posit_y, posit_x, w = NULL, logreg = FALSE, ridge = 1e-06)
# S3 method for data.frame
fill_NA(x, model, posit_y, posit_x, w = NULL, logreg = FALSE, ridge = 1e-06)
# S3 method for data.table
fill_NA(x, model, posit_y, posit_x, w = NULL, logreg = FALSE, ridge = 1e-06)
# S3 method for matrix
fill_NA(x, model, posit_y, posit_x, w = NULL, logreg = FALSE, ridge = 1e-06)

Value

Imputations returned as a numeric/logical/character/factor vector, matching the type of the input variable.

Arguments

x: a numeric matrix or data.frame/data.table (factor/character/numeric/logical) - variables
model: a character - possible options ("lda","lm_pred","lm_bayes","lm_noise")
posit_y: an integer/character - a position/name of dependent variable
posit_x: an integer/character vector - positions/names of independent variables
w: a numeric vector - a weighting variable - only positive values, Default: NULL
logreg: a boolean - set to TRUE if the dependent variable is numeric and log-normally distributed. The regression is then fitted on the log scale and the exponential of the results is returned, Default: FALSE
ridge: a numeric - a value added to the diagonal elements of the x'x matrix, Default: 1e-6
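
To make the `w`, `logreg` and `ridge` arguments more concrete, here is a minimal base R sketch of what they describe. It assumes a plain weighted least-squares fit; `ridge_wls` is a hypothetical helper written for this illustration and is not part of miceFast, whose actual C++ implementation may differ in detail.

```r
set.seed(1)
n <- 50
# Design matrix with an explicit intercept column (simulated, hypothetical data).
X <- cbind(Intercept = 1, x1 = rnorm(n), x2 = rnorm(n))
# A positive, roughly log-normal dependent variable.
y <- exp(0.5 + 0.3 * X[, "x1"] - 0.2 * X[, "x2"] + rnorm(n, sd = 0.1))
# Positive weights.
w <- runif(n, 0.5, 2)

# Hypothetical helper: weighted least squares with `ridge` added to the
# diagonal of X'WX, which keeps the system solvable when X'WX is near-singular.
ridge_wls <- function(X, y, w = rep(1, nrow(X)), ridge = 1e-6) {
  XtWX <- crossprod(X * w, X)  # X' W X (row i of X scaled by w[i])
  XtWy <- crossprod(X * w, y)  # X' W y
  solve(XtWX + diag(ridge, ncol(X)), XtWy)
}

# logreg = TRUE analogue: fit on log(y), then back-transform with exp().
beta_w <- ridge_wls(X, log(y), w = w, ridge = 1e-6)
pred   <- exp(drop(X %*% beta_w))

# With unit weights and a tiny ridge, the result is close to ordinary least squares.
beta_u   <- ridge_wls(X, log(y))
beta_ols <- lm.fit(X, log(y))$coefficients
```

Because the predictions are exponentiated, they are always positive, which is the reason to use the log scale for a strictly positive dependent variable.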

Methods (by class)

fill_NA(data.frame): S3 method for data.frame
fill_NA(data.table): S3 method for data.table
fill_NA(matrix): S3 method for matrix

Examples

library(miceFast)
library(dplyr)
library(data.table)

data(air_miss)

# dplyr: continuous variable with Bayesian linear model
air_miss %>%
  mutate(Ozone_imp = fill_NA(
    x = ., model = "lm_bayes",
    posit_y = "Ozone", posit_x = c("Solar.R", "Wind", "Temp")
  ))

# dplyr: categorical variable with LDA
air_miss %>%
  mutate(x_char_imp = fill_NA(
    x = ., model = "lda",
    posit_y = "x_character", posit_x = c("Wind", "Temp")
  ))

# dplyr: grouped imputation with weights
air_miss %>%
  group_by(groups) %>%
  do(mutate(., Solar_R_imp = fill_NA(
    x = ., model = "lm_pred",
    posit_y = "Solar.R",
    posit_x = c("Wind", "Temp", "Intercept"),
    w = .[["weights"]]
  ))) %>%
  ungroup()

# data.table
data(air_miss)
setDT(air_miss)
air_miss[, Ozone_imp := fill_NA(
  x = .SD, model = "lm_bayes",
  posit_y = "Ozone", posit_x = c("Solar.R", "Wind", "Temp")
)]

# data.table: grouped
air_miss[, Solar_R_imp := fill_NA(
  x = .SD, model = "lm_pred",
  posit_y = "Solar.R",
  posit_x = c("Wind", "Temp", "Intercept"),
  w = .SD[["weights"]]
), by = .(groups)]

# See the vignette for full examples:
# vignette("miceFast-intro", package = "miceFast")