
miceFast (version 0.9.1)

fill_NA_N: function for filling missing values with multiple imputations

Description

Multiple imputations to fill in missing data. Non-missing independent variables are used to approximate the missing observations of a dependent variable. The quantitative models are implemented with Rcpp and the C++ library Armadillo.
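The idea described above can be illustrated with a minimal base-R sketch. It is not the package's Rcpp/Armadillo implementation: the simulated data, the variable names, and the use of `lm()` are assumptions made for illustration only. A linear model is fitted on the rows where the dependent variable is observed, and the missing rows receive the model prediction plus noise on the residual scale.

```r
# Illustrative base-R sketch of regression imputation with noise.
# This is NOT the miceFast implementation; data and names are hypothetical.
set.seed(1)
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 2 + 1.5 * x1 - 0.5 * x2 + rnorm(n, sd = 0.3)
y[sample(n, 20)] <- NA                 # make 20 values of y missing

dat <- data.frame(y, x1, x2)
obs <- !is.na(dat$y)                   # TRUE where y is observed

# Fit the model on rows where the dependent variable is observed
fit <- lm(y ~ x1 + x2, data = dat[obs, ])

# Predict the missing rows and add noise scaled by the residual std. error
pred    <- predict(fit, newdata = dat[!obs, ])
imputed <- pred + rnorm(sum(!obs), sd = sigma(fit))

# Observed values are kept; only the missing ones are filled
y_filled <- dat$y
y_filled[!obs] <- imputed
```

Repeating the noisy draw several times and combining the results is the usual step from single to multiple imputation; how `fill_NA_N` combines its `k` draws is not shown here.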

Usage

fill_NA_N(
  x,
  model,
  posit_y,
  posit_x,
  w = NULL,
  logreg = FALSE,
  k = 10,
  ridge = 1e-06
)

# S3 method for data.frame
fill_NA_N(
  x,
  model,
  posit_y,
  posit_x,
  w = NULL,
  logreg = FALSE,
  k = 10,
  ridge = 1e-06
)

# S3 method for data.table
fill_NA_N(
  x,
  model,
  posit_y,
  posit_x,
  w = NULL,
  logreg = FALSE,
  k = 10,
  ridge = 1e-06
)

# S3 method for matrix
fill_NA_N(
  x,
  model,
  posit_y,
  posit_x,
  w = NULL,
  logreg = FALSE,
  k = 10,
  ridge = 1e-06
)

Value

A vector of imputations in numeric, character, or factor format (similar to the input type).

Arguments

x

a numeric matrix or a data.frame/data.table (with factor/character/numeric/logical columns) - the variables

model

a character - one of "lm_bayes", "lm_noise", "pmm"

posit_y

an integer/character - the position/name of the dependent variable

posit_x

an integer/character vector - the positions/names of the independent variables

w

a numeric vector - a weighting variable; only positive values are allowed. Default: NULL

logreg

a boolean - whether the (numeric) dependent variable has a log-normal distribution. If TRUE, a log-regression is evaluated and the exponential of the results is returned. Default: FALSE

k

an integer - the number of multiple imputations or, for pmm, the number of closest points from which one random value is taken. Default: 10

ridge

a numeric - a value added to the diagonal elements of the x'x matrix. Default: 1e-06
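To make the roles of `ridge` and `k` (for pmm) more concrete, here is a simplified base-R sketch. It rests on stated assumptions: the data are simulated, the helper `donor_draw` is hypothetical, and the package's C++ code may differ in detail (for example in how coefficients are drawn). The sketch adds `ridge` to the diagonal of x'x before solving the normal equations, then, for each missing row, picks one observed value at random among the `k` observed rows whose predictions are closest.

```r
# Simplified base-R sketch of ridge-stabilised least squares followed by
# predictive mean matching (PMM). Illustrative only; NOT the miceFast code.
set.seed(42)
n <- 60
X <- cbind(Intercept = 1, x1 = rnorm(n), x2 = rnorm(n))
y <- as.vector(X %*% c(1, 2, -1) + rnorm(n, sd = 0.5))
y[sample(n, 10)] <- NA                 # make 10 values of y missing
obs <- !is.na(y)

ridge <- 1e-6                          # value added to the diagonal of x'x
k     <- 10                            # number of closest points (donors)

# Normal equations on the observed rows, with ridge on the diagonal
Xo  <- X[obs, , drop = FALSE]
XtX <- crossprod(Xo)                   # x'x
diag(XtX) <- diag(XtX) + ridge
beta <- solve(XtX, crossprod(Xo, y[obs]))

# Predictions for every row
pred     <- as.vector(X %*% beta)
y_obs    <- y[obs]
pred_obs <- pred[obs]

# Hypothetical helper: among the k observed rows with the closest
# predictions, return the observed y of one donor chosen at random
donor_draw <- function(p) {
  nearest <- order(abs(pred_obs - p))[seq_len(k)]
  y_obs[nearest[sample.int(k, 1)]]
}

y_pmm <- y
y_pmm[!obs] <- vapply(pred[!obs], donor_draw, numeric(1))
```

Because every imputed value is copied from an observed donor, PMM output stays within the observed range of the dependent variable. The small `ridge` term keeps x'x invertible when predictors are nearly collinear.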

Methods (by class)

  • fill_NA_N(data.frame): S3 method for data.frame

  • fill_NA_N(data.table): S3 method for data.table

  • fill_NA_N(matrix): S3 method for matrix

See Also

fill_NA, VIF, vignette("miceFast-intro", package = "miceFast")

Examples

library(miceFast)
library(dplyr)
library(data.table)

data(air_miss)

# dplyr: PMM, sampling from the 20 closest points
air_miss %>%
  mutate(Ozone_pmm = fill_NA_N(
    x = ., model = "pmm",
    posit_y = "Ozone", posit_x = c("Solar.R", "Wind", "Temp"),
    k = 20
  ))

# dplyr: lm_noise with weights
air_miss %>%
  mutate(Ozone_imp = fill_NA_N(
    x = ., model = "lm_noise",
    posit_y = "Ozone",
    posit_x = c("Solar.R", "Wind", "Temp"),
    w = .[["weights"]],
    logreg = TRUE, k = 30
  ))

# data.table: PMM grouped
data(air_miss)
setDT(air_miss)
air_miss[, Ozone_pmm := fill_NA_N(
  x = .SD, model = "pmm",
  posit_y = "Ozone",
  posit_x = c("Wind", "Temp", "Intercept"),
  k = 20
), by = .(groups)]

# See the vignette for full examples:
# vignette("miceFast-intro", package = "miceFast")
