imputation (version 2.0.3)

gbmImpute: GBM Imputation

Description

Imputation using boosted trees. Each column is filled by treating it as a regression problem: for each column i, boosted regression trees predict i from all the other columns. If the predictor variables also contain missing data, the gbm function will itself use surrogate variables as substitutes for the predictors. This imputation function can handle both categorical and numeric data.
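
The column-by-column procedure described above can be sketched roughly as follows. This is an illustration only, not the source of gbmImpute: the helper name impute_columns_gbm is hypothetical, it assumes the gbm package is installed, it handles numeric columns only (a gaussian loss), and it leaves out the internal cross-validation that cv.fold controls.

```r
# Sketch only: column-wise boosted-tree imputation for numeric data.
# Not the gbmImpute implementation; the helper name is hypothetical.
library(gbm)

impute_columns_gbm <- function(x, max.iters = 2, n.trees = 100, ...) {
  x <- as.data.frame(x)
  missing <- is.na(x)                      # cells that were NA in the input
  for (iter in seq_len(max.iters)) {
    for (col in names(x)) {
      holes <- missing[, col]
      if (!any(holes)) next                # nothing to impute in this column
      # Fit on the rows where this column was observed. gbm rejects missing
      # responses but accepts missing values among the predictors.
      fit <- gbm(as.formula(paste(col, "~ .")),
                 data = x[!holes, , drop = FALSE],
                 distribution = "gaussian",
                 n.trees = n.trees, verbose = FALSE, ...)
      # Replace only the originally missing cells with fitted values.
      x[holes, col] <- predict(fit, newdata = x[holes, , drop = FALSE],
                               n.trees = n.trees)
    }
  }
  x
}

# Small usage example on simulated data
set.seed(1)
x <- matrix(rnorm(2000), 200, 10)
x[x > 2] <- NA
filled <- impute_columns_gbm(x, max.iters = 2, n.trees = 50)
```

On later passes the predictors already hold imputed values from earlier passes, which is why iterating more than once (max.iters) can change the result. Observed cells are never overwritten in this sketch.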

Usage

gbmImpute(x, max.iters = 2, cv.fold = 2, n.trees = 100, verbose = T, ...)

Arguments

x
a data frame or matrix where each row is a different record
max.iters
number of times to iterate through the columns and impute each column with fitted values from a regression tree
cv.fold
number of folds that gbm should use internally for cross validation
n.trees
the number of trees used in gradient boosting machines
verbose
if TRUE, print status updates
...
additional parameters passed to gbm

Examples

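library(imputation)
# Simulate a 1000 x 10 matrix of standard normal draws
x <- matrix(rnorm(10000), 1000, 10)
# Mark every value above 2 as missing
x.missing <- x > 2
x[x.missing] <- NA
# Impute the missing entries with boosted trees
gbmImpute(x)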