RemixAutoML (version 0.11.0)

AutoTransformationCreate: AutoTransformationCreate is a function for automatically identifying the optimal transformations for numeric features and transforming them once identified.

Description

AutoTransformationCreate is a function for automatically identifying the optimal transformations for numeric features and transforming them once identified. It loops through the transformation methods you select (BoxCox, YeoJohnson, Asinh, Log, LogPlus1, Asin, Logit, and Identity) and picks the one whose output is closest to normally distributed. It then applies that transformation and collects the metadata needed by the AutoTransformationScore() function. The objects are always returned and can optionally be saved to file.
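The selection idea described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper name `pick_transformation`, the reduced candidate set, and the use of the Shapiro-Wilk W statistic as the normality score are all assumptions made for this example.

```r
# Hypothetical helper (not part of RemixAutoML): score a few candidate
# transformations and return the name of the one that looks most normal.
pick_transformation <- function(x) {
  # Candidates return NULL when the input is outside their domain
  candidates <- list(
    Identity = function(v) v,
    Asinh    = function(v) asinh(v),
    Log      = function(v) if (all(v > 0)) log(v) else NULL,
    LogPlus1 = function(v) if (all(v > -1)) log1p(v) else NULL
  )
  # Assumption: Shapiro-Wilk W (closer to 1 means closer to normal)
  # stands in for whatever normality criterion the package uses
  scores <- vapply(candidates, function(f) {
    z <- f(x)
    if (is.null(z)) return(NA_real_)
    unname(shapiro.test(z)$statistic)
  }, numeric(1))
  names(which.max(scores))  # which.max() skips NA scores
}

set.seed(42)
x <- rlnorm(500)                 # log-normal sample, strongly right-skewed
best <- pick_transformation(x)   # name of the best-scoring candidate
print(best)
```

Note that shapiro.test() only accepts between 3 and 5000 observations, so a different score would be needed for larger columns.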

Usage

AutoTransformationCreate(data, ColumnNames = NULL,
  Methods = c("BoxCox", "YeoJohnson", "Asinh", "Log", "LogPlus1", "Asin",
  "Logit", "Identity"), Path = NULL, TransID = "ModelID",
  SaveOutput = FALSE)

Arguments

data

This is your source data

ColumnNames

List your column names in a vector, for example, c("Target", "IV1")

Methods

Choose from "YeoJohnson", "BoxCox", "Asinh", "Log", "LogPlus1", "Asin", "Logit", and "Identity".

Path

Set to the directory where you want to save all of your modeling files

TransID

Set to a character value that corresponds with your modeling project

SaveOutput

Set to TRUE to save the file needed to run AutoTransformationScore()

Value

The data with transformed columns, along with the transformation object for back-transforming later
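Why the transformation object matters can be shown with a small base R round trip. This is a minimal sketch: the `meta` record and the `forward` and `inverse` lookup lists are hypothetical and do not reflect the structure that AutoTransformationCreate actually returns.

```r
# Hypothetical metadata record; NOT the structure RemixAutoML returns.
x <- c(0.5, 2, 10, 150)
meta <- list(ColumnName = "x", MethodName = "Asinh")

# Lookup tables pairing each method name with its forward and inverse function
forward <- list(Asinh = asinh, Log = log)
inverse <- list(Asinh = sinh,  Log = exp)

# Transform, then use the stored method name to undo the transformation
x_trans <- forward[[meta$MethodName]](x)
x_back  <- inverse[[meta$MethodName]](x_trans)

# The round trip recovers the original values up to floating-point error
print(all.equal(x, x_back))
```

Storing the chosen method alongside the column name is what allows predictions made on the transformed scale to be mapped back to the original units.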

See Also

Other Feature Engineering: AutoDataPartition, AutoTransformationScore, AutoWord2VecModeler, CreateCalendarVariables, CreateHolidayVariables, DT_GDL_Feature_Engineering, DummifyDT, GDL_Feature_Engineering, ModelDataPrep, Partial_DT_GDL_Feature_Engineering, Scoring_GDL_Feature_Engineering, TimeSeriesFill

Examples

# NOT RUN {
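# Simulate a skewed numeric column that is correlated with Adrian
Correl <- 0.85
N <- 1000
data <- data.table::data.table(Adrian = runif(N))
data[, x1 := qnorm(Adrian)]   # normal scores of Adrian
data[, x2 := runif(N)]        # independent uniform noise
# Adrian1 is the log of a uniform variable, so it is negative and skewed
data[, Adrian1 := log(pnorm(Correl * x1 +
                            sqrt(1-Correl^2) * qnorm(x2)))]
# Find and apply the best transformation for the skewed column
data <- AutoTransformationCreate(data,
                                 ColumnNames = "Adrian1",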
                                 Methods = c("BoxCox",
                                             "YeoJohnson",
                                             "Asinh",
                                             "Log",
                                             "LogPlus1",
                                             "Asin",
                                             "Logit",
                                             "Identity"),
                                 Path = NULL,
                                 TransID = "Trans",
                                 SaveOutput = FALSE)
# }