RemixAutoML (version 0.4.2)

ModelDataPrep: Final Data Preparation Function

Description

This function replaces Inf values with NA, converts character columns to factors, and imputes missing values with constants.
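
The steps named in the description can be sketched in base R as follows. This is only an illustration under stated assumptions, not the RemixAutoML implementation: the toy data frame `df` and its column names are hypothetical, and the constants -1 and "0" simply mirror the documented defaults for MissNum and MissFactor.

```r
# Illustrative sketch in base R; NOT the RemixAutoML implementation.
# Hypothetical toy data with infinite values, a missing number, and a missing string.
df <- data.frame(
  x = c(1.5, Inf, NA, -Inf),
  g = c("a", NA, "b", "a"),
  stringsAsFactors = FALSE
)

# Step 1: replace Inf / -Inf with NA in numeric columns
num_cols <- names(df)[vapply(df, is.numeric, logical(1))]
for (col in num_cols) {
  df[[col]][is.infinite(df[[col]])] <- NA
}

# Step 2: impute numeric NAs with a constant (mirrors MissNum = -1)
for (col in num_cols) {
  df[[col]][is.na(df[[col]])] <- -1
}

# Step 3: impute missing strings with a constant (mirrors MissFactor = "0"),
# then convert character columns to factors (mirrors CharToFactor = TRUE)
chr_cols <- names(df)[vapply(df, is.character, logical(1))]
for (col in chr_cols) {
  df[[col]][is.na(df[[col]])] <- "0"
  df[[col]] <- as.factor(df[[col]])
}

# Inspect the resulting column types
str(df)
```

Replacing Inf with NA first means a single constant imputation pass covers both originally missing and originally infinite values.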

Usage

ModelDataPrep(
  data,
  Impute = TRUE,
  CharToFactor = TRUE,
  FactorToChar = FALSE,
  IntToNumeric = TRUE,
  LogicalToBinary = FALSE,
  DateToChar = FALSE,
  RemoveDates = FALSE,
  MissFactor = "0",
  MissNum = -1,
  IgnoreCols = NULL
)

Arguments

data

The source data you'd like to modify

Impute

Defaults to TRUE, which tells the function to impute missing values

CharToFactor

Defaults to TRUE, which tells the function to convert character columns to factors

FactorToChar

Defaults to FALSE. Set to TRUE to convert factor columns to character columns

IntToNumeric

Defaults to TRUE, which tells the function to convert integer columns to numeric

LogicalToBinary

Defaults to FALSE. Set to TRUE to convert logical values to binary numeric values

DateToChar

Defaults to FALSE. Set to TRUE to convert date columns into character columns

RemoveDates

Defaults to FALSE. Set to TRUE to remove date columns from your data.table

MissFactor

Supply the value used to impute missing factor levels. Defaults to "0"

MissNum

Supply the value used to impute missing numeric values. Defaults to -1

IgnoreCols

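Supply column numbers for the columns you want the function to ignore. Defaults to NULL. Note that the example below passes a column name, c("Factor_1"), rather than a number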
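
The type-conversion arguments above can be pictured with a small base R sketch. It is an assumption-laden illustration rather than the package's code: the data frame `df`, the `ignore` vector (standing in for IgnoreCols), and the 1/0 encoding of logical values are all hypothetical choices made for the example.

```r
# Illustrative sketch in base R; NOT the RemixAutoML implementation.
# Hypothetical toy data with integer, logical, and Date columns.
df <- data.frame(
  n    = c(1L, 2L, 3L),
  flag = c(TRUE, FALSE, TRUE),
  d    = as.Date(c("2020-01-01", "2020-01-02", "2020-01-03")),
  keep = c(10L, 20L, 30L)
)

# Hypothetical ignore list, playing the role of IgnoreCols
ignore <- c("keep")
cols <- setdiff(names(df), ignore)

for (col in cols) {
  v <- df[[col]]
  if (is.integer(v)) {
    df[[col]] <- as.numeric(v)      # in the spirit of IntToNumeric = TRUE
  } else if (is.logical(v)) {
    df[[col]] <- as.numeric(v)      # in the spirit of LogicalToBinary = TRUE (assumed TRUE -> 1, FALSE -> 0)
  } else if (inherits(v, "Date")) {
    df[[col]] <- as.character(v)    # in the spirit of DateToChar = TRUE
  }
}

# The ignored column keeps its integer type; the others are converted
str(df)
```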

Value

Returns the original data.table with corrected values

See Also

Other Feature Engineering: AutoDataPartition(), AutoHierarchicalFourier(), AutoInteraction(), AutoLagRollStatsScoring(), AutoLagRollStats(), AutoTransformationCreate(), AutoTransformationScore(), AutoWord2VecModeler(), AutoWord2VecScoring(), ContinuousTimeDataGenerator(), CreateCalendarVariables(), CreateHolidayVariables(), DT_GDL_Feature_Engineering(), DifferenceDataReverse(), DifferenceData(), DummifyDT(), H2oAutoencoder(), Partial_DT_GDL_Feature_Engineering(), TimeSeriesFill()

Examples

# NOT RUN {
# Create fake data
data <- RemixAutoML::FakeDataGenerator(
  Correlation = 0.75,
  N = 250000L,
  ID = 2L,
  ZIP = 0L,
  FactorCount = 6L,
  AddDate = TRUE,
  Classification = FALSE,
  MultiClass = FALSE)

# Check column types
str(data)

# Convert factors to character (ignoring Factor_1), impute, and drop date columns
data <- RemixAutoML::ModelDataPrep(
  data,
  Impute       = TRUE,
  CharToFactor = FALSE,
  FactorToChar = TRUE,
  IntToNumeric = TRUE,
  LogicalToBinary = FALSE,
  DateToChar   = FALSE,
  RemoveDates  = TRUE,
  MissFactor   = "0",
  MissNum      = -1,
  IgnoreCols   = c("Factor_1"))

# Check column types
str(data)
# }