
This function replaces Inf values with NA, converts character columns to factors, and imputes missing values with constants
ModelDataPrep(
  data,
  Impute = TRUE,
  CharToFactor = TRUE,
  FactorToChar = FALSE,
  IntToNumeric = TRUE,
  LogicalToBinary = FALSE,
  DateToChar = FALSE,
  IDateConversion = FALSE,
  RemoveDates = FALSE,
  MissFactor = "0",
  MissNum = -1,
  IgnoreCols = NULL
)
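data: The source data you would like to modify
Impute: Defaults to TRUE, which tells the function to impute the data
CharToFactor: Defaults to TRUE, which tells the function to convert character columns to factors
FactorToChar: Defaults to FALSE. Set to TRUE to convert factor columns to character
IntToNumeric: Defaults to TRUE, which tells the function to convert integers to numeric
LogicalToBinary: Defaults to FALSE. Set to TRUE to convert logical values to binary numeric values
DateToChar: Defaults to FALSE. Set to TRUE to convert date columns into character columns
IDateConversion: Defaults to FALSE. Set to TRUE to convert IDateTime to POSIXct and IDate to Date types
RemoveDates: Defaults to FALSE. Set to TRUE to remove date columns from your data.table
MissFactor: The value used to impute missing factor levels. Defaults to "0"
MissNum: The value used to impute missing numeric values. Defaults to -1
IgnoreCols: Supply column numbers for columns you want the function to ignore. Defaults to NULL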
Returns the original data table with corrected values
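The three core steps described above (Inf to NA, character to factor, constant imputation) can be sketched in base R. This is only an illustration and not the RemixAutoML implementation: `prep_sketch`, `miss_factor`, `miss_num`, and the `toy` data are hypothetical names, and a plain data.frame stands in for a data.table so the snippet has no package dependencies.

```r
# Hypothetical helper for illustration only; not the RemixAutoML source code.
prep_sketch <- function(df, miss_factor = "0", miss_num = -1) {
  for (col in names(df)) {
    x <- df[[col]]
    if (is.character(x) || is.factor(x)) {
      # Work on character values, fill missing entries, then store as factor
      x <- as.character(x)
      x[is.na(x)] <- miss_factor
      x <- factor(x)
    } else if (is.numeric(x)) {
      # Integer -> numeric; Inf, -Inf, NaN and NA all become miss_num
      x <- as.numeric(x)
      x[!is.finite(x)] <- miss_num
    }
    # Other column types (logical, dates) are left untouched in this sketch
    df[[col]] <- x
  }
  df
}

# Small made-up data set containing Inf, NA, and a character column
toy <- data.frame(
  num = c(1.5, Inf, NA, -Inf),
  int = c(1L, NA, 3L, 4L),
  chr = c("a", NA, "b", "a"),
  stringsAsFactors = FALSE
)

out <- prep_sketch(toy)
str(out)
```

Treating Inf and NA in a single `is.finite()` check collapses the "replace Inf with NA, then impute" sequence into one step; the end result for numeric columns is the same constant fill.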
Other Feature Engineering: AutoDataPartition(), AutoDiffLagN(), AutoHierarchicalFourier(), AutoInteraction(), AutoLagRollStatsScoring(), AutoLagRollStats(), AutoTransformationCreate(), AutoTransformationScore(), AutoWord2VecModeler(), AutoWord2VecScoring(), ContinuousTimeDataGenerator(), CreateCalendarVariables(), CreateHolidayVariables(), DT_GDL_Feature_Engineering(), DifferenceDataReverse(), DifferenceData(), DummifyDT(), H2OAutoencoderScoring(), H2OAutoencoder(), Partial_DT_GDL_Feature_Engineering(), TimeSeriesFill()
# NOT RUN {
# Create fake data
data <- RemixAutoML::FakeDataGenerator(
  Correlation = 0.75,
  N = 250000L,
  ID = 2L,
  ZIP = 0L,
  FactorCount = 6L,
  AddDate = TRUE,
  Classification = FALSE,
  MultiClass = FALSE)
# Check column types
str(data)
# Convert some factors to character
data <- RemixAutoML::ModelDataPrep(
  data,
  Impute = TRUE,
  CharToFactor = FALSE,
  FactorToChar = TRUE,
  IntToNumeric = TRUE,
  LogicalToBinary = FALSE,
  DateToChar = FALSE,
  IDateConversion = FALSE,
  RemoveDates = TRUE,
  MissFactor = "0",
  MissNum = -1,
  IgnoreCols = c("Factor_1"))
# Check column types
str(data)
# }