AutoKMeans

The source time series data.table

data

set based on the number of threads your machine has available

nthreads

set based on the amount of memory your machine has available

MaxMem

Set to "standard", "mojo", or NULL (default)

SaveModels

Set to folder where you will keep the models

PathFile

If you want to grid tune the glrm model, set to TRUE, FALSE otherwise

GridTuneGLRM

If you want to grid tune the KMeans model, set to TRUE, FALSE otherwise

GridTuneKMeans

glrmCols

tell H2O to ignore any columns that have zero variance

IgnoreConstCols

similar to the number of factors to return from PCA

glrmFactors

set to one of "Quadratic", "Absolute", "Huber", "Poisson", "Hinge", "Logistic", "Periodic"

Loss

glrmMaxIters

choose from "Randomized","GramSVD","Power"

SVDMethod

MaxRunTimeSecs

number of clusters (k) to test in k-means when searching for the optimal number

KMeansK

metric used to identify the top model in the grid tune; choose from "totss", "betweenss", "withinss"

KMeansMetric
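
The three metric names above match quantities that base R's stats::kmeans() also reports, so a small sketch may help show what each one measures. This is an illustration only, not the package's grid-tuning code: the toy data, the candidate_k range, and the metrics table are assumptions made up for the example, and tot.withinss from stats::kmeans() is used here as the "withinss" total.

```r
set.seed(42)

# Toy numeric data (made up for illustration): three well-separated groups
x <- rbind(
  matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(100, mean = 3, sd = 0.3), ncol = 2),
  matrix(rnorm(100, mean = 6, sd = 0.3), ncol = 2)
)

# Hypothetical set of k values to compare
candidate_k <- 2:6

# Fit one k-means model per candidate k
fits <- lapply(candidate_k, function(k) kmeans(x, centers = k, nstart = 10))

# Collect the sums of squares reported by each fit
metrics <- data.frame(
  k         = candidate_k,
  totss     = sapply(fits, function(f) f$totss),        # total sum of squares
  betweenss = sapply(fits, function(f) f$betweenss),    # between-cluster sum of squares
  withinss  = sapply(fits, function(f) f$tot.withinss)  # total within-cluster sum of squares
)

print(metrics)
```

In this sketch totss is the same for every k because it depends only on the data, and it always splits into betweenss plus withinss, so comparing models on betweenss or on withinss ranks them in opposite directions of the same quantity.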

AutoKMeans adds a column with a cluster number identifier to your original data. It first fits a glrm model (which can be grid tuned) and then runs k-means to find the optimal k.
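
The general shape of that workflow (reduce mixed-type columns to a few numeric factors, cluster the factors, attach the cluster label to the original rows) can be sketched in base R. This is a rough stand-in under stated assumptions, not AutoKMeans itself: prcomp() replaces the H2O glrm step, k is fixed at 2 rather than tuned, and the data set and the ClusterID column name are hypothetical.

```r
set.seed(1)

# Made-up mixed-type data: two numeric columns and one factor column
dat <- data.frame(
  a   = c(rnorm(50, mean = 0), rnorm(50, mean = 5)),
  b   = c(rnorm(50, mean = 0), rnorm(50, mean = 5)),
  grp = factor(rep(c("x", "y"), each = 50))
)

# One-hot encode the factor so every column is numeric
num <- model.matrix(~ . - 1, data = dat)

# Stand-in for the glrm step: PCA, keeping the first two components
pca    <- prcomp(num, center = TRUE, scale. = TRUE)
scores <- pca$x[, 1:2]

# Cluster the reduced representation (k fixed here; AutoKMeans searches for it)
km <- kmeans(scores, centers = 2, nstart = 10)

# Attach the cluster identifier to the original data (hypothetical column name)
dat$ClusterID <- km$cluster

head(dat)
```

Clustering on the reduced factors rather than the raw columns keeps categorical and numeric inputs on a comparable scale, which is the motivation for running the factorization step first.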

Automates and ensures high quality output for most
of your machine learning and data science tasks. The package contains
high quality functions that run at efficient speed with minimal memory
constraints for supervised learning, unsupervised learning, feature
engineering, model evaluation and interpretation, along with some
helper functions for graphing. AutoCatBoostClassifier(),
AutoCatBoostRegression(), and AutoCatBoostMultiClass() depend on
the catboost package, which isn't part of the CRAN repository at
the time of this writing. The URL for downloading catboost, along
with its installation instructions, is in the Additional_repositories
field below. You need to install that package to make use of the
AutoCatBoost_ functions.

AutoKMeans: Automated row clustering for mixed column types
