h2o.svd: Singular Value Decomposition

Description

Computes the singular value decomposition of an H2O data frame. The algorithm (Gram matrix, power iteration, or randomized projection) is selected with svd_method.

Usage

h2o.svd(training_frame, x, nv, destination_key, max_iterations = 1000, transform = "NONE", svd_method = c("GramSVD", "Power", "Randomized"), seed, use_all_factor_levels, max_runtime_secs = 0)

Arguments

training_frame

An H2OFrame object containing the variables in the model.

x

(Optional) A vector containing the data columns on which SVD operates.

nv

The number of right singular vectors to be computed. This must be between 1 and min(ncol(training_frame), nrow(training_frame)) inclusive.

destination_key

(Optional) The unique hex key assigned to the resulting model. Automatically generated if none is provided.

max_iterations

The maximum number of iterations to run each power iteration loop. Must be between 1 and 1e6 inclusive.

transform

A character string that indicates how the training data should be transformed before running SVD. Possible values are: "NONE" for no transformation; "DEMEAN" for subtracting the mean of each column; "DESCALE" for dividing by the standard deviation of each column; "STANDARDIZE" for demeaning and descaling; and "NORMALIZE" for demeaning and dividing each column by its range (max - min).
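
The column-wise arithmetic behind each transform option can be sketched in base R. This is only an analogue on a small made-up matrix, not H2O's own code; in particular, sd() uses the n - 1 denominator, and whether H2O uses the same convention is an assumption here.

```r
# Base-R analogue of the transform options (illustrative only, not H2O code).
# Toy matrix: column 1 is 1..4, column 2 is 10, 20, 30, 50.
A <- matrix(c(1, 2, 3, 4,
              10, 20, 30, 50), ncol = 2)

col_means <- colMeans(A)
col_sds   <- apply(A, 2, sd)                              # sample sd (n - 1)
col_range <- apply(A, 2, function(v) max(v) - min(v))

demean      <- sweep(A, 2, col_means, "-")       # "DEMEAN"
descale     <- sweep(A, 2, col_sds, "/")         # "DESCALE"
standardize <- sweep(demean, 2, col_sds, "/")    # "STANDARDIZE"
normalize   <- sweep(demean, 2, col_range, "/")  # "NORMALIZE"

# After standardizing, each column has mean 0 and standard deviation 1.
round(colMeans(standardize), 10)
apply(standardize, 2, sd)
```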

svd_method

A character string that indicates how SVD should be calculated. Possible values are: "GramSVD" for distributed computation of the Gram matrix followed by a local SVD using the JAMA package; "Power" for computation of the SVD using the power iteration method; and "Randomized" for approximate SVD by projecting onto a random subspace (see References).
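
The ideas behind "GramSVD" and "Power" can be illustrated on a small local matrix in base R. This is a minimal single-machine sketch under assumed settings (random test data, a tolerance of 1e-12, a cap of 1000 iterations); it is not H2O's distributed implementation and does not use JAMA.

```r
# Illustrative only: Gram-matrix and power-iteration routes to an SVD.
set.seed(42)
A <- matrix(rnorm(200), nrow = 50, ncol = 4)

# Gram-matrix route: eigen-decompose G = t(A) %*% A.
G      <- crossprod(A)                  # 4 x 4 Gram matrix
eig    <- eigen(G, symmetric = TRUE)
d_gram <- sqrt(eig$values)              # singular values of A
v_gram <- eig$vectors                   # right singular vectors (up to sign)

# Power iteration on G for the leading right singular vector.
v <- rnorm(ncol(A))
v <- v / sqrt(sum(v^2))
for (i in seq_len(1000)) {              # cap plays the role of max_iterations
  w <- as.vector(G %*% v)
  w <- w / sqrt(sum(w^2))
  converged <- sqrt(sum((w - v)^2)) < 1e-12
  v <- w
  if (converged) break
}
d_power <- sqrt(sum((A %*% v)^2))       # leading singular value

# Compare both routes against base R's svd().
ref <- svd(A)
all.equal(d_gram, ref$d)
all.equal(d_power, ref$d[1])
```

Further singular vectors would be obtained by repeating the power loop after removing the components already found; that step is omitted here for brevity.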

seed

(Optional) Random seed used to initialize the right singular vectors at the beginning of each power method iteration.

use_all_factor_levels

(Optional) A logical value indicating whether all factor levels should be included in each categorical column expansion. If FALSE, the indicator column corresponding to the first factor level of every categorical variable will be dropped. Defaults to TRUE.
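
The effect of keeping or dropping the first factor level can be pictured with base R's model.matrix(). This is an analogy on a made-up factor, assuming the expansion behaves like ordinary indicator coding; H2O performs its own expansion internally.

```r
# Illustrative only: indicator expansion of one categorical column.
f <- factor(c("a", "b", "c", "a"))

# All levels kept (analogous to use_all_factor_levels = TRUE):
# one indicator column per level.
all_levels <- model.matrix(~ f - 1)

# First level dropped (analogous to use_all_factor_levels = FALSE):
# treatment contrasts give an intercept plus indicators for the
# remaining levels, so remove the intercept column.
drop_first <- model.matrix(~ f)[, -1, drop = FALSE]

colnames(all_levels)
colnames(drop_first)
```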

max_runtime_secs

Maximum allowed runtime in seconds for model training. Use 0 to disable.

Value

Returns an object of class H2ODimReductionModel.

References

N. Halko, P.G. Martinsson, J.A. Tropp. Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Review, Survey and Review section, Vol. 53, No. 2, pp. 217-288, June 2011. http://arxiv.org/abs/0909.4061

Examples

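library(h2o)
# Start or connect to a local H2O cluster
h2o.init()
# Locate the example data set shipped with the h2o package and upload it
ausPath <- system.file("extdata", "australia.csv", package = "h2o")
australia.hex <- h2o.uploadFile(path = ausPath)
# Compute 8 right singular vectors
h2o.svd(training_frame = australia.hex, nv = 8)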
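
A variant with non-default options might look like the following. It uses only arguments listed under Usage; the particular values (nv = 4, "STANDARDIZE", "Power", seed 1234) are arbitrary choices for illustration, and it assumes a local H2O cluster can be started.

```r
library(h2o)
h2o.init()

# Same example data set as above, shipped with the h2o package
ausPath <- system.file("extdata", "australia.csv", package = "h2o")
australia.hex <- h2o.uploadFile(path = ausPath)

# Standardize the columns, then compute 4 right singular vectors by
# power iteration; the seed fixes the random initialization.
fit <- h2o.svd(training_frame = australia.hex,
               nv = 4,
               transform = "STANDARDIZE",
               svd_method = "Power",
               seed = 1234)

# Per the Value section, this should be "H2ODimReductionModel"
class(fit)
```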
