h2o4gpu (version 0.2.0)

h2o4gpu.pca: Principal Component Analysis (PCA)

Description

Principal Component Analysis (PCA)

Usage

h2o4gpu.pca(n_components = 2L, copy = TRUE, whiten = FALSE,
  svd_solver = "arpack", tol = 0, iterated_power = "auto",
  random_state = NULL, verbose = FALSE, backend = "h2o4gpu",
  gpu_id = 0L)

Arguments

n_components

Desired dimensionality of output data

copy

If FALSE, data passed to fit are overwritten, and running fit(X).transform(X) will not yield the expected results; use fit_transform(X) instead.

whiten

When TRUE (FALSE by default), the components_ vectors are multiplied by the square root of n_samples and then divided by the singular values to ensure uncorrelated outputs with unit component-wise variances.
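The whitening step described above can be sketched in base R, independent of h2o4gpu. This is only an illustration under stated assumptions: the data are simulated, the variable names are hypothetical, and the scaling follows the wording above literally (square root of n_samples); an actual backend may normalise by n_samples - 1 instead.

```r
# Illustrative whitening with base R only (not the h2o4gpu implementation).
set.seed(42)
X <- matrix(rnorm(200 * 5), nrow = 200, ncol = 5)
X[, 2] <- X[, 1] + 0.5 * X[, 2]        # introduce correlation between columns

n <- nrow(X)
k <- 2L                                 # plays the role of n_components

Xc <- scale(X, center = TRUE, scale = FALSE)   # centre each column
s  <- svd(Xc)

V <- s$v[, seq_len(k), drop = FALSE]    # principal axes (components)
d <- s$d[seq_len(k)]                    # matching singular values

# Whitened components: multiply by sqrt(n_samples), divide by singular values.
W <- sweep(V, 2, sqrt(n) / d, "*")
Z <- Xc %*% W                           # whitened scores

# Second-moment matrix of the scores (divisor n); close to the identity.
cov_z <- crossprod(Z) / n
print(round(cov_z, 6))
```

With this scaling the projected columns are uncorrelated and have unit variance when the variance is computed with divisor n_samples.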

svd_solver

'auto' selects the solver by a default policy based on X.shape and n_components: if the input data is larger than 500x500 and the number of components to extract is lower than 80 percent of the smallest dimension of the data, the more efficient 'randomized' method is used. Otherwise the exact full SVD is computed and optionally truncated afterwards.
'full' runs an exact full SVD by calling the standard LAPACK solver via scipy.linalg.svd and selects the components by postprocessing.
'arpack' runs an SVD truncated to n_components by calling the ARPACK solver via scipy.sparse.linalg.svds. It requires strictly 0 < n_components < columns.
'randomized' runs a randomized SVD by the method of Halko et al.
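The 'auto' policy and the 'arpack' constraint can be written out as a small sketch. The helpers choose_svd_solver and arpack_ok are hypothetical names used only for illustration and are not part of h2o4gpu; "larger than 500x500" is read here as the largest dimension exceeding 500, which is an assumption about the intended rule.

```r
# Hypothetical helper mirroring the documented 'auto' policy (an assumption,
# not the package's own code).
choose_svd_solver <- function(n_rows, n_cols, n_components) {
  smallest <- min(n_rows, n_cols)
  largest  <- max(n_rows, n_cols)
  if (largest > 500 && n_components >= 1 && n_components < 0.8 * smallest) {
    "randomized"   # large input, comparatively few components
  } else {
    "full"         # exact SVD, truncated afterwards if needed
  }
}

# Hypothetical check for the 'arpack' requirement 0 < n_components < columns.
arpack_ok <- function(n_cols, n_components) {
  n_components > 0 && n_components < n_cols
}

choose_svd_solver(1000, 800, 10)    # large input, few components
choose_svd_solver(200, 50, 10)      # small input
choose_svd_solver(1000, 800, 700)   # components at or above 80 percent of 800
arpack_ok(4, 2)
arpack_ok(4, 4)
```

Under this reading the three calls select 'randomized', 'full' and 'full' respectively, and only the first arpack_ok call satisfies the constraint.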

tol

Tolerance for singular values computed by svd_solver == 'arpack'.

iterated_power

Number of iterations for the power method computed by svd_solver == 'randomized'.

random_state

If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if NULL, the random number generator is the RandomState instance used by np.random. Used when svd_solver == 'arpack' or 'randomized'.
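To show how the number of power iterations and the random seed interact in a randomized solver, here is a minimal base R sketch in the spirit of Halko et al. It is not the h2o4gpu or scikit-learn implementation: randomized_svd, n_iter, oversample and seed are hypothetical names, with n_iter standing in for iterated_power and seed for an integer random_state.

```r
# Minimal randomized SVD sketch (illustrative; all names are hypothetical).
randomized_svd <- function(X, k, n_iter = 4L, oversample = 10L, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)            # fixed seed => repeatable result
  p <- min(ncol(X), k + oversample)             # size of the random sketch
  omega <- matrix(rnorm(ncol(X) * p), ncol(X), p)
  Q <- qr.Q(qr(X %*% omega))                    # orthonormal basis for the range
  for (i in seq_len(n_iter)) {                  # power iterations sharpen the basis
    Q <- qr.Q(qr(crossprod(X, Q)))              # t(X) %*% Q, re-orthonormalised
    Q <- qr.Q(qr(X %*% Q))
  }
  s <- svd(crossprod(Q, X))                     # small SVD of t(Q) %*% X
  list(d = s$d[seq_len(k)],
       u = (Q %*% s$u)[, seq_len(k), drop = FALSE],
       v = s$v[, seq_len(k), drop = FALSE])
}

# Simulated rank-3 matrix plus small noise.
set.seed(1)
A <- matrix(rnorm(300 * 3), 300, 3) %*% matrix(rnorm(3 * 20), 3, 20)
X <- A + 0.01 * matrix(rnorm(300 * 20), 300, 20)

r1 <- randomized_svd(X, k = 3L, n_iter = 4L, seed = 123L)
r2 <- randomized_svd(X, k = 3L, n_iter = 4L, seed = 123L)
exact <- svd(X)$d[1:3]
print(rbind(randomized = r1$d, exact = exact))
```

Calling the sketch twice with the same seed gives identical singular values, and on this well-separated example they agree closely with the exact SVD.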

verbose

Whether to print verbose output.

backend

Which backend to use. Options are 'auto', 'sklearn', 'h2o4gpu'. The backend actually used is saved as an attribute.

gpu_id

ID of the GPU on which the algorithm should run. Only used by h2o4gpu backend.