Principal Component Analysis (PCA)
h2o4gpu.pca(n_components = 2L, copy = TRUE, whiten = FALSE,
svd_solver = "arpack", tol = 0, iterated_power = "auto",
random_state = NULL, verbose = FALSE, backend = "h2o4gpu",
gpu_id = 0L)
n_components: Desired dimensionality of the output data.
copy: If FALSE, data passed to fit are overwritten, and running fit(X).transform(X) will not yield the expected results; use fit_transform(X) instead.
whiten: When TRUE (FALSE by default), the components_ vectors are multiplied by the square root of n_samples and divided by the singular values to ensure uncorrelated outputs with unit component-wise variances.
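The rescaling described for whiten can be checked numerically. The following is a minimal base-R sketch, not h2o4gpu's implementation: it centers a made-up matrix, projects it onto the top k right singular vectors, and rescales the scores by sqrt(n_samples) and the singular values as the description states. The data, k, and all variable names are illustrative assumptions.

```r
set.seed(1)
n <- 200

# Made-up correlated data: 3 columns mixed through a fixed matrix.
mix <- matrix(c(2, 0, 0, 1, 1, 0, 0.5, 0.3, 0.2), nrow = 3)
X <- matrix(rnorm(n * 3), nrow = n) %*% mix

# Center the columns and take the SVD of the centered data.
Xc <- scale(X, center = TRUE, scale = FALSE)
s <- svd(Xc)

# Keep the top k components and project the data onto them.
k <- 2
V <- s$v[, 1:k, drop = FALSE]
scores <- Xc %*% V

# Whitening: divide each score column by its singular value,
# then multiply by sqrt(n_samples).
whitened <- sweep(scores, 2, s$d[1:k], "/") * sqrt(n)

# Covariance with divisor n_samples, before and after whitening.
cov_scores <- crossprod(scores) / n
cov_whitened <- crossprod(whitened) / n
print(round(cov_scores, 4))
print(round(cov_whitened, 4))
```

With the divisor n_samples, cov_whitened should be close to the identity matrix, while cov_scores is diagonal with unequal variances.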
svd_solver: The SVD solver to use.
  'auto': the solver is selected by a default policy based on X.shape and n_components. If the input data is larger than 500x500 and the number of components to extract is lower than 80 percent of the smallest dimension of the data, the more efficient 'randomized' method is used. Otherwise the exact full SVD is computed and optionally truncated afterwards.
  'full': runs exact full SVD by calling the standard LAPACK solver via scipy.linalg.svd and selects the components by postprocessing.
  'arpack': runs SVD truncated to n_components by calling the ARPACK solver via scipy.sparse.linalg.svds. It requires strictly 0 < n_components < columns.
  'randomized': runs randomized SVD by the method of Halko et al.
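The 'auto' policy and the 'arpack' constraint can be written as two small checks. This is a sketch in base R under one assumption: "larger than 500x500" is read here as "the largest dimension exceeds 500". The backend may apply the rule differently. The function names choose_svd_solver and arpack_ok are hypothetical and are not part of h2o4gpu.

```r
# Hypothetical helper: which solver the 'auto' policy would pick,
# assuming "larger than 500x500" means max(dim) > 500.
choose_svd_solver <- function(n_rows, n_cols, n_components) {
  if (max(n_rows, n_cols) <= 500) {
    return("full")  # small problem: exact SVD
  }
  if (n_components >= 1 && n_components < 0.8 * min(n_rows, n_cols)) {
    return("randomized")  # few components from a large matrix
  }
  "full"  # many components requested: exact SVD, truncated afterwards
}

# Hypothetical helper: the 'arpack' requirement 0 < n_components < columns.
arpack_ok <- function(n_cols, n_components) {
  n_components > 0 && n_components < n_cols
}

choose_svd_solver(100, 50, 2)        # "full"
choose_svd_solver(10000, 1000, 10)   # "randomized"
choose_svd_solver(10000, 1000, 900)  # "full" (900 is not below 80% of 1000)
arpack_ok(4, 4)                      # FALSE
```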
tol: Tolerance for singular values computed when svd_solver == 'arpack'.
iterated_power: Number of iterations for the power method used when svd_solver == 'randomized'.
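To show what the power-method iterations do, here is a minimal base-R sketch of a randomized SVD in the style of Halko et al. It is an illustration only and does not reproduce the solver that h2o4gpu or scikit-learn runs. The function randomized_svd and its arguments n_oversamples and n_iter are hypothetical names, with n_iter playing the role of iterated_power.

```r
# Hypothetical sketch of randomized SVD with power iterations.
randomized_svd <- function(A, k, n_oversamples = 10, n_iter = 4) {
  l <- min(k + n_oversamples, nrow(A), ncol(A))
  # Random test matrix used to sample the range of A.
  omega <- matrix(rnorm(ncol(A) * l), nrow = ncol(A))
  Q <- qr.Q(qr(A %*% omega))
  # Power iterations sharpen the separation between large and small
  # singular values; re-orthonormalizing keeps the basis stable.
  for (i in seq_len(n_iter)) {
    Q <- qr.Q(qr(crossprod(A, Q)))  # orthonormal basis for t(A) %*% Q
    Q <- qr.Q(qr(A %*% Q))
  }
  # Exact SVD of the small projected matrix, mapped back to A's space.
  B <- crossprod(Q, A)
  s <- svd(B)
  idx <- seq_len(k)
  list(u = (Q %*% s$u)[, idx, drop = FALSE],
       d = s$d[idx],
       v = s$v[, idx, drop = FALSE])
}

# Test matrix with known singular values: 10, 5, 2, then a flat tail.
set.seed(42)
U <- qr.Q(qr(matrix(rnorm(200 * 20), nrow = 200)))
V <- qr.Q(qr(matrix(rnorm(50 * 20), nrow = 50)))
d_true <- c(10, 5, 2, rep(0.01, 17))
A <- U %*% diag(d_true) %*% t(V)

fit <- randomized_svd(A, k = 3)
print(fit$d)
```

On this test matrix the three leading singular values are well separated from the tail, so the approximation should agree closely with d_true[1:3]. On matrices with slowly decaying singular values, more iterations generally trade extra matrix products for accuracy.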
random_state: If int, random_state is the seed used by the random number generator; if a RandomState instance, random_state is the random number generator; if NULL, the random number generator is the RandomState instance used by np.random. Used when svd_solver == 'arpack' or 'randomized'.
verbose: Whether to produce verbose output.
backend: Which backend to use. Options are 'auto', 'sklearn', and 'h2o4gpu'. The backend actually used is saved as an attribute.
gpu_id: ID of the GPU on which the algorithm should run. Only used by the h2o4gpu backend.