Dpca

Performs distributed PCA on a data matrix partitioned into subsets.

We consider optimal subset selection in the setting where only one data subset can be used to represent the whole data set with minimum information loss. We devise a novel intersection-based criterion for selecting the optimal subset, called the FPC criterion, to handle the optimal sub-estimator in distributed principal component analysis; the resulting method is FPCdpca. The philosophy of the package is described in Guo G. (2025) <doi:10.1016/j.physa.2024.130308>.

Guangbao Guo

FPCdpca

The FPCdpca Criterion on Distributed Principal Component Analysis

Jiarui Li

Dpca function

Distributed Principal Component Analysis (DPCA): Dpca

Arguments

<dl>

<dt>data</dt>
<dd>A numeric matrix or data frame containing the data, where rows are observations and columns are variables.</dd>


<dt>K</dt>
<dd>Integer, the number of subsets to partition the data into.</dd>


<dt>nk</dt>
<dd>Integer, the size of each subset (number of rows per subset).</dd>

</dl>
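To illustrate how the `data`, `K`, and `nk` arguments relate to one another, here is a minimal base-R sketch. It is not the package's implementation of `Dpca`: the consecutive, non-overlapping row blocks, the simulated matrix `X`, and the helper names `idx` and `local_pca` are all assumptions made for illustration, and the per-subset PCA uses `stats::prcomp` rather than anything from FPCdpca.

```r
# Hypothetical sketch: NOT the FPCdpca implementation of Dpca().
set.seed(1)
n <- 100
p <- 4
X <- matrix(rnorm(n * p), nrow = n, ncol = p)  # rows = observations, columns = variables

K  <- 5    # number of subsets
nk <- 20   # rows per subset; here K * nk equals nrow(X)

# Assumption: subsets are consecutive, non-overlapping blocks of nk rows.
idx <- split(seq_len(K * nk), rep(seq_len(K), each = nk))

# Run an ordinary PCA on each subset and keep its loadings and component variances.
local_pca <- lapply(idx, function(rows) {
  fit <- prcomp(X[rows, , drop = FALSE], center = TRUE, scale. = FALSE)
  list(loadings = fit$rotation, variances = fit$sdev^2)
})

length(local_pca)             # one result per subset
dim(local_pca[[1]]$loadings)  # p x p loading matrix for the first subset
```

In this sketch `K * nk` matches the number of rows exactly; how `Dpca` itself treats a mismatch between `K * nk` and `nrow(data)` is not stated in this page.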

