chickn (version 1.2.3)

hcc_parallel: hcc_parallel

Description

Compressed Hierarchical Clustering.

Usage

hcc_parallel(
  Data,
  W,
  K,
  maxLevel,
  ncores = 2,
  DIR_output = tempfile(),
  hybrid = FALSE,
  verbose = FALSE,
  ...
)

Arguments

Data

A Filebacked Big Matrix n x N. Data signals are stored in the matrix columns.

W

A frequency matrix m x n with frequency vectors in rows.

K

Number of clusters at each call of the clustering algorithm.

maxLevel

Maximum number of hierarchical levels.

ncores

Number of cores. Default is 2.

DIR_output

An output directory.

hybrid

Logical parameter. If TRUE, K decreases progressively over hierarchical levels as \(\lceil \frac{K}{level} \rceil\). Default is FALSE.

verbose

Logical that indicates whether to display the processing steps.

...

Additional arguments passed on to COMPR.
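
The effect of hybrid on K can be illustrated with a short base R sketch. The helper k_schedule is hypothetical and not part of chickn; it assumes that levels are numbered from 1 and only evaluates the formula stated above.

```r
# Hypothetical helper (not part of chickn): number of clusters requested
# at each hierarchical level, assuming levels are numbered 1..maxLevel.
k_schedule <- function(K, maxLevel, hybrid = FALSE) {
  levels <- seq_len(maxLevel)
  if (hybrid) {
    ceiling(K / levels)   # K decreases as ceiling(K / level)
  } else {
    rep(K, maxLevel)      # K stays constant at every level
  }
}

k_hybrid <- k_schedule(K = 8, maxLevel = 4, hybrid = TRUE)
k_fixed  <- k_schedule(K = 8, maxLevel = 4, hybrid = FALSE)
print(k_hybrid)  # 8 4 3 2
print(k_fixed)   # 8 8 8 8
```

With K = 8 and maxLevel = 4, the hybrid schedule requests 8, 4, 3 and 2 clusters at levels 1 to 4, while the default keeps K = 8 throughout.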

Value

The cluster assignment as a list of clusters with the corresponding data vector indices.
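
As a rough illustration of this kind of structure, the base R sketch below turns a toy assignment vector into a list of index vectors, one per cluster. The vector assignment is made-up data, and the exact shape and naming of the list returned by hcc_parallel is an assumption here, not something confirmed by this page.

```r
# Toy data (assumption): cluster label of each of six data vectors.
assignment <- c(1, 2, 1, 3, 2, 1)

# Group the data vector indices by cluster label.
clusters <- split(seq_along(assignment), assignment)

# Indices of the data vectors assigned to cluster 1.
print(clusters[["1"]])

# Cluster sizes.
sizes <- lengths(clusters)
print(sizes)
```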

Details

This function provides a divisive hierarchical implementation of COMPR. Parallel computations are performed with the parallel package, using 'FORK' clusters on Unix-like platforms or 'PSOCK' clusters on Windows. The function writes the following files to the DIR_output directory:

  • 'Cluster_assign_out.bk' is a Filebacked Big Matrix N x (maxLevel + 1), which stores the cluster assignment at each hierarchical level.

  • 'Centroids_out.bk' is a Filebacked Big Matrix with the resulting cluster centroids in columns.
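
The platform-dependent choice between 'FORK' and 'PSOCK' clusters described above can be sketched with the parallel package as follows. This is a minimal illustration of the general pattern, not the code used inside hcc_parallel; the toy workload (squaring four numbers) is an assumption chosen only to keep the example self-contained.

```r
library(parallel)

# Pick the cluster type by platform: 'PSOCK' on Windows, 'FORK' elsewhere.
cluster_type <- if (.Platform$OS.type == "windows") "PSOCK" else "FORK"

# Start two workers, run a toy task on them, then shut the cluster down.
cl <- makeCluster(2, type = cluster_type)
squares <- parLapply(cl, 1:4, function(i) i^2)
stopCluster(cl)

print(unlist(squares))
```

'FORK' workers share the parent process memory at creation time, whereas 'PSOCK' workers start as fresh R sessions, so any data or packages they need must be sent or loaded explicitly.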

References

Keriven, N., Tremblay, N., Traonmilin, Y., Gribonval, R. (2016). Compressive K-means. CoRR (DBLP key: journals/corr/KerivenTTG16).

See Also

COMPR

Examples

# NOT RUN {
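library(chickn)
# Load the example data; data signals are stored in the matrix columns
data("UPS2")
N = ncol(UPS2)
n = nrow(UPS2)
# Store the data as a Filebacked Big Matrix
X_FBM = bigstatsr::FBM(init = UPS2, ncol = N, nrow = n)$save()
K_W1 = Nystrom_kernel(Data = X_FBM, c = 14, l = 7, s = 5,
                      max_neighbors = 3, ncores = 1, kernel = 'Gaussian')$K_W1
# Generate m = 20 frequency vectors (rows of W)
W = GenerateFrequencies(Data = K_W1, m = 20, N0 = ncol(X_FBM))$W
# Cluster with K = 2 clusters per call over at most 4 hierarchical levels
C = hcc_parallel(Data = K_W1, W = W, K = 2, maxLevel = 4,
                 DIR_output = tempfile(), ncores = 2)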
# }