chickn (version 1.2.3)

hcc_parallel: hcc_parallel

Description

Compressed Hierarchical Clustering.

Usage

hcc_parallel(
  Data,
  W,
  K,
  maxLevel,
  ncores = 2,
  DIR_output = tempfile(),
  hybrid = FALSE,
  verbose = FALSE,
  ...
)

Arguments

Data

A Filebacked Big Matrix n x N. Data signals are stored in the matrix columns.

W

A frequency matrix m x n with frequency vectors in rows.

K

Number of clusters at each call of the clustering algorithm.

maxLevel

Maximum number of hierarchical levels.

ncores

Number of cores. By default 2.

DIR_output

An output directory.

hybrid

Logical parameter. If TRUE, K decreases progressively over hierarchical levels as Klevel. Default is FALSE.

verbose

Logical that indicates whether to display the processing steps.

...

Additional arguments passed on to COMPR.

Value

The cluster assignment as a list of clusters with the corresponding data vector indices.
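
As a rough illustration of how such a return value might be used, the sketch below converts a list of clusters into one label per data vector. The `clusters` list is hypothetical toy data, and the layout (plain integer index vectors, each data vector in exactly one cluster) is an assumption that this page does not confirm.

```r
# Hypothetical cluster assignment: each list element holds the
# indices of the data vectors in one cluster (assumed layout).
clusters <- list(c(1L, 4L, 6L), c(2L, 3L), c(5L))

# Number of data vectors, assuming each one appears in exactly one cluster.
N <- length(unlist(clusters))

# Flat label vector: labels[i] is the cluster number of data vector i.
labels <- integer(N)
for (k in seq_along(clusters)) {
  labels[clusters[[k]]] <- k
}

# Cluster sizes, one entry per cluster.
sizes <- lengths(clusters)
```

A flat label vector of this kind is often easier to pass to plotting or evaluation routines than the list form.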

Details

This function provides a divisive hierarchical implementation of COMPR. Parallel computations are performed with the parallel package, using 'FORK' clusters (Linux-like platforms) or 'PSOCK' clusters (Windows platform). The function writes the following files to the DIR_output directory:

  • 'Cluster_assign_out.bk' is a Filebacked Big Matrix N x (maxLevel + 1), which stores the cluster assignment at each hierarchical level.

  • 'Centroids_out.bk' is a Filebacked Big Matrix with the resulting cluster centroids in columns.
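
The platform-dependent choice between 'FORK' and 'PSOCK' clusters can be sketched with the parallel package alone. This is a minimal illustration, not the package's internal code: the `blocks` workload is a hypothetical stand-in for the per-cluster computations, and the variable names are chosen here for illustration.

```r
library(parallel)

# Pick the cluster type by platform: 'PSOCK' on Windows, 'FORK' elsewhere.
cluster_type <- if (.Platform$OS.type == "windows") "PSOCK" else "FORK"

ncores <- 2
cl <- makeCluster(ncores, type = cluster_type)

# Toy workload standing in for independent per-cluster computations
# (hypothetical; the real function clusters data instead).
blocks <- list(1:10, 11:20, 21:30)
block_sums <- parLapply(cl, blocks, sum)

# Always release the worker processes when done.
stopCluster(cl)
```

'FORK' workers share the parent process memory at creation time, whereas 'PSOCK' workers start as fresh R sessions, so any data or packages they need must be exported or loaded explicitly.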

References

DBLP:journals/corr/KerivenTTG16

See Also

COMPR

Examples

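# NOT RUN {
library(chickn)

# Load the UPS2 example data
data("UPS2")
N <- ncol(UPS2)
n <- nrow(UPS2)
# Store the data as a Filebacked Big Matrix (signals in columns)
X_FBM <- bigstatsr::FBM(init = UPS2, ncol = N, nrow = n)$save()
# Nystrom kernel approximation of the data
K_W1 <- Nystrom_kernel(Data = X_FBM, c = 14, l = 7, s = 5,
                       max_neighbors = 3, ncores = 1, kernel = 'Gaussian')$K_W1
# Frequency matrix with m = 20 frequency vectors
W <- GenerateFrequencies(Data = K_W1, m = 20, N0 = ncol(X_FBM))$W
# Hierarchical clustering: K = 2 clusters per call, at most 4 levels
C <- hcc_parallel(Data = K_W1, W = W, K = 2, maxLevel = 4,
                  DIR_output = tempfile(), ncores = 2)
# }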