handwriter (version 3.2.4)

get_clusters_batch: get_clusters_batch

Description

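Assigns the graphs of every document in input_dir to the clusters of a cluster template and saves the cluster assignments to output_dir.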

Usage

get_clusters_batch(
  template,
  input_dir,
  output_dir,
  writer_indices = NULL,
  doc_indices = NULL,
  num_cores = 1,
  save_master_file = FALSE
)

Value

A list of cluster assignments

Arguments

template

A cluster template created with make_clustering_template

input_dir

A directory containing graphs created with process_batch_dir

output_dir

Output directory for cluster assignments

writer_indices

Optional. A vector of start and end indices for the writer id in the graph file names.

doc_indices

Optional. Vector of start and end indices for the document id in the graph file names.

num_cores

Integer number of cores to use for parallel processing

save_master_file

TRUE or FALSE. If TRUE, a master file named 'all_clusters.rds' containing the cluster assignments for all documents in the input directory will be saved to the output directory. If FALSE, a master file will not be saved, but the individual files for each document in the input directory will still be saved to the output directory.
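
The writer_indices and doc_indices arguments can be easier to choose after checking them against one file name. The sketch below uses base R's substr on a hypothetical graph file name; it assumes the indices are 1-based, inclusive character positions, which is not confirmed by this page, and the file name is made up for illustration.

```r
# Hypothetical graph file name (an assumption; real names depend on your data).
graph_file <- "w0001_s01_pWOZ_r01.rds"

# Start and end positions in the file name, as c(start, end).
writer_indices <- c(2, 5)
doc_indices <- c(7, 18)

# ASSUMPTION: indices are 1-based and inclusive, like base R's substr().
writer_id <- substr(graph_file, writer_indices[1], writer_indices[2])
doc_id <- substr(graph_file, doc_indices[1], doc_indices[2])

print(writer_id)  # "0001"
print(doc_id)     # "s01_pWOZ_r01"
```

If the extracted strings are not the writer and document ids you expect, adjust the indices before running the batch.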

Examples

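if (FALSE) {
  # Load a cluster template created with make_clustering_template
  template <- readRDS('path/to/template.rds')

  # Writer id at characters 2-5 and document id at characters 7-18 of each file name
  get_clusters_batch(template = template, input_dir = 'path/to/dir',
                     output_dir = 'path/to/dir', writer_indices = c(2, 5),
                     doc_indices = c(7, 18), num_cores = 1)

  # Same call with different index positions and 5 cores for parallel processing
  get_clusters_batch(template = template, input_dir = 'path/to/dir',
                     output_dir = 'path/to/dir', writer_indices = c(1, 4),
                     doc_indices = c(5, 10), num_cores = 5)
}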
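
When save_master_file = TRUE is used, the master file described under Arguments could be read back roughly as follows. This is a minimal sketch: 'path/to/dir' is the placeholder output directory from the examples, and nothing is assumed about the loaded object beyond it holding the cluster assignments for all documents.

```r
# Placeholder output directory, as in the examples above.
output_dir <- "path/to/dir"

# The master file name 'all_clusters.rds' comes from the save_master_file argument.
master_path <- file.path(output_dir, "all_clusters.rds")

# Only read the file if it exists, i.e. if save_master_file = TRUE was used.
if (file.exists(master_path)) {
  all_clusters <- readRDS(master_path)
  str(all_clusters, max.level = 1)  # inspect the top-level structure
} else {
  message("No master file found at ", master_path)
}
```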