contiBAIT (version 1.0.0)

clusterContigs,StrandStateMatrix-method: clusterContigs -- agglomeratively clusters contigs into linkage groups based on strand inheritance

Description

clusterContigs -- agglomeratively clusters contigs into linkage groups based on strand inheritance

Usage

## S4 method for signature 'StrandStateMatrix'
clusterContigs(object, similarityCutoff = 0.7,
  recluster = NULL, minimumLibraryOverlap = 5, randomise = TRUE,
  randomSeed = NULL, randomWeight = NULL, clusterParam = NULL,
  clusterBy = "hetero", verbose = TRUE)

Arguments

object
data.frame containing strand inheritance information for every contig (rows) in every library (columns). This should be the product of strandSeqFreqTable.
similarityCutoff
place contigs in a cluster when their strand state is at least this similar
recluster
Number of times to recluster and take the consensus of. If NULL, clustering is run only once.
minimumLibraryOverlap
for two contigs to be clustered together, the strand inheritance must be present for both contigs in at least this many libraries (in addition to their similarity being at least similarityCutoff)
randomise
whether to reorder contigs before clustering
randomSeed
random seed to initialize clustering
randomWeight
vector of weights for contigs for resampling. If NULL, uniform resampling is used. Typically this should be a measure of contig quality, such as library coverage, so that clustering tends to start from the better quality contigs.
clusterParam
optional BiocParallelParam specifying cluster to use for parallel execution. When NULL, execution will be serial.
clusterBy
Method for performing clustering. Default is 'hetero' (for comparing heterozygous calls to homozygous). Alternative is 'homo' (for comparison between the two homozygous calls).
verbose
prints function progress
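
The interaction between similarityCutoff and minimumLibraryOverlap can be sketched in base R. This is only an illustration under stated assumptions, not contiBAIT's internal code: the helper pairSimilarity is hypothetical, the similarity measure (fraction of shared libraries with identical calls) is an assumption, and the state encoding (1 = WW, 2 = WC, 3 = CC, NA = no call) is assumed for the toy data.

```r
# Toy strand states for two contigs across eight libraries.
# Assumed encoding for illustration: 1 = WW, 2 = WC, 3 = CC, NA = no call.
contigA <- c(1, 2, 2, NA, 3, 1, 2, NA)
contigB <- c(1, 2, 3, 1, 3, 1, NA, NA)

# Hypothetical helper (not part of contiBAIT): fraction of libraries with
# identical calls, counted only over libraries where both contigs have a call.
# Returns NA when too few libraries are shared to make the comparison.
pairSimilarity <- function(a, b, minimumLibraryOverlap = 5) {
  shared <- !is.na(a) & !is.na(b)
  if (sum(shared) < minimumLibraryOverlap) {
    return(NA_real_)
  }
  mean(a[shared] == b[shared])
}

# Five libraries are shared and four of them agree.
simAB <- pairSimilarity(contigA, contigB)

# With the default similarityCutoff of 0.7 this pair would qualify.
qualifies <- !is.na(simAB) && simAB >= 0.7
```

In this toy case, raising minimumLibraryOverlap to 6 makes the helper return NA, so the pair would not be compared at all, regardless of how similar the shared calls are.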

Value

  • LinkageGroupList of vectors containing labels of contigs belonging to each linkage group

Details

Note that a more stringent similarity cutoff will result in more clusters, and a longer run time, since at every iteration a distance is computed to the existing clusters. However, in lower-quality data, a more stringent cutoff may be necessary to reduce the number of contigs that are erroneously grouped.
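
The effect of the cutoff on cluster count can be seen in a small greedy clustering sketch. This is a simplified stand-in written for illustration, not the algorithm contiBAIT uses: greedyCluster is a hypothetical function, each cluster is represented by its first member rather than a consensus, and the toy matrix has no missing calls.

```r
# Toy strand-state matrix: four contigs (rows) by ten libraries (columns).
states <- rbind(
  c1 = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1),
  c2 = c(1, 2, 3, 1, 2, 3, 1, 2, 1, 1),
  c3 = c(1, 2, 3, 1, 2, 3, 1, 1, 1, 1),
  c4 = c(3, 1, 2, 3, 1, 2, 3, 1, 2, 3)
)

# Hypothetical greedy clustering (not contiBAIT's implementation).
# Each contig is compared with the representative of every existing cluster,
# so the number of comparisons grows with the number of clusters.
greedyCluster <- function(states, similarityCutoff) {
  representatives <- list()
  assignment <- integer(nrow(states))
  for (i in seq_len(nrow(states))) {
    sims <- vapply(representatives,
                   function(r) mean(states[i, ] == r),
                   numeric(1))
    if (length(sims) > 0 && max(sims) >= similarityCutoff) {
      # Join the most similar existing cluster.
      assignment[i] <- which.max(sims)
    } else {
      # No cluster is similar enough: start a new one.
      representatives[[length(representatives) + 1]] <- states[i, ]
      assignment[i] <- length(representatives)
    }
  }
  assignment
}

# Count clusters at increasingly stringent cutoffs.
cutoffs <- c(0.7, 0.85, 0.95)
nClusters <- sapply(cutoffs, function(k) length(unique(greedyCluster(states, k))))
```

On this toy input the cluster count rises as the cutoff increases, which matches the behaviour described above: a stricter cutoff splits near-identical contigs apart and leaves more clusters to compare against at each step.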

Note that clusterParam

Examples

data("exampleWCMatrix")
 
clusteredContigs <- clusterContigs(exampleWCMatrix, verbose=FALSE)
show(clusteredContigs)
show(clusteredContigs[[1]])

reorientedMatrix <- reorientLinkageGroups(clusteredContigs, exampleWCMatrix)
mergedLinkageGroups <- mergeLinkageGroups(clusteredContigs, reorientedMatrix[[1]])