hiReadsProcessor (version 1.8.2)

clusterSites: Cluster/Correct values within a window based on their frequency given discrete factors

Description

Given a group of discrete factors (e.g. position IDs) and integer values, the function tries to correct or cluster the integer values based on their frequency within a defined window size.
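To make the idea concrete, here is a minimal base R sketch of frequency-based clustering within a window. It is not the package's implementation: the helper name `cluster_by_frequency`, its greedy "most frequent value absorbs its neighbours" rule, and the tie-breaking by smaller value are all assumptions made for illustration only.

```r
# Hypothetical helper, NOT part of hiReadsProcessor: a greedy sketch of
# clustering integer values by frequency, done separately for each posID.
cluster_by_frequency <- function(posID, value, windowSize = 5L) {
  clustered <- value
  for (id in unique(posID)) {
    idx <- which(posID == id)
    freq <- table(value[idx])
    vals <- as.integer(names(freq))
    # Candidate centers: most frequent first, ties broken by smaller value
    centers <- vals[order(-as.integer(freq), vals)]
    assigned <- rep(FALSE, length(idx))
    for (center in centers) {
      # Skip values that were already absorbed by a more frequent neighbour
      if (all(assigned[value[idx] == center])) next
      hit <- !assigned & abs(value[idx] - center) <= windowSize
      clustered[idx[hit]] <- center
      assigned[hit] <- TRUE
    }
  }
  data.frame(posID, value, clusteredValue = clustered,
             stringsAsFactors = FALSE)
}

posID <- c("chr1-", "chr1-", "chr1-", "chr1-", "chr2+")
value <- c(1000L, 1000L, 1003L, 5832L, 1000L)
res <- cluster_by_frequency(posID, value)
print(res)
# On chr1-, 1003 falls within 5 of the more frequent 1000 and is corrected
# to it; 5832 is too far away and stays as its own cluster.
```

Because each posID is processed independently, the 1000 on chr2+ is never pooled with the values on chr1-, which mirrors the role the discrete factors play in the description above.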

Usage

clusterSites(posID = NULL, value = NULL, grouping = NULL, psl.rd = NULL,
  weight = NULL, windowSize = 5L, byQuartile = FALSE, quartile = 0.7,
  parallel = TRUE, sonicAbund = FALSE)

Arguments

posID
a vector of groupings for the value parameter (e.g. Chr,strand). Required if the psl.rd parameter is not defined.
value
an integer vector of values that need to be corrected or clustered (e.g. positions). Required if the psl.rd parameter is not defined.
grouping
an additional grouping vector, of the same length as posID or psl.rd, by which to pool the rows (e.g. sample names). Default is NULL.
psl.rd
a GRanges object returned from getIntegrationSites. Default is NULL.
weight
a numeric vector of weights to use when calculating frequency of value by posID and grouping if specified. Default is NULL.
windowSize
size of window within which values should be corrected or clustered. Default is 5.
byQuartile
flag denoting whether the quartile-based technique should be employed. See notes for details. Default is FALSE.
quartile
if byQuartile=TRUE, then the quartile which serves as the threshold. Default is 0.70.
parallel
use a parallel backend to perform the calculation with BiocParallel. Defaults to TRUE. If no parallel backend is registered, a serial version is run using SerialParam. The process is split by the grouping column.
sonicAbund
calculate breakpoint abundance using getSonicAbund. Default is FALSE.

Value

  • a data frame with clusteredValues and frequency shown alongside the original input. If the psl.rd parameter is defined, a GRanges object is returned with three new columns appended at the end: clusteredPosition, clonecount, and clusterTopHit (a representative for a given cluster, chosen as the best scoring hit).

See Also

findIntegrations, getIntegrationSites, otuSites, isuSites, crossOverCheck, pslToRangedObject, getSonicAbund

Examples

library(hiReadsProcessor)

# cluster positions per posID, pooling rows by the grouping vector
clusterSites(posID=c('chr1-','chr1-','chr1-','chr2+','chr15-',
'chr16-','chr11-'), value=c(rep(1000,2),5832,1000,12324,65738,928042),
grouping=c('a','a','a','b','b','b','c'))
data(psl)
psl <- psl[sample(nrow(psl),100),]
psl.rd <- getIntegrationSites(pslToRangedObject(psl))
psl.rd$grouping <- sub("(.+)-.+","\\1",psl.rd$qName)
clusterSites(grouping=psl.rd$grouping, psl.rd=psl.rd)