beadarrayMSV (version 1.0.3)

findPolyploidClusters: K-means clustering

Description

Wrapper for kmeans that allows samples of low precision to be left out of the clustering and subsequently assigned to the resulting clusters.

Usage

findPolyploidClusters(X, indSE = rep(TRUE, nrow(X)), centers,
    plot = FALSE, wss.update = TRUE, ...)

Arguments

X
Matrix with data for a single marker to be clustered, with three columns holding theta, intensity, and SE vectors (in that order) as from the assayData slot of an "
indSE
Logical vector indicating the samples on which to base the clustering
centers
Numeric vector with theta starting values for the clustering
plot
If TRUE, a histogram with bins encompassing the initial centre points is plotted
wss.update
The within-cluster sums of squares are returned from kmeans but not actually used in the genotype calling. If FALSE, time is saved by not recalculating the sums of squares after incl
...
Additional arguments to hist

Value

Details

Usually called from within the function callGenotypes or related functions. There, the column of intensities is divided by twice its median value times a scaling factor rPenalty (see setGenoOptions), which by default gives relatively higher weight to the theta dimension during clustering. All samples left out of the clustering are subsequently incorporated into the clusters. Leaving out samples of low precision may make the resulting clusters more accurate.
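The idea of clustering on the reliable samples only and then assigning the remaining samples can be sketched in base R. This is a minimal illustration, not the beadarrayMSV implementation: the helper clusterThenAssign, the simulated data, the rPenalty value of 2, and the SE cut-off of 0.08 are all assumptions made for the example. Left-out samples are assigned to the nearest fitted centre, which is one plausible way to incorporate them.

```r
# Hypothetical helper (not part of beadarrayMSV): cluster the samples
# flagged in indSE with kmeans, then assign every left-out sample to its
# nearest fitted centre.
clusterThenAssign <- function(X, indSE, centers) {
  # Starting centres: the supplied theta values paired with the median
  # scaled intensity of the samples used for clustering (an assumption)
  start <- cbind(centers, median(X[indSE, 2]))
  fit <- kmeans(X[indSE, 1:2, drop = FALSE], centers = start)

  # Squared Euclidean distance from every sample to every fitted centre
  d2 <- sapply(seq_len(nrow(fit$centers)), function(k) {
    (X[, 1] - fit$centers[k, 1])^2 + (X[, 2] - fit$centers[k, 2])^2
  })

  # Nearest centre for all samples, including those left out
  cluster <- max.col(-d2, ties.method = "first")
  # Samples that took part in the clustering keep the kmeans labels
  cluster[indSE] <- fit$cluster
  list(cluster = cluster, centers = fit$centers)
}

# Simulated single-marker data with three well-separated theta groups
set.seed(1)
theta <- c(rnorm(20, 0, 0.02), rnorm(20, 0.5, 0.02), rnorm(20, 1, 0.02))
intensity <- runif(60, 0.8, 1.2)
se <- runif(60, 0, 0.1)

# Scale intensity by twice its median times rPenalty, as described above
rPenalty <- 2                      # assumed value, for illustration only
sclR <- median(intensity) * 2 * rPenalty
X <- cbind(theta, intensity / sclR, se)

indSE <- se < 0.08                 # hypothetical precision cut-off
res <- clusterThenAssign(X, indSE, centers = c(0, 0.5, 1))
table(res$cluster)
```

Because the scaled intensities span a much narrower range than theta, the distance calculation is dominated by the theta dimension, which is the effect the scaling is meant to achieve.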

See Also

callGenotypes, getCenters, kmeans

Examples

#Read pre-processed data directly into AlleleSetIllumina object
rPath <- system.file("extdata", package="beadarrayMSV")
normOpts <- setNormOptions()
dataFiles <- makeFilenames('testdata',normOpts,rPath)
beadFile <- paste(rPath,'beadData_testdata.txt',sep='/')
beadInfo <- read.table(beadFile,sep='\t',header=TRUE,as.is=TRUE)
BSRed <- createAlleleSetFromFiles(dataFiles[1:4],markers=1:10,beadInfo=beadInfo)

#Generate list of marker categories
gO <- setGenoOptions()
polyCent <- generatePolyCenters(ploidy=gO$ploidy)
print(polyCent)

#Estimate list of likely center points for an MSV-5 marker
ind <- 2
dev.new(); par(mfrow=c(3,1),mai=c(.5,.5,.5,.1))
polyCl <- findClusters(assayData(BSRed)$theta[ind,],
    breaks=seq(-0.25,1.25,0.04),plot=TRUE)
print(polyCl)

#Clustering using all samples
#Scale by twice the median intensity times rPenalty (see Details)
sclR <- median(assayData(BSRed)$intensity[ind,],na.rm=TRUE)*2*gO$rPenalty
X <- matrix(cbind(assayData(BSRed)$theta[ind,],
                  assayData(BSRed)$intensity[ind,]/sclR,
                  assayData(BSRed)$SE[ind,]),ncol=3)
clObj <- findPolyploidClusters(X,centers=polyCl$clPeaks,plot=TRUE)
plot(X[,1],X[,2],col=clObj$cluster)
print(clObj)