The function performs a principal component analysis (PCA) using the prcomp function, with a percent methylation matrix as input.
PCASamples(.Object, screeplot=FALSE, adj.lim=c(0.0004,0.1), scale=TRUE,
  center=TRUE, comp=c(1,2), transpose=TRUE, sd.filter=TRUE,
  sd.threshold=0.5, filterByQuantile=TRUE, obj.return=FALSE, chunk.size)

# S4 method for methylBase
PCASamples(.Object, screeplot, adj.lim, scale, center,
  comp, transpose, sd.filter, sd.threshold, filterByQuantile, obj.return)

# S4 method for methylBaseDB
PCASamples(.Object, screeplot = FALSE,
  adj.lim = c(4e-04, 0.1), scale = TRUE, center = TRUE, comp = c(1, 2),
  transpose = TRUE, sd.filter = TRUE, sd.threshold = 0.5,
  filterByQuantile = TRUE, obj.return = FALSE, chunk.size = 1e+06)
.Object: a methylBase or methylBaseDB object.
screeplot: a logical value indicating whether to plot the variances against the number of the principal component (default: FALSE).
adj.lim: a vector indicating the proportional adjustment of xlim (adj.lim[1]) and ylim (adj.lim[2]). This is primarily used for adjusting the visibility of sample labels on the PCA plot (default: c(0.0004,0.1)).
scale: logical indicating whether prcomp should scale the data to have unit variance (default: TRUE).
center: logical indicating whether prcomp should center the data (default: TRUE).
comp: vector of integers with 2 elements specifying which components are to be plotted (default: c(1,2)).
transpose: if TRUE (default), the percent methylation matrix will be transposed; this is equivalent to doing PCA on variables that are regions/bases. The resulting plot will show the location of samples in the new coordinate system. If FALSE, the variables for the matrix will be samples, and the resulting plot will show how each sample (variable) contributes to the principal component. Samples that are highly correlated should have similar contributions to the principal components.
sd.filter: if TRUE, the bases/regions with low variation will be discarded prior to PCA (default: TRUE).
sd.threshold: a numeric value. If filterByQuantile is TRUE, the value should be between 0 and 1, and the features whose standard deviation is less than the quantile denoted by sd.threshold will be removed. If filterByQuantile is FALSE, the features whose standard deviation is less than the value of sd.threshold will be removed (default: 0.5).
filterByQuantile: a logical determining whether sd.threshold is to be interpreted as a quantile of all standard deviation values from bases/regions (the default), or as an absolute value.
obj.return: logical indicating whether the result of the prcomp function should be returned (default: FALSE).
chunk.size: number of rows to be taken as a chunk for processing methylBaseDB objects (default: 1e6).
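The interaction between sd.threshold and filterByQuantile can be illustrated with a small base R sketch. This is not methylKit's internal code: the helper filter_by_sd, the synthetic matrix, and the sample names are all assumptions made for illustration, and the comparison follows the wording above (rows whose standard deviation is less than the cutoff are dropped).

```r
# Synthetic percent-methylation matrix (made-up values):
# rows are bases/regions, columns are samples.
set.seed(1)
meth <- matrix(runif(40, min = 0, max = 100), nrow = 10,
               dimnames = list(NULL, c("s1", "s2", "s3", "s4")))
meth[1, ] <- 50  # a row with zero variation across samples

# Hypothetical helper (NOT a methylKit function) mirroring the documented options.
filter_by_sd <- function(mat, sd.threshold = 0.5, filterByQuantile = TRUE) {
  sds <- apply(mat, 1, sd)  # per-row standard deviation
  cutoff <- if (filterByQuantile) {
    quantile(sds, sd.threshold)  # threshold read as a quantile of all row SDs
  } else {
    sd.threshold                 # threshold read as an absolute SD value
  }
  # Keep rows at or above the cutoff, i.e. drop rows with SD less than it.
  mat[sds >= cutoff, , drop = FALSE]
}

by_quantile <- filter_by_sd(meth, sd.threshold = 0.5, filterByQuantile = TRUE)
by_value    <- filter_by_sd(meth, sd.threshold = 10,  filterByQuantile = FALSE)
```

With filterByQuantile = TRUE the number of rows kept depends only on the rank of each row's standard deviation, while with filterByQuantile = FALSE it depends on the scale of the data, so the absolute form needs a threshold chosen in percent methylation units.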
The form of the value returned by PCASamples is the summary of principal component analysis by prcomp.
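To make the prcomp result and the effect of transposition more concrete, here is a minimal base R sketch on a synthetic matrix. It only approximates what the transpose argument describes; the matrix, region names, and sample names are invented for illustration, and this is not a reproduction of PCASamples itself.

```r
set.seed(42)
# Synthetic percent-methylation matrix (made-up values): 20 regions x 4 samples.
meth <- matrix(runif(80, 0, 100), nrow = 20,
               dimnames = list(paste0("region", 1:20), paste0("sample", 1:4)))

# Analogue of transpose = TRUE: samples are rows (observations),
# regions/bases are the variables.
pca_samples <- prcomp(t(meth), scale. = TRUE, center = TRUE)
dim(pca_samples$x)  # one row of scores per sample
summary(pca_samples)$importance["Proportion of Variance", ]

# Analogue of transpose = FALSE: samples are the variables,
# so the loadings show how each sample contributes to each component.
pca_vars <- prcomp(meth, scale. = TRUE, center = TRUE)
pca_vars$rotation[, 1:2]  # contribution of each sample to PC1 and PC2
```

In the first case the sample coordinates are found in the x element of the prcomp result; in the second case the per-sample information is in the rotation element.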
The parameter chunk.size is only used when working with methylBaseDB objects, as they are read in chunk by chunk to enable processing of large objects that are stored as flat-file databases. By default, chunk.size is set to 1M rows, which should work for most systems. If you encounter memory problems, or have a large amount of memory available, feel free to adjust chunk.size.
# NOT RUN {
library(methylKit)
data(methylKit)

# Run PCA after filtering out rows with low variation: rows whose standard
# deviation is below the 50th percentile of the standard deviation distribution
PCASamples(methylBase.obj, screeplot=FALSE, adj.lim=c(0.0004,0.1),
           scale=TRUE, center=TRUE, comp=c(1,2), transpose=TRUE, sd.filter=TRUE,
           sd.threshold=0.5, filterByQuantile=TRUE, obj.return=FALSE)
# }