ClusteredSample-class: ClusteredSample: A class representing a clustered FC Sample

Description

An object of class "ClusteredSample" represents a partitioning of a sample into clusters. We model a flow cytometry sample with a mixture of cell populations where a cell population is a normally distributed cluster. An object of class "ClusteredSample" therefore stores a list of clusters and other necessary parameters.

Arguments

Creating Object

An object of class "ClusteredSample" can be created using the following constructor: ClusteredSample(labels, centers=list(), covs=list(), sample=NULL, sample.id=NA_integer_)

labels: A vector of integers (from 1:num.clusters) indicating the cluster to which each point is allocated. This is usually obtained from a clustering algorithm.
centers: A list of length num.clusters storing the centers of the clusters. The ith entry of the list, centers[[i]], stores the center of the ith cluster. If not specified, the constructor estimates centers from sample.
covs: A list of length num.clusters storing the covariance matrices of the clusters. The ith entry of the list, covs[[i]], stores the covariance matrix of the ith cluster. If not specified, the constructor estimates covs from sample.
sample: A matrix, a data frame of observations, or an object of class flowFrame. Rows correspond to observations and columns correspond to variables. It must be passed to the constructor if either centers or covs is unspecified; the missing centers or covs are then estimated from sample.
sample.id: The index of the sample (relative to other samples of a cohort).
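When centers or covs is omitted, the constructor estimates them from sample. The sketch below shows one plausible way to compute such estimates in base R, using the per-cluster mean and the per-cluster sample covariance. The synthetic data and the choice of estimators are assumptions made for illustration; the package's internal estimation may differ.

```r
# Hypothetical stand-in data: 40 cells, 2 markers, 2 well-separated clusters
set.seed(1)
sample <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
                matrix(rnorm(40, mean = 5), ncol = 2))
colnames(sample) <- c("marker1", "marker2")
labels <- rep(1:2, each = 20)

num.clusters <- max(labels)

# Per-cluster mean vector, one list entry per cluster (assumed estimator)
centers <- lapply(seq_len(num.clusters), function(i)
  colMeans(sample[labels == i, , drop = FALSE]))

# Per-cluster sample covariance matrix (assumed estimator)
covs <- lapply(seq_len(num.clusters), function(i)
  cov(sample[labels == i, , drop = FALSE]))
```

Lists built this way have the shape the constructor expects for centers and covs: one entry per cluster, indexed by cluster label.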

Slots

num.clusters: The number of clusters in the sample.
labels: A vector of integers (from the range 1:num.clusters) indicating the cluster to which each point is assigned. For example, labels[i]=j means that the ith element (cell) is assigned to the jth cluster.
dimension: Dimensionality of the sample (number of columns in data matrix).
clusters: A list of length num.clusters storing the cell populations. Each cluster is stored as an object of class Cluster.
size: Number of cells in the sample (summation of all cluster sizes).
sample.id: integer, denoting the index of the sample (relative to other samples of a cohort). Default is NA_integer_
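The relationship between the slots can be illustrated with plain vectors. This is only a sketch of how the quantities relate, using a hypothetical toy labeling; it does not construct a ClusteredSample object.

```r
# Hypothetical toy labeling: 6 cells assigned to 3 clusters, 2 markers per cell
labels <- c(1L, 1L, 2L, 3L, 3L, 3L)
sample <- matrix(seq_len(12), nrow = 6, ncol = 2)

num.clusters  <- length(unique(labels))    # 3
cluster.sizes <- as.vector(table(labels))  # 2 1 3
size          <- sum(cluster.sizes)        # 6, same as length(labels)
dimension     <- ncol(sample)              # 2
```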

Accessors

get.size: Returns the number of cells in the sample (summation of all cluster sizes). Usage: get.size(object), where object is a ClusteredSample object.
get.num.clusters: Returns the number of clusters in the sample.
get.labels: Returns the cluster labels for each cell. For example, labels[i]=j means that the ith element (cell) is assigned to the jth cluster.
get.dimension: Returns the dimensionality of the sample (number of columns in data matrix).
get.clusters: Returns the list of clusters in this sample. Each cluster is stored as an object of class Cluster.
get.sample.id: Returns the index of the sample (relative to other samples of a cohort).
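A minimal sketch of calling the accessors follows. It assumes the flowMatch package (taken here to be the package that provides this class) is installed, and it uses hypothetical synthetic data in place of a real FC sample.

```r
library(flowMatch)  # assumption: the package providing ClusteredSample

# Hypothetical synthetic sample: 100 cells, 2 markers, 2 separated groups
set.seed(42)
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 6), ncol = 2))
colnames(x) <- c("marker1", "marker2")
km <- kmeans(x, centers = 2, nstart = 10)

clust <- ClusteredSample(labels = km$cluster, sample = x, sample.id = 1L)

get.size(clust)              # number of cells
get.num.clusters(clust)      # number of clusters
get.dimension(clust)         # number of columns in the data matrix
get.sample.id(clust)         # index of the sample within a cohort
head(get.labels(clust))      # cluster label of the first few cells
length(get.clusters(clust))  # one Cluster object per cluster
```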

Methods

show

Display details about the ClusteredSample object.

summary

Returns a descriptive summary of the ClusteredSample object. Usage: summary(ClusteredSample)

plot

Plots a sample as bivariate scatter plots in which different clusters are shown in different colors. Usage: plot(sample, ClusteredSample, ...). The arguments of the plot function are:

sample: A matrix, data.frame or an object of class flowFrame representing an FC sample.
ClusteredSample: An object of class ClusteredSample storing the clustering of the sample.
...: Other standard plotting parameters.
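To give a feel for this kind of display, the sketch below draws bivariate scatter plots colored by cluster using base R's pairs(). It is an approximation on hypothetical synthetic data, not the package's plot method, and the color choices are arbitrary.

```r
# Hypothetical synthetic sample: 60 cells, 3 markers, 2 clusters
set.seed(7)
sample <- rbind(matrix(rnorm(90, mean = 0), ncol = 3),
                matrix(rnorm(90, mean = 4), ncol = 3))
colnames(sample) <- c("marker1", "marker2", "marker3")
labels <- rep(1:2, each = 30)

# Map each cell's cluster label to a color, then draw every pair of markers
cluster.colors <- c("steelblue", "tomato")[labels]
pairs(sample, col = cluster.colors, pch = 20)
```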

Examples


## ------------------------------------------------
## load data and retrieve a sample
## ------------------------------------------------

library(flowMatch)        # assumed to be the package providing ClusteredSample
library(healthyFlowData)
data(hd)
sample = exprs(hd.flowSet[[1]])

## ------------------------------------------------
## cluster sample using kmeans algorithm
## ------------------------------------------------
km = kmeans(sample, centers=4, nstart=20)
cluster.labels = km$cluster

## ------------------------------------------------
## Create ClusteredSample object (Option 1)
## without specifying centers and covs
## we need to pass FC sample for parameter estimation
## ------------------------------------------------

clustSample = ClusteredSample(labels=cluster.labels, sample=sample)

## ------------------------------------------------
## Create ClusteredSample object  (Option 2)
## specifying centers and covs 
## no need to pass the sample
## ------------------------------------------------

centers = list()
covs = list()
num.clusters = nrow(km$centers)
for(i in seq_len(num.clusters))
{
  centers[[i]] = km$centers[i,]
  # drop=FALSE keeps a matrix even if a cluster has a single row
  covs[[i]] = cov(sample[cluster.labels==i, , drop=FALSE])
}
# Now we do not need to pass sample
ClusteredSample(labels=cluster.labels, centers=centers, covs=covs)

## ------------------------------------------------
## Show summary and plot a clustered sample
## ------------------------------------------------

summary(clustSample)
plot(sample, clustSample)
