This data set contains mRNA and protein-antibody data on T-regulatory immune cells. It is a subset of a much larger data set collected from the peripheral blood of patients with a variety of health conditions.
data(treg)
Note that there are three distinct objects included in the data set: treg, tmat, and rip.
treg
A numerical data matrix with 538 rows and 255 columns. Each column
represents a single cell from one of 61 samples that were assayed by
(mixed-omics) single-cell sequencing. Each row represents one of the
features measured in the assay. Of these, 51 are antibodies that were
tagged with an RNA barcode to identify them; their names all end with
the string pAbO. The remaining 487 features are mRNA measurements,
named by their official gene symbol at the time the experiment was
performed. This matrix is a subset of a more complete data set of
T regulatory cells (Tregs). It was produced using the downsample
function from the Mercator package, which was in turn inspired by a
similar routine used in the SPADE algorithm by Peng Qiu. A key point
of the algorithm is to make sampling less likely from the densest part
of the distribution in order to preserve rare cell types in the
population.
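The idea of density-dependent downsampling can be illustrated with a minimal base-R sketch. This is not the Mercator or SPADE implementation; the simulated points, the neighborhood radius, and the target density are all assumptions chosen only for illustration. Points in crowded regions are kept with probability target/density, while points in sparse regions are always kept.

```r
# Hypothetical sketch of density-dependent downsampling (not the Mercator code).
set.seed(1)

# Simulated 2-D points: a dense cluster of 200 cells plus a rare cluster of 10.
dense <- matrix(rnorm(400, mean = 0, sd = 0.3), ncol = 2)
rare  <- matrix(rnorm(20,  mean = 5, sd = 0.3), ncol = 2)
pts   <- rbind(dense, rare)

# Local density: number of points within an assumed radius.
# Each point counts itself, so the density is always at least 1.
radius  <- 0.5
d       <- as.matrix(dist(pts))
density <- rowSums(d <= radius)

# Keep probability: 1 for sparse points, target/density for crowded ones.
target   <- 5   # assumed target density
keepProb <- pmin(1, target / density)

# Draw the subsample.
keep <- runif(nrow(pts)) < keepProb
sub  <- pts[keep, , drop = FALSE]

# Compare how many cells survive from each cluster.
table(cluster = rep(c("dense", "rare"), c(200, 10)), kept = keep)
```

In this sketch the rare cluster is retained at a much higher rate than the dense one, which is the behavior the description above attributes to the downsampling step.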
tmat
A distance matrix, stored as a dist object, produced using Pearson
correlation as a measure of distance between single-cell vectors in
the treg data set.
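One common way to turn Pearson correlation into a dist object is sketched below on simulated data. The transformation 1 - r is an assumption; the exact formula used to build tmat is not stated here, and the small matrix x is only a stand-in for treg.

```r
# Hypothetical sketch: a correlation-based distance between cells (columns).
set.seed(2)

# Simulated stand-in for the treg matrix: 20 features (rows) by 6 cells (columns).
x <- matrix(rnorm(120), nrow = 20,
            dimnames = list(paste0("f", 1:20), paste0("cell", 1:6)))

# cor() correlates columns, so this yields cell-by-cell Pearson correlations.
r <- cor(x, method = "pearson")

# Assumed transformation: distance = 1 - correlation, which lies in [0, 2].
d <- as.dist(1 - r)

class(d)         # "dist"
attr(d, "Size")  # number of cells
```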
rip
This object is a "Rips diagram". It was produced by running the
ripsDiag function from the TDA R package on the treg subset of
single cells.
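A call of roughly the following shape computes a Rips diagram from a precomputed distance matrix. This is a sketch on simulated data: the values of maxdimension and maxscale, and the 1 - correlation distance, are assumptions, since the settings used to create rip are not recorded here. The code is guarded so that it is skipped when the TDA package is not installed.

```r
# Hypothetical sketch: a Rips diagram from a correlation-based distance matrix.
if (requireNamespace("TDA", quietly = TRUE)) {
  set.seed(3)

  # Simulated stand-in for treg: 20 features (rows) by 10 cells (columns).
  x <- matrix(rnorm(200), nrow = 20)

  # Assumed distance: 1 - Pearson correlation between cells, as a full matrix.
  dmat <- as.matrix(as.dist(1 - cor(x)))

  # dist = "arbitrary" tells ripsDiag that X is a distance matrix, not coordinates.
  rips <- TDA::ripsDiag(X = dmat, maxdimension = 1, maxscale = max(dmat),
                        dist = "arbitrary", library = "Dionysus")

  # The persistence diagram holds one row per feature: dimension, birth, death.
  print(head(rips[["diagram"]]))
}
```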
Kevin R. Coombes krc@silicovore.com, Jake Reed hreed@augusta.edu
Qiu P, Simonds EF, Bendall SC, Gibbs KD Jr, Bruggner RV, Linderman MD, Sachs K, Nolan GP, Plevritis SK. Extracting a cellular hierarchy from high-dimensional cytometry data with SPADE. Nat Biotechnol. 2011 Oct 2;29(10):886-91. doi: 10.1038/nbt.1991.