WARNING: The SummarizedExperiment class described here is deprecated and is being
replaced with the RangedSummarizedExperiment class defined in the new
SummarizedExperiment package. Please make sure to install the SummarizedExperiment
package before you attempt to use the SummarizedExperiment() constructor function.
Note that this will return a RangedSummarizedExperiment instance instead of a
SummarizedExperiment instance.
The SummarizedExperiment class is a matrix-like container where rows represent
ranges of interest (as a GRanges or GRangesList-class) and columns represent
samples (with sample data summarized as a DataFrame-class). A SummarizedExperiment
contains one or more assays, each represented by a matrix-like object of numeric
or other mode.
## Constructors
SummarizedExperiment(assays, ...)
## Accessors
assayNames(x, ...)
assayNames(x, ...) <- value
assays(x, ..., withDimnames=TRUE)
assays(x, ..., withDimnames=TRUE) <- value
assay(x, i, ...)
assay(x, i, ...) <- value
rowRanges(x, ...)
rowRanges(x, ...) <- value
colData(x, ...)
colData(x, ...) <- value
exptData(x, ...)
exptData(x, ...) <- value
"dim"(x)
"dimnames"(x)
"dimnames"(x) <- value
"dimnames"(x) <- value
## colData access
"$"(x, name)
"$"(x, name) <- value
"[["(x, i, j, ...)
"[["(x, i, j, ...) <- value
## rowRanges access
## see 'GRanges compatibility', below
## Subsetting
"["(x, i, j, ..., drop=TRUE)
"["(x, i, j) <- value
"subset"(x, subset, select, ...)
## Combining
"cbind"(..., deparse.level=1)
"rbind"(..., deparse.level=1)
## Coercion
"updateObject"(object, ..., verbose=FALSE)
"coerce"(from, to = "SummarizedExperiment", strict = TRUE)
"coerce"(from, to = "ExpressionSet", strict = TRUE)
assays: See ?RangedSummarizedExperiment in the SummarizedExperiment package.

...: For SummarizedExperiment, see ?RangedSummarizedExperiment in the
SummarizedExperiment package. For assay, ... may contain withDimnames, which is
forwarded to assays. For cbind, rbind, ... contains SummarizedExperiment objects
to be combined. For other accessors, ignored.

verbose: A logical(1) indicating whether messages about data coercion during
construction should be printed.

x, object: An instance of SummarizedExperiment-class.

i, j: For assay, assay<-, i is an integer or numeric scalar; see Details for
additional constraints. For [,SummarizedExperiment, [,SummarizedExperiment<-,
i, j are instances that can act to subset the underlying rowRanges, colData, and
matrix elements of assays. For [[,SummarizedExperiment, [[<-,SummarizedExperiment,
i is a scalar index (e.g., character(1) or integer(1)) into a column of colData.

subset: An expression which, when evaluated in the context of rowRanges(x), is a
logical vector indicating elements or rows to keep: missing values are taken as
false.

select: An expression which, when evaluated in the context of colData(x), is a
logical vector indicating elements or rows to keep: missing values are taken as
false.

name: The name of a column of colData.

withDimnames: A logical(1), indicating whether dimnames should be applied to
extracted assay elements. Setting withDimnames=FALSE increases the speed and
memory efficiency with which assays are extracted. withDimnames=TRUE in the
getter assays<- allows efficient complex assignments (e.g., updating names of
assays, names(assays(x, withDimnames=FALSE)) = ... is more efficient than
names(assays(x)) = ...); it does not influence actual assignment of dimnames to
assays.

drop: A logical(1), ignored by these methods.

deparse.level: See ?base::cbind for a description of this argument.

Instances are constructed using the SummarizedExperiment
function with arguments outlined above.

If x is a serialized instance of a SummarizedExperiment (e.g., from using the save
function with a version of GenomicRanges prior to 1.9.59), it should be updated
by invoking x <- updateObject(x).

as(from, "SummarizedExperiment"): Creates a SummarizedExperiment object from an
ExpressionSet object.

as(from, "ExpressionSet"): Creates an ExpressionSet object from a
SummarizedExperiment object.

The following slots correspond between ExpressionSet and SummarizedExperiment:

ExpressionSet                               SummarizedExperiment
assayData                                   assays
featureData                                 rowData
phenoData                                   colData
experimentData, annotation, protocolData    colData

If the SummarizedExperiment being coerced uses GRanges to store its range data,
that data will be included in the featureData of the ExpressionSet. Because
ExpressionSet objects require an assay named exprs, if the SummarizedExperiment
object being coerced does not have an assay named exprs, the first assay will be
renamed and a warning will be issued.

In the code snippets below, x
is a SummarizedExperiment instance.

assays(x), assays(x) <- value: Get or set the assays. value is a list or
SimpleList, each element of which is a matrix with the same dimensions as x.

assay(x, i), assay(x, i) <- value: An alternative (to assays(x)[[i]],
assays(x)[[i]] <- value) to get or set the ith (default first) assay element.
value must be a matrix of the same dimension as x, and with dimension names NULL
or consistent with those of x.

assayNames(x), assayNames(x) <- value: Get or set the names of assay() elements.

rowRanges(x), rowRanges(x) <- value: Get or set the row ranges. value is a
GenomicRanges instance. Row names of value must be NULL or consistent with the
existing row names of x.

colData(x), colData(x) <- value: Get or set the column data. value is a DataFrame
instance. Row names of value must be NULL or consistent with the existing column
names of x.

exptData(x), exptData(x) <- value: Get or set the experiment data. value is a list
or SimpleList instance, with arbitrary content.

dim(x): Get the dimensions of the SummarizedExperiment.

dimnames(x), dimnames(x) <- value: Get or set the dimension names. value is
usually a list of length 2, containing elements that are either NULL or vectors
of appropriate length for the corresponding dimension. value can be NULL, which
removes dimension names. This method implies that rownames, rownames<-, colnames,
and colnames<- are all available.

Many GRanges-class
and GRangesList-class operations are supported on SummarizedExperiment and
derived instances, using rowRanges. Supported operations include: compare,
countOverlaps, coverage, disjointBins, distance, distanceToNearest, duplicated,
end, end<-, findOverlaps, flank, follow, granges, isDisjoint, match, mcols,
mcols<-, narrow, nearest, order, overlapsAny, precede, ranges, ranges<-, rank,
resize, restrict, seqinfo, seqinfo<-, seqnames, shift, sort, split,
relistToClass, start, start<-, strand, strand<-, subsetByOverlaps, width,
width<-.

Not all GRanges-class operations are supported, because they do not make sense
for SummarizedExperiment objects (e.g., length, name, as.data.frame, c,
splitAsList), involve non-trivial combination or splitting of rows (e.g.,
disjoin, gaps, reduce, unique), or have not yet been implemented (Ops, map,
window, window<-).

In the code snippets below, x
is a SummarizedExperiment instance.

x[i,j], x[i,j] <- value: Create or replace a subset of x. i, j can be numeric,
logical, character, or missing. value must be a SummarizedExperiment instance
with dimensions, dimension names, and assay elements consistent with the subset
x[i,j] being replaced.

subset(x, subset, select): Create a subset of x using an expression subset
referring to columns of rowRanges(x) (including seqnames, start, end, width,
strand, and names(mcols(x))) and / or select referring to column names of
colData(x).

The following accessors operate on colData columns:

x$name, x$name <- value: Access or replace column name in x.

x[[i, ...]], x[[i, ...]] <- value: Access or replace column i in x.

In the code snippets below, ...
are SummarizedExperiment instances to be combined.

cbind(...), rbind(...): cbind combines objects with identical ranges (rowRanges)
but different samples (columns in assays). The colnames in colData must match or
an error is thrown. Duplicate columns of mcols(rowRanges(SummarizedExperiment))
must contain the same data. Data in assays are combined by name matching; if all
names are NULL, matching is by position. A mixture of names and NULL throws an
error.

rbind combines objects with different ranges (rowRanges) and the same subjects
(columns in assays). Duplicate columns of colData must contain the same data.

exptData from all objects are combined into a SimpleList with no name checking.

SummarizedExperiment
is implemented as an S4 class, and can be extended in the usual way, using
contains="SummarizedExperiment" in the new class definition.

In addition, the representation of the assays slot of SummarizedExperiment is as
a virtual class Assays. This allows derived classes (contains="Assays") to easily
implement alternative requirements for the assays, e.g., backed by file-based
storage like NetCDF or the ff package, while re-using the existing
SummarizedExperiment class without modification. The requirements on Assays are
list-like semantics (e.g., sapply, [[ subsetting, names) with elements having
matrix- or array-like semantics (e.g., dim, dimnames). These requirements can be
made more precise if developers express interest.

The current assays slot is implemented as a reference class that has
copy-on-change semantics. This means that modifying non-assay slots does not copy
the (large) assay data, and at the same time the user is not surprised by
reference-based semantics. Updates to non-assay slots are very fast; updating the
assays slot itself can be 5x or more faster than with an S4 instance in the slot.
One useful technique when working with the assay or assays functions is use of
the withDimnames=FALSE argument, which benefits speed and memory use by not
copying dimnames from the row- and colData elements to each assay.

In a little more detail, a small reference class hierarchy (not exported from the
GenomicRanges name space) defines a reference class ShallowData with a single
field data of type ANY, and a derived class ShallowSimpleListAssays that
specializes the type of data as SimpleList, and
contains=c("ShallowData", "Assays"). The assays slot contains an instance of
ShallowSimpleListAssays. Invoking assays() on a SummarizedExperiment re-dispatches
from the assays slot to retrieve the SimpleList from the field of the reference
class. This was achieved by implementing a generic (not exported)
value(x, name, ...), with a method implemented on SummarizedExperiment that
retrieves a slot when name is a slot containing an S4 object in x, and a field
when name is a slot containing a ShallowData instance in x. Copy-on-change
semantics is maintained by implementing the clone method (clone methods are
supposed to do a deep copy, update methods a shallow copy; the clone generic is
introduced, and not exported, in the GenomicRanges package). The getter and
setter code for methods implemented on SummarizedExperiment use value for slot
access, and clone for replacement. This makes it easy to implement ShallowData
instances for other slots if the need arises.

The SummarizedExperiment
class is meant for numeric and other data types derived from a sequencing
experiment. The structure is rectangular like a matrix, but with additional
annotations on the rows and columns, and with the possibility to manage several
assays simultaneously.
The rows of a SummarizedExperiment instance represent ranges (in genomic
coordinates) of interest. The ranges of interest are described by a GRanges-class
or a GRangesList-class instance, accessible using the rowRanges function,
described below. The GRanges and GRangesList classes contain sequence (e.g.,
chromosome) name, genomic coordinates, and strand information. Each range can be
annotated with additional data; this data might be used to describe the range or
to summarize results (e.g., statistics of differential abundance) relevant to the
range. Rows may or may not have row names; they often will not.
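As a rough illustration of the row structure just described, the sketch below builds a small object with the SummarizedExperiment() constructor from the newer SummarizedExperiment package, which is assumed to be installed from Bioconductor and which returns a RangedSummarizedExperiment rather than the deprecated class. The ranges, the gene annotation column, and the zero-filled counts matrix are invented for the example.

```r
library(SummarizedExperiment)  # assumption: installed from Bioconductor

# Three hypothetical ranges of interest, one per row.
ranges <- GRanges(seqnames = c("chr1", "chr1", "chr2"),
                  ranges = IRanges(start = c(10, 200, 50), width = 25),
                  strand = c("+", "-", "*"))

# Annotate each range with an extra (invented) column.
mcols(ranges)$gene <- c("geneA", "geneB", "geneC")

# A placeholder assay: 3 ranges by 2 unnamed samples.
se <- SummarizedExperiment(
    assays = SimpleList(counts = matrix(0L, nrow = 3, ncol = 2)),
    rowRanges = ranges)

rowRanges(se)              # the GRanges describing the rows
mcols(rowRanges(se))$gene  # per-range annotation
rownames(se)               # rows need not be named
```

Because neither the ranges nor the matrix carry names here, the object is built without row names, which matches the remark that rows often will not have them.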
Each column of a SummarizedExperiment instance represents a sample. Information
about the samples is stored in a DataFrame-class, accessible using the function
colData, described below. The DataFrame must have as many rows as there are
columns in the SummarizedExperiment, with each row of the DataFrame providing
information on the sample in the corresponding column of the
SummarizedExperiment. Columns of the DataFrame represent different sample
attributes, e.g., tissue of origin. Columns of the DataFrame can themselves be
annotated (via the mcols function). Column names typically provide a short
identifier unique to each sample.
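The relationship between columns and sample data can be sketched as follows. This again assumes the newer SummarizedExperiment package is installed; the sample names, the tissue and batch values, and the counts are hypothetical.

```r
library(SummarizedExperiment)  # assumption: installed from Bioconductor

sample_ids <- c("s1", "s2", "s3")  # hypothetical sample identifiers

# One DataFrame row per sample; row names match the assay column names.
samples <- DataFrame(tissue = c("liver", "liver", "brain"),
                     row.names = sample_ids)

se <- SummarizedExperiment(
    assays = SimpleList(counts = matrix(1:6, nrow = 2,
                                        dimnames = list(NULL, sample_ids))),
    rowRanges = GRanges("chr1", IRanges(start = c(1, 101), width = 50)),
    colData = samples)

colData(se)                    # one row per sample
colnames(se)                   # sample identifiers
se$tissue                      # shortcut to a colData column
se$batch <- c("a", "a", "b")   # add a new sample attribute

# Keep only the liver samples; assays and colData are subset together.
liver <- se[, se$tissue == "liver"]
dim(liver)
```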
A SummarizedExperiment can also contain information about the overall experiment,
for instance the lab in which it was conducted, the publications with which it is
associated, etc. This information is stored as a SimpleList-class, accessible
using the exptData function. The form of the data associated with the experiment
is left to the discretion of the user.
The SummarizedExperiment is appropriate for matrix-like data. The data are
accessed using the assays function, described below. This returns a
SimpleList-class instance. Each element of the list must itself be a matrix (of
any mode) and must have dimensions that are the same as the dimensions of the
SummarizedExperiment in which they are stored. Row and column names of each
matrix must either be NULL or match those of the SummarizedExperiment during
construction. It is convenient for the elements of the SimpleList of assays to be
named.
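A minimal sketch of holding several named assays in one object is shown below, assuming the newer SummarizedExperiment package is installed. The row and sample names and the two matrices are invented; both matrices share the dimensions of the object, as required above.

```r
library(SummarizedExperiment)  # assumption: installed from Bioconductor

dn <- list(c("r1", "r2"), c("s1", "s2", "s3"))  # hypothetical dimnames
counts <- matrix(1:6, nrow = 2, dimnames = dn)
logcounts <- log2(counts + 1)                   # same shape and dimnames

se <- SummarizedExperiment(
    assays = SimpleList(counts = counts, logcounts = logcounts),
    rowRanges = GRanges("chr1",
                        IRanges(start = c(1, 101), width = 50,
                                names = c("r1", "r2"))))

assayNames(se)            # names of the assay elements
assay(se)                 # first assay by default
assay(se, "logcounts")    # a specific assay by name

# Skip applying the object's dimnames to the extracted assays.
raw <- assays(se, withDimnames = FALSE)$counts
dim(raw)
```

Naming the list elements lets assays be retrieved by name rather than by position, which is the convenience the paragraph above refers to.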
The SummarizedExperiment class has the following slots; this detail of class
structure is not relevant to the user.
exptData
rowData (accessed with rowRanges, not rowData!)
colData
assays
## WARNING: The SummarizedExperiment class is deprecated and being
## replaced with the RangedSummarizedExperiment class defined in the
## new SummarizedExperiment package. See ?RangedSummarizedExperiment
## in the SummarizedExperiment package for examples of how to create
## and manipulate RangedSummarizedExperiment objects.
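
A short sketch of subsetting and combining follows. It assumes the newer SummarizedExperiment package is installed, so the objects are RangedSummarizedExperiment instances. The helper make_se() and all of its data are hypothetical and exist only for this example.

```r
library(SummarizedExperiment)  # assumption: installed from Bioconductor

# Hypothetical helper: a toy object with 3 fixed ranges and the given samples.
make_se <- function(sample_ids) {
    n <- length(sample_ids)
    SummarizedExperiment(
        assays = SimpleList(
            counts = matrix(seq_len(3 * n), nrow = 3,
                            dimnames = list(NULL, sample_ids))),
        rowRanges = GRanges("chr1",
                            IRanges(start = c(1, 101, 201), width = 50)),
        colData = DataFrame(group = rep("a", n), row.names = sample_ids))
}

se1 <- make_se(c("s1", "s2"))
se2 <- make_se(c("s3", "s4"))

# Matrix-style subsetting: rows by position, columns by name.
first_two <- se1[1:2, "s1"]

# GRanges-style operation: keep rows whose ranges overlap a region.
roi <- GRanges("chr1", IRanges(start = 90, end = 260))
hits <- subsetByOverlaps(se1, roi)

wide <- cbind(se1, se2)   # identical ranges, different samples
tall <- rbind(se1, se1)   # same samples, additional rows
dim(wide)
dim(tall)
```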