transcriptsBy(x, by=c("gene", "exon", "cds"), ...)
## S4 method for signature 'TxDb'
transcriptsBy(x, by=c("gene", "exon", "cds"), use.names=FALSE)

exonsBy(x, by=c("tx", "gene"), ...)
## S4 method for signature 'TxDb'
exonsBy(x, by=c("tx", "gene"), use.names=FALSE)

cdsBy(x, by=c("tx", "gene"), ...)
## S4 method for signature 'TxDb'
cdsBy(x, by=c("tx", "gene"), use.names=FALSE)

intronsByTranscript(x, ...)
## S4 method for signature 'TxDb'
intronsByTranscript(x, use.names=FALSE)

fiveUTRsByTranscript(x, ...)
## S4 method for signature 'TxDb'
fiveUTRsByTranscript(x, use.names=FALSE)

threeUTRsByTranscript(x, ...)
## S4 method for signature 'TxDb'
threeUTRsByTranscript(x, use.names=FALSE)

by: One of "gene", "exon", "cds" or "tx". Determines the grouping.

use.names: If use.names is FALSE (the default), the names of the returned GRangesList object (aka the group names) are the internal ids of the features used for grouping (aka the grouping features), which are guaranteed to be unique. If use.names is TRUE, then the names of the grouping features are used instead of their internal ids. For example, when grouping by transcript (by="tx"), the default group names are the transcript internal ids ("tx_id"). But, if use.names=TRUE, the group names are the transcript names ("tx_name"). Note that, unlike the feature ids, the feature names are not guaranteed to be unique or even defined (they could be all NAs). A warning is issued when this happens. See ?id2name for more information about feature internal ids and feature external names and how to map the former to the latter. Finally, use.names=TRUE cannot be used when grouping by gene (by="gene"). This is because, unlike for the other features, the gene ids are external ids (e.g. Entrez Gene or Ensembl ids), so the db doesn't have a "gene_name" column for storing alternate gene names.
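The difference between id-based and name-based group names can be checked directly. The following is a minimal sketch, assuming the GenomicFeatures package and its bundled sample database are installed; the variable names (by_id, by_name, gene_attempt) are illustrative only. Because the exact behavior of use.names=TRUE with by="gene" is taken from the description above rather than verified here, the call is wrapped in tryCatch() instead of assuming a particular error message.

```r
library(GenomicFeatures)

txdb_file <- system.file("extdata", "hg19_knownGene_sample.sqlite",
                         package="GenomicFeatures")
txdb <- loadDb(txdb_file)

## Group names are internal transcript ids (tx_id):
by_id <- exonsBy(txdb, by="tx")
## Group names are transcript names (tx_name):
by_name <- exonsBy(txdb, by="tx", use.names=TRUE)

## Internal ids are guaranteed unique; names carry no such guarantee,
## so check them before using them as keys.
ids_unique <- anyDuplicated(names(by_id)) == 0L
names_unique <- anyDuplicated(names(by_name)) == 0L
names_missing <- anyNA(names(by_name))

## Per the description above, use.names=TRUE is not supported with by="gene".
## tryCatch() records the outcome rather than assuming a specific error.
gene_attempt <- tryCatch(
    exonsBy(txdb, by="gene", use.names=TRUE),
    error=function(e) conditionMessage(e)
)
is.character(gene_attempt)  # TRUE if the call was rejected with an error
```

Checking names_unique and names_missing before indexing by name avoids silently picking the wrong group when two transcripts share a name.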
When using exonsBy or cdsBy with by="tx", the returned exons or CDS are ordered by ascending rank for each transcript, that is, by their position in the transcript. In all other cases, the ranges will be ordered by chromosome, strand, start, and end values.
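The rank ordering described above can be verified on the sample database. This is a sketch, assuming GenomicFeatures is installed and that the exons returned by exonsBy(..., by="tx") carry an exon_rank metadata column; the helper variables (ranks, group, ranks_by_tx, rank_ordered) are illustrative names, not part of the package.

```r
library(GenomicFeatures)

txdb_file <- system.file("extdata", "hg19_knownGene_sample.sqlite",
                         package="GenomicFeatures")
txdb <- loadDb(txdb_file)

## Exons grouped by transcript:
ex_by_tx <- exonsBy(txdb, by="tx")

## Flatten the list, pull out the ranks, and split them back by transcript.
ranks <- mcols(unlist(ex_by_tx, use.names=FALSE))$exon_rank
group <- rep(seq_along(ex_by_tx), elementNROWS(ex_by_tx))
ranks_by_tx <- split(ranks, group)

## TRUE for each transcript whose exons appear in ascending rank order.
rank_ordered <- vapply(ranks_by_tx, function(r) !is.unsorted(r), logical(1))
all(rank_ordered)
```

For transcripts on the minus strand, ascending rank means descending genomic coordinates, which is why rank order and coordinate order should not be treated as interchangeable.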
transcripts and transcriptsByOverlaps for more ways to extract genomic features from a TxDb object.

transcriptLengths for extracting the transcript lengths from a TxDb object.

extractTranscriptSeqs for extracting transcript (or CDS) sequences from chromosome sequences.

coverageByTranscript for computing coverage by transcript (or CDS) of a set of ranges.

id2name for mapping TxDb internal ids to external names for a given feature type.
library(GenomicFeatures)

## Load the sample TxDb shipped with the package:
txdb_file <- system.file("extdata", "hg19_knownGene_sample.sqlite",
                         package="GenomicFeatures")
txdb <- loadDb(txdb_file)
## Get the transcripts grouped by gene:
transcriptsBy(txdb, "gene")
## Get the exons grouped by gene:
exonsBy(txdb, "gene")
## Get the CDS grouped by transcript:
cds_by_tx0 <- cdsBy(txdb, "tx")
## With more informative group names:
cds_by_tx1 <- cdsBy(txdb, "tx", use.names=TRUE)
## Note that 'cds_by_tx1' can also be obtained with:
names(cds_by_tx0) <- id2name(txdb, feature.type="tx")[names(cds_by_tx0)]
stopifnot(identical(cds_by_tx0, cds_by_tx1))
## Get the introns grouped by transcript:
intronsByTranscript(txdb)
## Get the 5' UTRs grouped by transcript:
fiveUTRsByTranscript(txdb)
fiveUTRsByTranscript(txdb, use.names=TRUE) # more informative group names