Calculates average expression of genes grouped by common segment membership.
aggregateSegmentExpression(epg, segments, dataset="hsapiens_gene_ensembl",
mingps = 20, GRCh = 37, host=NULL)
List with fields:
Segment-by-cell matrix of expression values.
Segment-by-cell matrix of the number of expressed genes.
epg - Gene-by-cell matrix of expression. It is recommended to cap extreme UMI counts (e.g. at the 99% quantile) and to include only cells expressing at least 1,000 genes.
segments - Matrix in which each row corresponds to a copy number segment as calculated by a circular binary segmentation algorithm. Must contain at least the following columns:
chr - chromosome;
startpos - the first genomic position of a copy number segment;
endpos - the last genomic position of a copy number segment;
CN_Estimate - the copy number estimated for each segment.
dataset - Dataset to download from BioMart.
mingps - Minimum number of expressed genes a segment needs to contain in order to be included in the output.
GRCh - Human reference genome version to be used for annotating gene coordinates.
host - Host address used by BioMart.
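The input recommendations above can be sketched in base R as follows. This is only an illustration on made-up data: the names `counts`, `epg_toy`, `segments_toy` and `min_genes` are hypothetical, the 99% quantile is taken over all matrix entries (the documentation does not say how the quantile should be computed), and the cell filter is lowered from 1,000 genes because the toy matrix has only 50 genes.

```r
# Hypothetical toy input: 50 genes x 8 cells of UMI counts (not package data).
set.seed(1)
counts <- matrix(rpois(50 * 8, lambda = 2), nrow = 50,
                 dimnames = list(paste0("gene", 1:50), paste0("cell", 1:8)))

# Cap extreme UMI counts at the 99% quantile.
# Assumption: the quantile is computed over all entries of the matrix.
cap <- unname(quantile(counts, probs = 0.99))
capped <- pmin(counts, cap)

# Keep only cells expressing at least `min_genes` genes.
# The recommendation is 1,000; 30 is used here only to suit the toy matrix.
min_genes <- 30
genes_per_cell <- colSums(capped > 0)
epg_toy <- capped[, genes_per_cell >= min_genes, drop = FALSE]

# A segments matrix with the four required column names (made-up coordinates).
segments_toy <- cbind(chr         = c(1, 1, 2),
                      startpos    = c(1, 5e6 + 1, 1),
                      endpos      = c(5e6, 2e7, 1e7),
                      CN_Estimate = c(2, 3, 1))
```

Real input would typically have thousands of genes, so the cell filter would use the recommended threshold of 1,000 expressed genes.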
Noemi Andor
Let \(S := \{S_1, S_2, \ldots, S_n\}\) be the set of \(n\) genomic segments obtained from DNA-sequencing a given sample (e.g. bulk exome-sequencing or scDNA-sequencing). Genes are mapped to their genomic coordinates using the biomaRt package and assigned to a segment based on these coordinates. Genes are then grouped by segment membership to obtain the average number of UMIs and the number of expressed genes per segment \(S_j\) per cell \(i\).
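The grouping step described above can be illustrated with a minimal base R sketch. It is not the package's implementation. Gene coordinates are hard-coded rather than fetched with biomaRt, a gene is assigned to a segment only if it lies entirely inside it, the average is taken over all genes assigned to a segment, and the `min_gps` filter counts genes expressed in at least one cell. All of these are assumptions made for the example, and every object name is hypothetical.

```r
# Hypothetical gene coordinates; the package obtains these via biomaRt instead.
genes <- data.frame(
  gene  = paste0("g", 1:6),
  chr   = c(1, 1, 1, 2, 2, 2),
  start = c(100, 2000, 6000, 150, 900, 5000),
  end   = c(500, 2500, 6500, 400, 1200, 5600)
)

# Hypothetical segments using the required column names.
segs <- cbind(chr = c(1, 1, 2), startpos = c(1, 5001, 1),
              endpos = c(5000, 10000, 10000), CN_Estimate = c(2, 3, 1))
rownames(segs) <- c("S1", "S2", "S3")

# Toy gene-by-cell UMI matrix with three cells.
epg_toy <- matrix(c(1, 0, 2,
                    3, 0, 0,
                    0, 4, 1,
                    2, 2, 2,
                    0, 0, 5,
                    1, 1, 0),
                  nrow = 6, byrow = TRUE,
                  dimnames = list(genes$gene, c("c1", "c2", "c3")))

# Assign each gene to the segment that fully contains it (NA if none does).
seg_of_gene <- vapply(seq_len(nrow(genes)), function(k) {
  hit <- which(segs[, "chr"] == genes$chr[k] &
               segs[, "startpos"] <= genes$start[k] &
               segs[, "endpos"] >= genes$end[k])
  if (length(hit) == 1) rownames(segs)[hit] else NA_character_
}, character(1))

keep <- !is.na(seg_of_gene)
grp  <- factor(seg_of_gene[keep], levels = rownames(segs))
expr <- epg_toy[keep, , drop = FALSE]

# Segment-by-cell average UMI count: per-segment sums divided by gene counts.
sums     <- rowsum(expr, grp)
n_genes  <- as.vector(table(grp)[rownames(sums)])
avg_expr <- sums / n_genes

# Segment-by-cell number of expressed genes (UMI count > 0).
n_expressed <- rowsum((expr > 0) * 1, grp)

# Assumed filter: keep segments with at least `min_gps` genes that are
# expressed in at least one cell (toy threshold, not the default of 20).
min_gps <- 2
expressed_any <- rowsum((rowSums(expr > 0) > 0) * 1, grp)[, 1]
avg_expr_kept <- avg_expr[expressed_any >= min_gps, , drop = FALSE]
```

In this toy case segment `S2` contains a single gene and is therefore dropped by the assumed filter, while `S1` and `S3` are retained.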
data(epg)
data(segments)
# \donttest{
X=aggregateSegmentExpression(epg, segments, mingps=20, GRCh=38)
# }