Visualise microarray and RNAseq data with gene ontology annotations.
This package was designed for the analysis of bioinformatics data based on gene expression measurements. It requires two input values:
1. an ExpressionSet whose assayData slot should be a gene-by-sample matrix providing the expression level of genes (rows) in each sample (columns). Row names are expected to be either Ensembl gene identifiers or microarray probeset identifiers present in the Ensembl BioMart dataset queried. The phenoData slot should be an AnnotatedDataFrame from the Biobase package providing phenotypic information about the samples. Row names are samples, and at least one of the columns must be a grouping factor with two or more levels (a factor in the sense of the R language).
2. the name of the grouping factor to investigate, which must be a valid column name in the phenoData.
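As a minimal sketch of the two inputs described above, the following builds a toy ExpressionSet with the Biobase package and checks that the chosen grouping factor meets the stated requirements. The gene identifiers, sample names, expression values and the `Treatment` column are made-up placeholders for illustration, not data shipped with the package.

```r
library(Biobase)

set.seed(1)

# Toy gene-by-sample matrix: genes in rows, samples in columns.
# The row names are placeholder identifiers, not real Ensembl genes.
expr <- matrix(
  rnorm(12, mean = 8),
  nrow = 3,
  dimnames = list(
    c("ENSG_PLACEHOLDER_1", "ENSG_PLACEHOLDER_2", "ENSG_PLACEHOLDER_3"),
    c("S1", "S2", "S3", "S4")
  )
)

# Phenotypic information: row names are samples, and at least one
# column is a factor with two or more levels.
pheno <- data.frame(
  Treatment = factor(c("Control", "Control", "Treated", "Treated")),
  row.names = colnames(expr)
)

eset <- ExpressionSet(
  assayData = expr,
  phenoData = AnnotatedDataFrame(pheno)
)

# Second input: the name of the grouping factor to investigate.
f <- "Treatment"

# Sanity checks mirroring the requirements listed above.
stopifnot(
  f %in% colnames(pData(eset)),
  is.factor(pData(eset)[[f]]),
  nlevels(pData(eset)[[f]]) >= 2
)
```

The final `stopifnot()` call is only a convenience for catching a mistyped column name or a non-factor column before running a longer analysis.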
The analysis scores all Gene Ontology (GO) terms represented in the gene annotations provided, or semi-automatically retrieved from the current Ensembl annotation release, using the biomaRt package. In the default approach, the random forest framework is used to evaluate the ability of each gene feature in the ExpressionSet to cluster groups of samples according to a known experimental factor. Notably, genes associated with the GO term in the annotations but absent from the dataset are assigned a score of 0 and a rank equal to the number of gene features in the ExpressionSet plus one. GO terms are scored and ranked on the average rank (alternatively, score) of all associated genes (including those absent from the ExpressionSet).
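The ranking rule described above can be illustrated in base R. This is a sketch under stated assumptions, not the package's implementation: the per-gene scores, gene names, GO identifiers and the tie-handling method are all hypothetical, and the random forest step that would normally produce the scores is skipped.

```r
# Hypothetical per-gene scores for the 4 gene features present in the dataset
# (a higher score means the gene separates the sample groups better).
scores <- c(geneA = 0.9, geneB = 0.5, geneC = 0.2, geneD = 0.1)

# Rank genes so that the highest score receives rank 1.
# The tie-handling method is an assumption made for this sketch.
ranks <- rank(-scores, ties.method = "min")

# Hypothetical gene-to-GO annotation; geneX is annotated to a GO term
# but absent from the dataset.
go_genes <- list(
  "GO:0000001" = c("geneA", "geneB"),
  "GO:0000002" = c("geneC", "geneX")
)

# Absent genes get a score of 0 and a rank equal to the number of
# gene features in the dataset plus one.
absent_rank <- length(scores) + 1

score_go <- function(genes) {
  present <- genes %in% names(scores)
  gene_scores <- ifelse(present, scores[genes], 0)
  gene_ranks <- ifelse(present, ranks[genes], absent_rank)
  c(ave_rank = mean(gene_ranks), ave_score = mean(gene_scores))
}

# One row per GO term, ordered by increasing average rank.
go_table <- t(sapply(go_genes, score_go))
go_table <- go_table[order(go_table[, "ave_rank"]), , drop = FALSE]
print(go_table)
```

In this toy example the second GO term is penalised by its absent gene, which contributes a rank of 5 and a score of 0 to the averages.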
Functions are provided to investigate and visualise the results of the above analysis. The score table can be filtered for GO terms passing given thresholds. The distribution of scores can be visualised. The quantiles of scores can be obtained. The genes associated with a given GO term can be listed, with or without descriptive information. Hierarchical clustering of the samples can be performed based on the expression levels of genes associated with a given GO term. Heatmaps accompanied by hierarchical clustering of samples and genes can be drawn. The expression profile of genes can be plotted against any factor while grouping samples on another factor. The univariate effect of all factors can be visualised on the expression level of genes associated with a GO term. The counts of overlapping genes between multiple GO terms can be visualised in a Venn diagram. The result variable of the analysis can be re-ordered according to gene rank or score.
microarray probeset identifiers.
Supports custom annotations for gene identifiers not automatically supported.
GO_analyse() scores all Gene Ontology (GO) terms represented in the dataset based on the estimated average ability of their associated genes to cluster samples according to a predefined grouping factor. It also returns the table used to map genes to GO terms, the table summarising the statistics for each gene, and the essential parameters of the analysis performed, for reproducibility. Additional information specific to each statistical framework is included in the output object.
of GO term ranking, which may subsequently be used for filtering.
desired filters and returns a list formatted identically to the output of GO_analyse() with the filtered information.
output of GO_analyse() or subset_scores().
to defined percentiles.
with a given GO term.
identifiers associated with a given GO term.
based on the expression levels of genes associated with a given GO term.
and genes based on the expression levels of genes associated with a given GO term.
identifier, given a valid variable name for the X-axis and a grouping factor for the Y-axis.
feature identifier(s) annotated to a gene symbol, given a valid variable name for the X-axis and a grouping factor for the Y-axis.
sample series while colour-coding each series according to its group; a more detailed alternative to expression_plot().
given sample series while colour-coding each series according to its group; a more detailed alternative to expression_plot_symbol().
factor available in the phenoData on the expression levels of genes associated with a GO term.
between 2 and 5 GO terms. The diagram can either be displayed on screen or printed directly to file.
genes either by increasing (average) rank or decreasing (average) score.
a particular set of values in given columns of their phenotypic data (e.g. only samples from "2H" and "6H" in their "Time" information).
package.
included example input objects.
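The sample filtering mentioned in the list above (keeping only samples with particular values in given columns of their phenotypic data, e.g. "2H" and "6H" in "Time") can be sketched in base R as follows. The data frame, its column names and the `subset_filter` list are hypothetical illustrations; the package's own function for this task is not reproduced here.

```r
# Hypothetical phenotypic data: one row per sample.
pheno <- data.frame(
  Time = factor(c("0H", "2H", "6H", "24H", "2H", "6H")),
  Treatment = factor(c("Control", "Control", "Control",
                       "Treated", "Treated", "Treated")),
  row.names = paste0("S", 1:6)
)

# Named list: each name is a phenotype column, each element the values to keep.
subset_filter <- list(Time = c("2H", "6H"))

# A sample is kept only if it satisfies every filter in the list.
keep <- rep(TRUE, nrow(pheno))
for (column in names(subset_filter)) {
  keep <- keep & pheno[[column]] %in% subset_filter[[column]]
}

# Drop unused factor levels so that downstream grouping only sees
# the levels that remain after filtering.
pheno_sub <- droplevels(pheno[keep, , drop = FALSE])
print(pheno_sub)
```

Dropping unused levels matters here because a grouping factor that still lists empty levels could otherwise be treated as having more groups than the filtered data actually contain.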