Volcano plots are a useful way to visualise the results of differential expression analyses. Here, we present a highly configurable function that produces publication-ready volcano plots. EnhancedVolcano will attempt to fit as many transcript names in the plot window as possible while leaving out labels that could not be read, so the plot is not 'clogged' with illegible text. The user can also identify up to 3 different types of attributes in the same plot space via the colour, shape, and shade parameters.
EnhancedVolcano(
  toptable,
  lab,
  x,
  y,
  selectLab = NULL,
  xlim = c(min(toptable[,x], na.rm=TRUE),
    max(toptable[,x], na.rm=TRUE)),
  ylim = c(0, max(-log10(toptable[,y]), na.rm=TRUE) + 5),
  xlab = bquote(~Log[2]~ "fold change"),
  ylab = bquote(~-Log[10]~italic(P)),
  axisLabSize = 18,
  title = 'Volcano plot',
  subtitle = 'EnhancedVolcano',
  caption = paste0('Total = ', nrow(toptable), ' variables'),
  titleLabSize = 18,
  subtitleLabSize = 14,
  captionLabSize = 14,
  pCutoff = 10e-6,
  pLabellingCutoff = pCutoff,
  FCcutoff = 1.0,
  cutoffLineType = 'longdash',
  cutoffLineCol = 'black',
  cutoffLineWidth = 0.4,
  transcriptPointSize = 0.8,
  transcriptLabSize = 3.0,
  transcriptLabCol = 'black',
  transcriptLabFace = 'plain',
  transcriptLabhjust = 0,
  transcriptLabvjust = 1.5,
  boxedlabels = FALSE,
  shape = 19,
  shapeCustom = NULL,
  col = c("grey30", "forestgreen", "royalblue", "red2"),
  colCustom = NULL,
  colAlpha = 1/2,
  legend = c("NS","Log2 FC","P","P & Log2 FC"),
  legendLabels = c('NS', expression(Log[2]~FC),
    "p-value", expression(p-value~and~log[2]~FC)),
  legendPosition = "top",
  legendLabSize = 14,
  legendIconSize = 4.0,
  legendVisible = TRUE,
  shade = NULL,
  shadeLabel = NULL,
  shadeAlpha = 1/2,
  shadeFill = "grey",
  shadeSize = 0.01,
  shadeBins = 2,
  drawConnectors = FALSE,
  widthConnectors = 0.5,
  typeConnectors = 'closed',
  endsConnectors = 'first',
  lengthConnectors = unit(0.01, 'npc'),
  colConnectors = 'grey10',
  hline = NULL,
  hlineType = 'longdash',
  hlineCol = 'black',
  hlineWidth = 0.4,
  vline = NULL,
  vlineType = 'longdash',
  vlineCol = 'black',
  vlineWidth = 0.4,
  gridlines.major = TRUE,
  gridlines.minor = TRUE,
  border = "partial",
  borderWidth = 0.8,
  borderColour = "black")
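The default expressions for xlim, ylim and pCutoff in the signature above can be evaluated by hand. The following is a minimal base-R sketch on a toy table; the table and its column names (gene, log2FC, pvalue) are illustrative assumptions, not part of the package.

```r
# Toy results table; the column names are illustrative assumptions.
toptable <- data.frame(
  gene   = c("g1", "g2", "g3", "g4"),
  log2FC = c(-2.4, 0.3, 1.8, NA),
  pvalue = c(1e-8, 0.2, 1e-3, 0.05)
)
x <- "log2FC"
y <- "pvalue"

# Default x-axis limits: the observed range of log2 fold changes, ignoring NA.
xlim_default <- c(min(toptable[, x], na.rm = TRUE),
                  max(toptable[, x], na.rm = TRUE))

# Default y-axis limits: 0 up to the largest -log10(p), plus 5 units of headroom.
ylim_default <- c(0, max(-log10(toptable[, y]), na.rm = TRUE) + 5)

# In R, 10e-6 is the same number as 1e-5, so the default significance
# cut-off corresponds to a height of 5 on the -log10(P) axis.
pCutoff <- 10e-6
p_line  <- -log10(pCutoff)
```

Note that the default pCutoff is written as 10e-6, which is 1e-5 rather than 1e-6; the same applies to values such as 10e-4 used in the example further down.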
toptable: A data frame of test statistics (if it is not a data frame, an attempt will be made to convert it to one). Requires at least the following: a column for transcript names (can be rownames); a column for log2 fold changes; a column for nominal or adjusted p-values. REQUIRED.
lab: A column name in toptable containing transcript names. Can be rownames(toptable). REQUIRED.
x: A column name in toptable containing log2 fold changes. REQUIRED.
y: A column name in toptable containing nominal or adjusted p-values. REQUIRED.
selectLab: A vector containing a subset of lab. DEFAULT = NULL. OPTIONAL.
xlim: Limits of the x-axis. DEFAULT = c(min(toptable[,x], na.rm=TRUE), max(toptable[,x], na.rm=TRUE)). OPTIONAL.
ylim: Limits of the y-axis. DEFAULT = c(0, max(-log10(toptable[,y]), na.rm=TRUE) + 5). OPTIONAL.
xlab: Label for x-axis. DEFAULT = bquote(~Log[2]~ "fold change"). OPTIONAL.
ylab: Label for y-axis. DEFAULT = bquote(~-Log[10]~italic(P)). OPTIONAL.
axisLabSize: Size of x- and y-axis labels. DEFAULT = 18. OPTIONAL.
title: Plot title. DEFAULT = 'Volcano plot'. OPTIONAL.
subtitle: Plot subtitle. DEFAULT = 'EnhancedVolcano'. OPTIONAL.
caption: Plot caption. DEFAULT = paste0('Total = ', nrow(toptable), ' variables'). OPTIONAL.
titleLabSize: Size of plot title. DEFAULT = 18. OPTIONAL.
subtitleLabSize: Size of plot subtitle. DEFAULT = 14. OPTIONAL.
captionLabSize: Size of plot caption. DEFAULT = 14. OPTIONAL.
pCutoff: Cut-off for statistical significance. A horizontal line will be drawn at -log10(pCutoff). DEFAULT = 10e-6. OPTIONAL.
pLabellingCutoff: Labelling cut-off for statistical significance. DEFAULT = pCutoff. OPTIONAL.
FCcutoff: Cut-off for absolute log2 fold-change. Vertical lines will be drawn at the negative and positive values of FCcutoff. DEFAULT = 1.0. OPTIONAL.
cutoffLineType: Line type for FCcutoff and pCutoff ("blank", "solid", "dashed", "dotted", "dotdash", "longdash", "twodash"). DEFAULT = "longdash". OPTIONAL.
cutoffLineCol: Line colour for FCcutoff and pCutoff. DEFAULT = "black". OPTIONAL.
cutoffLineWidth: Line width for FCcutoff and pCutoff. DEFAULT = 0.4. OPTIONAL.
transcriptPointSize: Size of plotted points for each transcript. DEFAULT = 0.8. OPTIONAL.
transcriptLabSize: Size of labels for each transcript. DEFAULT = 3.0. OPTIONAL.
transcriptLabCol: Colour of labels for each transcript. DEFAULT = 'black'. OPTIONAL.
transcriptLabFace: Font face of labels for each transcript. DEFAULT = 'plain'. OPTIONAL.
transcriptLabhjust: Horizontal adjustment of label for each transcript. DEFAULT = 0. OPTIONAL.
transcriptLabvjust: Vertical adjustment of label for each transcript. DEFAULT = 1.5. OPTIONAL.
boxedlabels: Logical, indicating whether or not to draw labels in boxes. DEFAULT = FALSE. OPTIONAL.
shape: Shape of the plotted points. Either a single value for all points, or 4 values corresponding to < abs(FCcutoff) && > pCutoff, > abs(FCcutoff), < pCutoff, > abs(FCcutoff) && < pCutoff. DEFAULT = 19. OPTIONAL.
shapeCustom: Named vector / key-value pairs that will over-ride the default shape scheme. The order must match that of toptable. Names / keys relate to groups / categories; values relate to shape encodings. DEFAULT = NULL. OPTIONAL.
col: Colour shading for plotted points, corresponding to < abs(FCcutoff) && > pCutoff, > abs(FCcutoff), < pCutoff, > abs(FCcutoff) && < pCutoff. DEFAULT = c("grey30", "forestgreen", "royalblue", "red2"). OPTIONAL.
colCustom: Named vector / key-value pairs that will over-ride the default colour scheme. The order must match that of toptable. Names / keys relate to groups / categories; values relate to colour. DEFAULT = NULL. OPTIONAL.
colAlpha: Alpha for purposes of controlling colour transparency of transcript points. DEFAULT = 1/2. OPTIONAL.
legend: Plot legend key. DEFAULT = c("NS", "Log2 FC", "P", "P & Log2 FC"). OPTIONAL.
legendLabels: Plot legend text labels. DEFAULT = c('NS', expression(Log[2]~FC), "p-value", expression(p-value~and~log[2]~FC)). OPTIONAL.
legendPosition: Position of legend ("top", "bottom", "left", "right"). DEFAULT = "top". OPTIONAL.
legendLabSize: Size of plot legend text. DEFAULT = 14. OPTIONAL.
legendIconSize: Size of plot legend icons / symbols. DEFAULT = 4.0. OPTIONAL.
legendVisible: Logical, indicating whether or not to show the legend. DEFAULT = TRUE. OPTIONAL.
shade: A vector of transcript names to shade. DEFAULT = NULL. OPTIONAL.
shadeLabel: Label for the transcripts to shade. DEFAULT = NULL. OPTIONAL.
shadeAlpha: Alpha for purposes of controlling colour transparency of shaded regions. DEFAULT = 1/2. OPTIONAL.
shadeFill: Colour of shaded regions. DEFAULT = "grey". OPTIONAL.
shadeSize: Size of the shade contour lines. DEFAULT = 0.01. OPTIONAL.
shadeBins: Number of bins for the density of the shade. DEFAULT = 2. OPTIONAL.
drawConnectors: Logical, indicating whether or not to connect plot labels to their corresponding points by line connectors. DEFAULT = FALSE. OPTIONAL.
widthConnectors: Line width of connectors. DEFAULT = 0.5. OPTIONAL.
typeConnectors: Have the arrow head open or filled ('closed')? ('open', 'closed'). DEFAULT = 'closed'. OPTIONAL.
endsConnectors: Which end of connectors to draw arrow head? ('last', 'first', 'both'). DEFAULT = 'first'. OPTIONAL.
lengthConnectors: Length of the connectors. DEFAULT = unit(0.01, 'npc'). OPTIONAL.
colConnectors: Line colour of connectors. DEFAULT = 'grey10'. OPTIONAL.
hline: Draw one or more horizontal lines passing through this/these values on y-axis. For single values, only a single numerical value is necessary. For multiple lines, pass these as a vector, e.g., c(60,90). DEFAULT = NULL. OPTIONAL.
hlineType: Line type for hline ('blank', 'solid', 'dashed', 'dotted', 'dotdash', 'longdash', 'twodash'). DEFAULT = 'longdash'. OPTIONAL.
hlineCol: Colour of hline. DEFAULT = 'black'. OPTIONAL.
hlineWidth: Width of hline. DEFAULT = 0.4. OPTIONAL.
vline: Draw one or more vertical lines passing through this/these values on x-axis. For single values, only a single numerical value is necessary. For multiple lines, pass these as a vector, e.g., c(60,90). DEFAULT = NULL. OPTIONAL.
vlineType: Line type for vline ('blank', 'solid', 'dashed', 'dotted', 'dotdash', 'longdash', 'twodash'). DEFAULT = 'longdash'. OPTIONAL.
vlineCol: Colour of vline. DEFAULT = 'black'. OPTIONAL.
vlineWidth: Width of vline. DEFAULT = 0.4. OPTIONAL.
gridlines.major: Logical, indicating whether or not to draw major gridlines. DEFAULT = TRUE. OPTIONAL.
gridlines.minor: Logical, indicating whether or not to draw minor gridlines. DEFAULT = TRUE. OPTIONAL.
border: Add a border for just the x and y axes ('partial') or the entire plot grid ('full')? DEFAULT = 'partial'. OPTIONAL.
borderWidth: Width of the border on the x and y axes. DEFAULT = 0.8. OPTIONAL.
borderColour: Colour of the border on the x and y axes. DEFAULT = "black". OPTIONAL.
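The four point categories that the shape, col and legend arguments refer to can be sketched in base R as follows. This is an illustration of the rule as described above, not the package's internal code: the helper classify_points is hypothetical, the handling of values exactly on a cut-off is an assumption, and inputs are assumed to be free of NA.

```r
# Hypothetical helper: assign each point to one of the four categories
# described for `shape`, `col` and `legend`. Inputs are assumed NA-free.
classify_points <- function(log2fc, p, pCutoff = 10e-6, FCcutoff = 1.0) {
  big_fc  <- abs(log2fc) > FCcutoff   # passes the fold-change cut-off
  small_p <- p < pCutoff              # passes the significance cut-off

  group <- rep("NS", length(p))
  group[big_fc & !small_p] <- "Log2 FC"
  group[!big_fc & small_p] <- "P"
  group[big_fc & small_p]  <- "P & Log2 FC"
  factor(group, levels = c("NS", "Log2 FC", "P", "P & Log2 FC"))
}

groups <- classify_points(
  log2fc = c(0.2, 2.5, -0.4, -3.1),
  p      = c(0.5, 0.01, 1e-9, 1e-12)
)

# Look up the default colour for each point by its category.
default_col <- c("grey30", "forestgreen", "royalblue", "red2")
point_col   <- default_col[as.integer(groups)]
```

Under these assumptions, the order of the factor levels matches the order of the four values expected by shape and col, so a 4-element vector supplied to either argument lines up with NS, Log2 FC, P, and P & Log2 FC respectively.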
A ggplot2 object.
Volcano plots are a useful way to visualise the results of differential expression analyses. Here, we present a highly configurable function that produces publication-ready volcano plots [@EnhancedVolcano]. EnhancedVolcano will attempt to fit as many transcript names in the plot window as possible while leaving out labels that could not be read, so the plot is not 'clogged' with illegible text.
# NOT RUN {
# Load the pasilla count matrix and sample annotation
library("pasilla")
pasCts <- system.file("extdata", "pasilla_gene_counts.tsv",
  package="pasilla", mustWork=TRUE)
pasAnno <- system.file("extdata", "pasilla_sample_annotation.csv",
  package="pasilla", mustWork=TRUE)
cts <- as.matrix(read.csv(pasCts,sep="\t",row.names="gene_id"))
coldata <- read.csv(pasAnno, row.names=1)
coldata <- coldata[,c("condition","type")]

# Make sample names match between annotation and counts, then reorder counts
rownames(coldata) <- sub("fb", "", rownames(coldata))
cts <- cts[, rownames(coldata)]

# Run the DESeq2 differential expression analysis
library("DESeq2")
dds <- DESeqDataSetFromMatrix(countData = cts,
  colData = coldata,
  design = ~ condition)
featureData <- data.frame(gene=rownames(cts))
mcols(dds) <- DataFrame(mcols(dds), featureData)
dds <- DESeq(dds)
res <- results(dds)

# Draw the volcano plot from the DESeq2 results table
EnhancedVolcano(res,
  lab = rownames(res),
  x = "log2FoldChange",
  y = "pvalue",
  pCutoff = 10e-4,
  FCcutoff = 1.333,
  xlim = c(-5.5, 5.5),
  ylim = c(0, -log10(10e-12)),
  transcriptPointSize = 1.5,
  transcriptLabSize = 2.5,
  shape = c(6, 6, 19, 16),
  title = "DESeq2 results",
  subtitle = "Differential expression",
  caption = "FC cutoff, 1.333; p-value cutoff, 10e-4",
  legendPosition = "right",
  legendLabSize = 14,
  col = c("grey30", "forestgreen", "royalblue", "red2"),
  colAlpha = 0.9,
  drawConnectors = TRUE,
  hline = c(10e-8),
  widthConnectors = 0.5)
# }
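The colCustom and shapeCustom arguments expect a named vector with one entry per row of toptable, in row order. A minimal base-R sketch of building such a vector for colCustom is given below; the toy table res_toy, the 2.5 threshold, the colours and the group names ("low", "mid", "high") are all illustrative assumptions.

```r
# Toy table standing in for a results object; values are illustrative.
res_toy <- data.frame(
  log2FoldChange = c(-3.2, 0.1, 2.9, 1.0),
  row.names      = c("geneA", "geneB", "geneC", "geneD")
)

# Values: one colour per row, in the same order as the rows of the table.
keyvals <- ifelse(res_toy$log2FoldChange < -2.5, "royalblue",
           ifelse(res_toy$log2FoldChange >  2.5, "gold", "black"))

# Names / keys: the group label that each colour stands for.
names(keyvals) <- ifelse(res_toy$log2FoldChange < -2.5, "low",
                  ifelse(res_toy$log2FoldChange >  2.5, "high", "mid"))
```

A vector built this way has the shape described for colCustom (keys are groups, values are colours, order matches the table) and could then be supplied as colCustom = keyvals. A vector for shapeCustom would be built the same way with shape codes as values.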