plot: Plot a log fold-change versus log average expression (so-called M-A plot)

Description

This function generates a scatter plot of log fold-change ($M = \log_2(G2) - \log_2(G1)$, on the $y$-axis, comparing Group 2 with Group 1) versus log average expression ($A = (\log_2(G1) + \log_2(G2)) / 2$, on the $x$-axis) using normalized count data.
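The definitions of M and A above can be checked by hand on a small matrix. The following is a minimal base-R sketch, not the package's own implementation: the count values are made up, and it assumes that G1 and G2 are the per-gene mean normalized counts of each group.

```r
# Toy normalized counts: 4 genes x 6 samples (values are made up for illustration).
counts <- matrix(
  c( 10,  12,  11,  40,  44,  48,
    100,  90, 110, 100,  95, 105,
      8,   8,   8,   2,   2,   2,
     64,  64,  64,  64,  64,  64),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene_", 1:4), paste0("S", 1:6))
)
group <- c(1, 1, 1, 2, 2, 2)

# Assumption: each group is summarized by its per-gene mean count.
g1 <- rowMeans(counts[, group == 1, drop = FALSE])
g2 <- rowMeans(counts[, group == 2, drop = FALSE])

m_value <- log2(g2) - log2(g1)        # log fold-change (y-axis)
a_value <- (log2(g1) + log2(g2)) / 2  # log average expression (x-axis)

# Draw the M-A scatter plot with base graphics.
plot(a_value, m_value, xlab = "A", ylab = "M", pch = 20)
```

Here gene_1 has group means of 11 and 44, so its M value is $\log_2(44/11) = 2$, while gene_2 and gene_4 have equal group means and therefore sit on the line M = 0.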

Usage

plot(x, FDR = NULL, median.lines = FALSE, floor = 0, group = NULL, col = NULL, col.tag = NULL, normalize = TRUE, ...)

Arguments

x

TCC-class object.

FDR

numeric scalar specifying a false discovery rate (FDR) threshold for determining differentially expressed genes (DEGs).

median.lines

logical. If TRUE, horizontal lines specifying the median M values for non-DEGs (black) and DEGs (red) are drawn.

floor

numeric scalar specifying a threshold for adjusting low count data.

group

numeric vector of two elements specifying which two groups should be drawn when the data contain more than two groups.

col

vector specifying plotting color.

col.tag

numeric vector specifying, for each gene, the index of col used to color its point.

normalize

logical. If FALSE, the coordinates of the M-A plot are calculated from the raw count data instead of the normalized count data.

...

further graphical arguments, see plot.default.
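The interplay of FDR, col, col.tag, and median.lines can be pictured with plain base graphics. The sketch below is illustrative only and does not call the package: the M, A, and q-values are made up, the rule "q-value below the threshold means DEG" is an assumption, and the variable names are hypothetical.

```r
# Made-up coordinates and q-values for 8 genes (illustration only).
a_value <- c(2, 3, 4, 5, 6, 7, 8, 9)
m_value <- c(0.1, -0.2, 2.1, 0.0, -1.9, 0.3, 2.4, -0.1)
q_value <- c(0.80, 0.55, 0.01, 0.92, 0.03, 0.40, 0.002, 0.67)

fdr <- 0.1                       # plays the role of the FDR argument
col <- c("black", "magenta")     # plays the role of the col argument

# Assumption: a gene is called a DEG when its q-value is below the threshold.
# col_tag holds, for each gene, the index into col (the role of col.tag).
col_tag <- ifelse(q_value < fdr, 2, 1)

plot(a_value, m_value, col = col[col_tag], pch = 20, xlab = "A", ylab = "M")

# Equivalent of median.lines = TRUE: one horizontal line per class.
median_non_deg <- median(m_value[col_tag == 1])
median_deg     <- median(m_value[col_tag == 2])
abline(h = median_non_deg, col = "black")
abline(h = median_deg, col = "red")
```

With these values, genes 3, 5, and 7 fall below the threshold and are drawn in the second color, and the two horizontal lines sit at the median M value of each class.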

Value

A scatter plot drawn on the current graphics device.

Details

This function generates one of roughly three different M-A plots, depending on the state of the TCC-class object. When the function is called immediately after the object is constructed with the new method, all the genes (points) are treated as non-DEGs (black by default; see Example 1). When the object comes from the simulateReadCounts function, the plot distinguishes true non-DEGs (black) from true DEGs (see Example 2). When the estimateDE function has been run, the plot distinguishes estimated DEGs (magenta) from the remaining estimated non-DEGs (black) (see Example 3).

Genes with a normalized count of 0 in either group cannot be placed at their true position on the M-A plot because their M and A values cannot be calculated (as $\log 0$ is undefined). Instead, these points are plotted at the left side of the M-A plot: the $x$ coordinate is the minimum A (i.e., log average expression) value minus one, and the $y$ coordinate is calculated as if the zero count were the minimum observed nonzero count in that group.
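The placement rule for zero-count genes can be written out in a few lines of base R. This is a sketch of the rule as described here, not the package's source code: the group summaries g1 and g2 are made-up values, and "minimum A" is assumed to mean the minimum over genes whose A value is computable.

```r
# Made-up per-gene group summaries; genes 3 and 4 have a zero in one group.
g1 <- c(8, 16, 0, 32, 4)
g2 <- c(8, 4, 16, 0, 4)

has_zero <- g1 == 0 | g2 == 0

# Direct calculation gives non-finite values wherever a zero is present.
m_value <- log2(g2) - log2(g1)
a_value <- (log2(g1) + log2(g2)) / 2

# y coordinate: treat each zero as the minimum observed nonzero count
# in the same group, then compute M as usual.
g1_filled <- ifelse(g1 == 0, min(g1[g1 > 0]), g1)
g2_filled <- ifelse(g2 == 0, min(g2[g2 > 0]), g2)
m_value[has_zero] <- (log2(g2_filled) - log2(g1_filled))[has_zero]

# x coordinate: one unit to the left of the smallest computable A value.
a_value[has_zero] <- min(a_value[!has_zero]) - 1

print(data.frame(g1, g2, A = a_value, M = m_value))
```

In this toy input the smallest computable A value is 2, so both zero-count genes are drawn at A = 1, and their M values (2 and -3) come from substituting the group-wise minimum nonzero count of 4 for the zero.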

Examples

# Example 1. 
# M-A plotting just after constructing the TCC class object from
# hypoData. In this case, the plot is generated from hypoData
# that has been scaled in such a way that the library sizes of 
# each sample are the same as the mean library size of the
# original hypoData. Note that all points are in black. This is
# because the information about DEG or non-DEG for each gene is 
# not indicated.
data(hypoData)
group <- c(1, 1, 1, 2, 2, 2)
tcc <- new("TCC", hypoData, group)
plot(tcc)

normalized.count <- getNormalizedData(tcc)
colSums(normalized.count)
colSums(hypoData)
mean(colSums(hypoData))


# Example 2. 
# M-A plotting of DEGES/edgeR-normalized simulation data.
# It can be seen that the median M value for non-DEGs approaches
# zero. Note that non-DEGs are in black, DEGs are in red.
tcc <- simulateReadCounts()
tcc <- calcNormFactors(tcc, norm.method = "tmm", test.method = "edger",
                       iteration = 1, FDR = 0.1, floorPDEG = 0.05)
plot(tcc, median.lines = TRUE)


# Example 3. 
# M-A plotting of DEGES/edgeR-normalized hypoData after performing
# DE analysis.
data(hypoData)
group <- c(1, 1, 1, 2, 2, 2)
tcc <- new("TCC", hypoData, group)
tcc <- calcNormFactors(tcc, norm.method = "tmm", test.method = "edger",
                       iteration = 1, FDR = 0.1, floorPDEG = 0.05)
tcc <- estimateDE(tcc, test.method = "edger", FDR = 0.1)
plot(tcc)

# Changing the FDR threshold
plot(tcc, FDR = 0.7)
