TCGAbiolinks (version 1.2.5)

TCGAanalyze_DMR: Differentially methylated regions Analysis

Description

This function will search for differentially methylated CpG sites, which are regarded as possible functional regions involved in gene transcriptional regulation.

In order to find these regions we use the beta-values (methylation values ranging from 0.0 to 1.0) to compare two groups.

Firstly, it calculates the difference between the mean methylation of each group for each probes. Secondly, it calculates the p-value using the wilcoxon test using the Benjamini-Hochberg adjustment method. The default parameters will require a minimum absolute beta values delta of 0.2 and a false discovery rate (FDR)-adjusted Wilcoxon rank-sum P-value of < 0.01 for the difference.

After these analysis, we save a volcano plot (x-axis:diff mean methylation, y-axis: significance) that will help the user identify the differentially methylated CpG sites and return the object with the calculus in the rowRanges.

If the calculus already exists in the object it will not recalculated. You should set overwrite parameter to TRUE to force it, or remove the collumns with the results from the object.

Usage

TCGAanalyze_DMR(data, groupCol = NULL, group1 = NULL, group2 = NULL, plot.filename = "methylation_volcano.pdf", ylab = expression(paste(-Log[10], " (FDR corrected -P values)")), xlab = expression(paste("DNA Methylation difference (", beta, "-values)")), title = NULL, legend = "Legend", color = c("black", "red", "darkgreen"), label = NULL, xlim = NULL, ylim = NULL, p.cut = 0.01, probe.names = FALSE, diffmean.cut = 0.2, paired = FALSE, adj.method = "BH", overwrite = FALSE, cores = 1, save = TRUE, filename = NULL)

Arguments

data
SummarizedExperiment obtained from the TCGAPrepare
groupCol
Columns with the groups inside the SummarizedExperiment object. (This will be obtained by the function colData(data))
group1
In case our object has more than 2 groups, you should set the name of the group
group2
In case our object has more than 2 groups, you should set the name of the group
plot.filename
Filename. Default: volcano.pdf, volcano.svg, volcano.png
ylab
y axis text
xlab
x axis text
title
main title. If not specified it will be "Volcano plot (group1 vs group2)
legend
Legend title
color
vector of colors to be used in graph
label
vector of labels to be used in the figure. Example: c("Not Significant","Hypermethylated in group1", "Hypomethylated in group1"))
xlim
x limits to cut image
ylim
y limits to cut image
p.cut
p values threshold. Default: 0.01
probe.names
is probe.names
diffmean.cut
diffmean threshold. Default: 0.2
paired
Wilcoxon paired parameter. Default: FALSE
adj.method
Adjusted method for the p-value calculation
overwrite
Overwrite the pvalues and diffmean values if already in the object for both groups? Default: FALSE
cores
Number of cores to be used in the non-parametric test Default = groupCol.group1.group2.rda
save
Save object with results? Default: TRUE
filename
Name of the file to save the object.

Value

Volcano plot saved and the given data with the results (diffmean.group1.group2,p.value.group1.group2, p.value.adj.group1.group2,status.group1.group2) in the rowRanges where group1 and group2 are the names of the groups

Examples

Run this code
nrows <- 200; ncols <- 20
counts <- matrix(runif(nrows * ncols, 1, 1e4), nrows)
rowRanges <- GenomicRanges::GRanges(rep(c("chr1", "chr2"), c(50, 150)),
                   IRanges::IRanges(floor(runif(200, 1e5, 1e6)), width=100),
                    strand=sample(c("+", "-"), 200, TRUE),
                    feature_id=sprintf("ID%03d", 1:200))
colData <- S4Vectors::DataFrame(Treatment=rep(c("ChIP", "Input"), 5),
                    row.names=LETTERS[1:20],
                    group=rep(c("group1","group2"),c(10,10)))
data <- SummarizedExperiment::SummarizedExperiment(
         assays=S4Vectors::SimpleList(counts=counts),
         rowRanges=rowRanges,
         colData=colData)
SummarizedExperiment::colData(data)$group <- c(rep("group1",ncol(data)/2),
                         rep("group2",ncol(data)/2))
hypo.hyper <- TCGAanalyze_DMR(data, p.cut = 0.85,"group","group1","group2")

Run the code above in your browser using DataLab