sRAP (version 1.12.0)

RNA.deg: Differential Expression Statistics

Description

Provides a table of differentially expressed genes (in .xlsx format) as well as differential expression statistics for all genes (both as an .xlsx file and as the returned data frame). The function automatically creates a heatmap of the differentially expressed genes, and the user can optionally create box-plots for each individual differentially expressed gene. The efficacy of this protocol is described in [1].

Output files will be created in the "DEG" and "Raw_Data" subfolders.

Usage

RNA.deg(sample.file, expression.table, project.name, project.folder, log2.fc.cutoff = 0.58, pvalue.cutoff = 0.05, fdr.cutoff = 0.05, box.plot = TRUE, ref.group = FALSE, ref = "none", method = "lm", color.palette = c("green", "orange", "purple", "cyan", "pink", "maroon", "yellow", "grey", "black", colors()), legend.status = FALSE)

Arguments

sample.file
Tab-delimited text file providing group attributions for all samples considered for analysis.
expression.table
Data frame with genes in columns and samples in rows. Data should be log2 transformed. The RNA.norm function automatically creates this file.
project.name
Name for sRAP project. This determines the names for output files.
project.folder
Folder for sRAP output files
log2.fc.cutoff
If the primary variable contains two groups with a specified reference, this is the log2 fold-change cut-off used to define differentially expressed genes (default = 0.58, which corresponds to a fold-change of about 1.5 on a linear scale). Otherwise, this variable is ignored.
pvalue.cutoff
Maximum p-value for a gene to be defined as differentially expressed.
fdr.cutoff
Maximum false discovery rate (FDR) for a gene to be defined as differentially expressed.
box.plot
A logical value: Should box-plots be created for all differentially expressed genes? If TRUE, then box-plots will be created in a separate subfolder.
ref.group
A logical value: Does the primary variable consist of two groups, one of which is a reference group?
ref
If the primary variable contains two groups (indicated by ref.group = TRUE), this is the reference used to calculate fold-change values (so, the mean expression for the reference group is subtracted from that of the treatment group). Otherwise, this variable is ignored.
method
Method for calculating p-values: "lm" (default) = linear regression; "aov" = ANOVA.
color.palette
Colors for primary variable (specified in the second column of the sample file). If the primary variable is a continuous variable, this parameter is ignored.
legend.status
Logical value. Should legend be added to heatmap?
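
To illustrate the two options for the method argument, here is a minimal base R sketch that computes a p-value for a single gene with lm() and with aov(). The group labels and expression values are made up, and the code is not taken from sRAP, so it shows only the general idea and may differ from what RNA.deg does internally.

```r
# Toy log2 expression values for one gene (made-up numbers):
# three reference samples ("scramble") and three treated samples.
group <- factor(c("scramble", "scramble", "scramble",
                  "treated", "treated", "treated"),
                levels = c("scramble", "treated"))
expr <- c(5.1, 4.9, 5.3, 6.8, 7.1, 6.9)

# Linear regression: p-value for the group coefficient
# (second row of the coefficient table; the first row is the intercept).
lm.fit <- lm(expr ~ group)
lm.pvalue <- summary(lm.fit)$coefficients[2, "Pr(>|t|)"]

# One-way ANOVA: p-value from the F-test on the group term.
aov.fit <- aov(expr ~ group)
aov.pvalue <- summary(aov.fit)[[1]][["Pr(>F)"]][1]

print(c(lm = lm.pvalue, aov = aov.pvalue))
```

With a single two-group variable, the t-test on the coefficient and the F-test are equivalent, so both approaches give the same p-value in this toy case. They can differ when the primary variable has more than two groups or is continuous.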

Value

Data frame containing differential expression statistics. The first column contains the gene name. If the primary variable contains two groups (with a specified reference), then fold-change values are provided in the second column. P-values and FDR values are provided for each variable in subsequent columns, starting with the primary variable.
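
As a rough sketch of how the three cut-offs interact with such a table, the base R code below builds a small made-up statistics table and flags genes that pass all thresholds. The column names (gene, treated.mean, reference.mean, pvalue, log2.fc, fdr) are hypothetical and do not reflect the actual column names returned by RNA.deg. The use of an absolute fold-change, strict inequalities, and Benjamini-Hochberg adjustment via p.adjust() are likewise assumptions made for illustration.

```r
# Hypothetical statistics table; names and values are illustrative only.
toy.stats <- data.frame(
  gene = c("geneA", "geneB", "geneC", "geneD"),
  treated.mean = c(7.0, 5.2, 3.1, 8.4),    # mean log2 expression, treatment group
  reference.mean = c(5.0, 5.0, 4.9, 8.3),  # mean log2 expression, reference group
  pvalue = c(0.001, 0.40, 0.004, 0.03),
  stringsAsFactors = FALSE
)

# log2 fold-change: reference mean subtracted from treatment mean.
toy.stats$log2.fc <- toy.stats$treated.mean - toy.stats$reference.mean

# FDR via Benjamini-Hochberg adjustment (assumed adjustment method).
toy.stats$fdr <- p.adjust(toy.stats$pvalue, method = "fdr")

# Default cut-offs from the Usage section.
log2.fc.cutoff <- 0.58   # 2^0.58 is roughly a 1.5-fold change
pvalue.cutoff <- 0.05
fdr.cutoff <- 0.05

# A gene is flagged only if it passes all three thresholds.
is.deg <- abs(toy.stats$log2.fc) > log2.fc.cutoff &
  toy.stats$pvalue < pvalue.cutoff &
  toy.stats$fdr < fdr.cutoff
deg.genes <- toy.stats$gene[is.deg]
print(deg.genes)
```

In this toy table, geneD has a small p-value and FDR but is not flagged because its fold-change is below the cut-off, which shows why the fold-change filter matters in a two-group comparison.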

References

[1] Warden CD, Yuan Y-C, and Wu X. (2013). Optimal Calculation of RNA-Seq Fold-Change Values. Int J Comput Bioinfo In Silico Model, 2(6): 285-292

See Also

sRAP goes through an entire analysis for an example dataset provided with the sRAP package.

Please post questions on the sRAP discussion group: http://sourceforge.net/p/bdfunc/discussion/srap/

Examples

library("sRAP")

dir <- system.file("extdata", package="sRAP")
expression.table <- file.path(dir,"MiSeq_cufflinks_genes_truncate.txt")
sample.table <- file.path(dir,"MiSeq_Sample_Description.txt")
project.folder <- getwd()
project.name <- "MiSeq_DEG"

expression.mat <- RNA.norm(expression.table, project.name, project.folder)

stat.table <- RNA.deg(sample.table, expression.mat, project.name, project.folder, box.plot=FALSE, ref.group=TRUE, ref="scramble",method="aov", color.palette=c("green","orange"), legend.status=TRUE)

#stat.table <- RNA.deg(sample.table, expression.mat, project.name, project.folder, box.plot=FALSE, ref.group=TRUE, ref="scramble", method="aov", color.palette=c("green","orange"))
