DEGseq (version 1.26.0)

DEGexp: Identifying Differentially Expressed Genes from gene expression data

Description

This function is used to identify differentially expressed genes when users already have the gene expression values (or the number of reads mapped to each gene).

Usage

DEGexp(geneExpMatrix1, geneCol1=1, expCol1=2, depth1=rep(0, length(expCol1)), groupLabel1="group1",
       geneExpMatrix2, geneCol2=1, expCol2=2, depth2=rep(0, length(expCol2)), groupLabel2="group2",
       method=c("LRT", "CTR", "FET", "MARS", "MATR", "FC"), 
       pValue=1e-3, zScore=4, qValue=1e-3, foldChange=4, 
       thresholdKind=1, outputDir="none", normalMethod=c("none", "loess", "median"),
       replicateExpMatrix1=NULL, geneColR1=1, expColR1=2, depthR1=rep(0, length(expColR1)), replicateLabel1="replicate1",
       replicateExpMatrix2=NULL, geneColR2=1, expColR2=2, depthR2=rep(0, length(expColR2)), replicateLabel2="replicate2", rawCount=TRUE)

Arguments

geneExpMatrix1
gene expression matrix for replicates of sample1 (or replicate1 when method="CTR").
geneCol1
gene id column in geneExpMatrix1.
expCol1
expression value columns in geneExpMatrix1 for replicates of sample1 (numeric vector). Note: Each column corresponds to a replicate of sample1.
depth1
the total number of reads uniquely mapped to the genome for each replicate of sample1 (numeric vector). Default: use the total number of reads mapped to all annotated genes as the depth for each replicate.
groupLabel1
label of group1 on the plots.
geneExpMatrix2
gene expression matrix for replicates of sample2 (or replicate2 when method="CTR").
geneCol2
gene id column in geneExpMatrix2.
expCol2
expression value columns in geneExpMatrix2 for replicates of sample2 (numeric vector). Note: Each column corresponds to a replicate of sample2.
depth2
the total number of reads uniquely mapped to the genome for each replicate of sample2 (numeric vector). Default: use the total number of reads mapped to all annotated genes as the depth for each replicate.
groupLabel2
label of group2 on the plots.
method
method to identify differentially expressed genes. Possible methods are:
  • "LRT": Likelihood Ratio Test (Marioni et al. 2008),
  • "CTR": Check whether the variation between Technical Replicates can be explained by the random sampling model (Wang et al. 2010),
  • "FET": Fisher's Exact Test (Bloom et al. 2009),
  • "MARS": MA-plot-based method with Random Sampling model (Wang et al. 2010),
  • "MATR": MA-plot-based method with Technical Replicates (Wang et al. 2010),
  • "FC": Fold-Change threshold on MA-plot.
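
The MA-plot-based options above ("MARS", "MATR", "FC") work on per-gene M and A values, where M is the log2 ratio of the two expression values and A is their mean log2 level. The base-R sketch below computes both for invented counts and applies a plain fold-change cutoff on M. The gene names and counts are made up, sequencing-depth normalization is left out, and the cutoff rule is one plausible reading of a fold-change threshold rather than DEGseq's internal code.

```r
# Invented read counts for four genes in two samples (illustration only)
count1 <- c(geneA = 160, geneB = 15, geneC = 800, geneD = 64)
count2 <- c(geneA = 32,  geneB = 16, geneC = 790, geneD = 4)

# M: log2 fold change between the samples; A: mean log2 expression
M <- log2(count1) - log2(count2)
A <- (log2(count1) + log2(count2)) / 2

# Assumed rule: flag a gene when its fold change reaches the threshold
# in either direction, i.e. |M| >= log2(foldChange)
foldChange <- 4
flagged <- abs(M) >= log2(foldChange)

print(data.frame(M = round(M, 2), A = round(A, 2), flagged = flagged))
```

With these counts only geneA (5-fold) and geneD (16-fold) pass the 4-fold cutoff.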
pValue
pValue threshold (for the methods: LRT, FET, MARS, MATR). Only used when thresholdKind=1.
zScore
zScore threshold (for the methods: MARS, MATR). Only used when thresholdKind=2.
qValue
qValue threshold (for the methods: LRT, FET, MARS, MATR). Only used when thresholdKind=3 or thresholdKind=4.
thresholdKind
the kind of threshold. Possible kinds are:
  • 1: pValue threshold,
  • 2: zScore threshold,
  • 3: qValue threshold (Benjamini and Hochberg 1995),
  • 4: qValue threshold (Storey and Tibshirani 2003),
  • 5: qValue threshold (Storey and Tibshirani 2003) and Fold-Change threshold on MA-plot are both required (can be used only when method="MARS").
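
For thresholdKind=3, the Benjamini-Hochberg adjustment is available in base R through p.adjust(). The sketch below uses invented p-values and assumes a gene is called significant when its adjusted value falls below the qValue threshold; the exact comparison used inside DEGexp is not stated on this page. The Storey-Tibshirani variant (thresholdKind=4) needs a separate estimator and is not shown.

```r
# Invented per-gene p-values (illustration only)
p <- c(g1 = 0.0001, g2 = 0.0003, g3 = 0.02, g4 = 0.03, g5 = 0.5)

# Benjamini-Hochberg adjusted values, one per gene
q <- p.adjust(p, method = "BH")

# Assumed decision rule: adjusted value below the qValue threshold
qValue <- 1e-3
significant <- q < qValue

print(data.frame(p = p, q = q, significant = significant))
```

Here g1 and g2 remain significant after adjustment, while g3 to g5 do not.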
foldChange
fold change threshold on MA-plot (for the method: FC).
outputDir
the output directory.
normalMethod
the normalization method: "none", "loess", "median" (Yang et al. 2002). Recommended: "none".
replicateExpMatrix1
matrix containing gene expression values for replicate batch1 (only used when method="MATR"). Note: replicate1 and replicate2 are two (groups of) technical replicates of a sample.
geneColR1
gene id column in the expression matrix for replicate batch1 (only used when method="MATR").
expColR1
expression value columns in the expression matrix for replicate batch1 (numeric vector) (only used when method="MATR").
depthR1
the total number of reads uniquely mapped to the genome for each replicate in replicate batch1 (numeric vector; only used when method="MATR"). Default: use the total number of reads mapped to all annotated genes as the depth for each replicate.
replicateLabel1
label of replicate batch1 on the plots (only used when method="MATR").
replicateExpMatrix2
matrix containing gene expression values for replicate batch2 (only used when method="MATR"). Note: replicate1 and replicate2 are two (groups of) technical replicates of a sample.
geneColR2
gene id column in the expression matrix for replicate batch2 (only used when method="MATR").
expColR2
expression value columns in the expression matrix for replicate batch2 (numeric vector) (only used when method="MATR").
depthR2
the total number of reads uniquely mapped to the genome for each replicate in replicate batch2 (numeric vector; only used when method="MATR"). Default: use the total number of reads mapped to all annotated genes as the depth for each replicate.
replicateLabel2
label of replicate batch2 on the plots (only used when method="MATR").
rawCount
a logical value indicating whether the gene expression values are raw read counts (TRUE) or normalized values (FALSE).
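
The depth arguments (depth1, depth2, depthR1, depthR2) default to a vector of zeros, and their descriptions say the default depth is the total number of reads mapped to all annotated genes. The base-R sketch below shows what that fallback amounts to for a small invented table. Treating a zero entry as "use the column total" is an assumption drawn from the default value in the usage line, not a statement about DEGseq's internals.

```r
# Invented expression table: gene ids in column 1, two replicates after it
expr <- data.frame(
  gene = c("g1", "g2", "g3"),
  rep1 = c(10, 200, 0),
  rep2 = c(12, 180, 3)
)
expCol <- c(2, 3)

# Same default as in the usage line: one zero per expression column
depth <- rep(0, length(expCol))

# Assumed fallback: replace each zero with that replicate's column total
totals <- colSums(expr[, expCol, drop = FALSE])
useDefault <- depth == 0
depth[useDefault] <- totals[useDefault]

print(depth)
```

If the number of uniquely mapped reads per replicate is known, it can be supplied directly as a numeric vector with one entry per expression column.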

References

Benjamini,Y. and Hochberg,Y. (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B, 57, 289-300.

Jiang,H. and Wong,W.H. (2008) Statistical inferences for isoform expression in RNA-seq. Bioinformatics, 25, 1026-1032.

Bloom,J.S. et al. (2009) Measuring differential gene expression by short read sequencing: quantitative comparison to 2-channel gene expression microarrays. BMC Genomics, 10, 221.

Marioni,J.C. et al. (2008) RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays. Genome Res., 18, 1509-1517.

Storey,J.D. and Tibshirani,R. (2003) Statistical significance for genomewide studies. Proc. Natl. Acad. Sci., 100, 9440-9445.

Wang,L.K. et al. (2010) DEGseq: an R package for identifying differentially expressed genes from RNA-seq data. Bioinformatics, 26, 136-138.

Yang,Y.H. et al. (2002) Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Research, 30, e15.

See Also

DEGexp2, DEGseq, getGeneExp, readGeneExp, GeneExpExample1000, GeneExpExample5000.

Examples

library(DEGseq)

## kidney: R1L1Kidney, R1L3Kidney, R1L7Kidney, R2L2Kidney, R2L6Kidney
## liver: R1L2Liver, R1L4Liver, R1L6Liver, R1L8Liver, R2L3Liver

## Locate the example expression file shipped with the package
geneExpFile <- system.file("extdata", "GeneExpExample5000.txt", package="DEGseq")
cat("geneExpFile:", geneExpFile, "\n")
outputDir <- file.path(tempdir(), "DEGexpExample")

## Read the gene id column plus the five kidney and five liver columns
geneExpMatrix1 <- readGeneExp(file=geneExpFile, geneCol=1, valCol=c(7,9,12,15,18))
geneExpMatrix2 <- readGeneExp(file=geneExpFile, geneCol=1, valCol=c(8,10,11,13,16))
geneExpMatrix1[30:32,]
geneExpMatrix2[30:32,]

## Compare kidney and liver with the likelihood ratio test
DEGexp(geneExpMatrix1=geneExpMatrix1, geneCol1=1, expCol1=c(2,3,4,5,6), groupLabel1="kidney",
       geneExpMatrix2=geneExpMatrix2, geneCol2=1, expCol2=c(2,3,4,5,6), groupLabel2="liver",
       method="LRT", outputDir=outputDir)
cat("outputDir:", outputDir, "\n")
