systemPipeR: NGS workflow and report generation environment

systemPipeR is an R/Bioconductor package for building and running automated end-to-end analysis workflows for a wide range of next-generation sequencing (NGS) applications such as RNA-Seq, ChIP-Seq, VAR-Seq and Ribo-Seq. Important features include a uniform workflow interface across different NGS applications, automated report generation, and support for running both R and command-line software, such as NGS aligners or peak/variant callers, on local computers or compute clusters. On clusters, it supports both interactive job submissions and batch submissions to queuing systems. Efficient handling of complex sample sets and experimental designs is facilitated by a well-defined sample annotation infrastructure, which improves the reproducibility and user-friendliness of many typical NGS analysis workflows.

Installation

To install the package, please use the biocLite method as instructed here.

To obtain the most recent updates immediately, install the package directly from GitHub as follows:

# Load the Bioconductor installer script, then install systemPipeR from its GitHub repository
source("http://bioconductor.org/biocLite.R")
biocLite("tgirke/systemPipeR", build_vignettes=TRUE, dependencies=TRUE)

Usage

Instructions for running systemPipeR are given in its main vignette (manual). The sample data used in the vignette are provided by the data package systemPipeRdata. The expected format for defining NGS samples (e.g. FASTQ files) and their labels is illustrated by targets.txt and targetsPE.txt (the latter is for paired-end reads). The run parameters of command-line software are defined by param files that have a simplified JSON-like name/value structure; tophat.param is a sample param file for Tophat2. Templates for setting up custom project reports are provided by systemPipeRdata. PDFs of these report templates are available for systemPipeRNAseq, systemPipeRIBOseq, systemPipeChIPseq and systemPipeVARseq.
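
To make the idea of a targets file more concrete, the following base R sketch writes and reads a small tab-delimited sample table. It is only an illustration and does not call systemPipeR: the column names (FileName, SampleName, Factor), the file paths and the sample labels are assumptions chosen for the example, not a specification of the package's format. The targets.txt file shipped with the package remains the authoritative template.

```r
# Hypothetical targets-style table: the column names, paths and labels
# below are illustrative assumptions, not the package's official layout.
targets_lines <- c(
  "# Lines starting with '#' are treated as comments in this sketch",
  "FileName\tSampleName\tFactor",
  "./data/sample1.fastq.gz\tM1A\tM1",
  "./data/sample2.fastq.gz\tM1B\tM1",
  "./data/sample3.fastq.gz\tA1A\tA1"
)

# Write the table to a temporary file so the example is self-contained
targets_file <- tempfile(fileext = ".txt")
writeLines(targets_lines, targets_file)

# Read the tab-delimited table, skipping the comment line
targets <- read.delim(targets_file, comment.char = "#",
                      stringsAsFactors = FALSE)

# Map each sample label to its FASTQ file path
fastq_by_sample <- setNames(targets$FileName, targets$SampleName)

# Count how many samples belong to each experimental factor level
replicates <- table(targets$Factor)

print(targets)
```

In the package itself, the function index lists systemArgs as the constructor that builds a SYSargs object from a param file and a targets file; the sketch above only mimics the tabular sample annotation that such a call would consume.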

Version: 1.6.0
License: Artistic-2.0
Maintainer: Thomas Girke
Last Published: February 15th, 2017

Functions in systemPipeR (1.6.0)

filterVars: Filter VCF files
filterDEGs: Filter and plot DEG results
featureCoverage: Genome read coverage by transcript models
clusterRun: Submit command-line tools to cluster
countRangeset: Read counting for several range sets
alignStats: Alignment statistics
catDB-class: Class "catDB"
featuretypeCounts: Plot read distribution across genomic features
genFeatures: Generate feature ranges from TxDb
catmap: catDB accessor methods
getQsubargs: Arguments for qsub
GOHyperGAll: GO term enrichment analysis for large numbers of gene sets
moduleload: Interface to module system
olBarplot: Bar plot for intersect sets
plotfeaturetypeCounts: Plot read distribution across genomic features
predORF: Predict ORFs
plotfeatureCoverage: Plot feature coverage results
overLapper: Set Intersect and Venn Diagram Functions
INTERSECTset-class: Class "INTERSECTset"
mergeBamByFactor: Merge BAM files based on factor
runCommandline: Execute SYSargs
runDiff: Differential abundance analysis for many range sets
writeTargetsRef: Generate targets file with reference
qsubRun: Submit command-line tools to cluster
preprocessReads: Run custom read preprocessing functions
run_DESeq2: Runs DESeq2
run_edgeR: Runs edgeR
sysargs: SYSargs accessor methods
systemArgs: Constructs SYSargs object from param and targets files
readComp: Import sample comparisons from targets file
returnRPKM: RPKM Normalization
VENNset-class: Class "VENNset"
writeTargetsout: Write updated targets out to file
scaleRanges: Scale spliced ranges to genome coordinates
seeFastq: Quality reports for FASTQ files
vennPlot: Plot 2-5 way Venn diagrams
variantReport: Generate Variant Report
symLink2bam: Symbolic links for IGV
SYSargs-class: Class "SYSargs"