The SafeQuant package provides methods for the analysis of quantitative (LFQ, TMT, HRM) proteomics data.

Installation

1) Install Dependencies

A) Install CRAN library dependencies (open R)

R> install.packages(c("seqinr","gplots","corrplot","optparse","data.table","epiR"))

B) Install BioConductor library dependencies (open R)

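R> source("http://bioconductor.org/biocLite.R")
R> biocLite(c("limma","affy"))
R> # On R >= 3.5 biocLite() is no longer available; use BiocManager instead:
R> # install.packages("BiocManager"); BiocManager::install(c("limma","affy"))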
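
Before moving on, it can help to confirm that every dependency from steps A and B is actually installed. The following is a minimal sketch in base R; the helper name `missing_packages` is an assumption made for illustration and is not part of SafeQuant.

```r
# Hypothetical helper (not part of SafeQuant): return the packages from
# `pkgs` that are not found in any library on this machine.
missing_packages <- function(pkgs) {
  installed <- rownames(utils::installed.packages())
  setdiff(pkgs, installed)
}

cran_deps <- c("seqinr", "gplots", "corrplot", "optparse", "data.table", "epiR")
bioc_deps <- c("limma", "affy")

# An empty character vector means all dependencies are in place.
print(missing_packages(c(cran_deps, bioc_deps)))
```

Any package names printed by the last line still need to be installed with the commands above.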

2) Install SafeQuant from sources

Option 1, install "master branch" using "devtools"

Make sure you have a working development environment.

Windows: Install Rtools.

Mac: Install Xcode from the Mac App Store.

Linux: Install a compiler and various development libraries (details vary across different flavors of Linux).

R> install.packages("devtools")
R> library(devtools)
R> install_github("eahrne/SafeQuant")

Option 2, install latest CRAN version

R> install.packages("SafeQuant")

3) To run safeQuant.R (Post-process Progenesis LFQ datasets or Scaffold TMT datasets)

A) Locate the file safeQuant.R (e.g. C:\Users\ahrnee-adm\Downloads\SafeQuant\exec\safeQuant.R). This is the SafeQuant main script. Copy it to an appropriate directory, e.g. c:\Program Files\SafeQuant\

B) Open a terminal

To display help options

> Rscript "c:\Program Files\SafeQuant\safeQuant.R" -h

To run (with minimal arguments)

> Rscript "c:\Program Files\SafeQuant\safeQuant.R" -i "c:\Program Files\SafeQuant\testData\peptide_measurement.csv" -o "c:\Program Files\SafeQuant\out"
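
The same call can also be launched from within an R session. The sketch below builds the argument vector for `system2()` using only the `-i` and `-o` options shown above; the helper name `safequant_args` is hypothetical, and the paths are the example paths from this section, assumed to exist on your machine.

```r
# Hypothetical helper (not part of SafeQuant): build the argument vector
# for Rscript from the script path, input file and output path.
# shQuote() protects the spaces in "Program Files".
safequant_args <- function(script, input, output) {
  c(shQuote(script), "-i", shQuote(input), "-o", shQuote(output))
}

# Forward slashes are accepted by R on Windows and avoid escaping backslashes.
args <- safequant_args(
  script = "c:/Program Files/SafeQuant/safeQuant.R",
  input  = "c:/Program Files/SafeQuant/testData/peptide_measurement.csv",
  output = "c:/Program Files/SafeQuant/out"
)

# Inspect the command before running it.
cat("Rscript", args, "\n")

# Uncomment to launch the analysis (assumes the paths above exist):
# status <- system2("Rscript", args)
```

Printing the command first makes it easy to spot quoting problems before a long analysis is started.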

Tips

I) If using Progenesis QI we advise running SafeQuant on "Peptide Measurement" exports.

  • File -> Export Peptide Measurements. This option is available once you have reached the "Resolve Conflicts" Step in Progenesis QI
  • When choosing properties to be included in the exported file check the "Grouped accessions (for this sequence)" check box.

II) When working with Progenesis "Feature Exports" it is advisable to discard all features (rows) not annotated with a peptide, to speed up the SafeQuant analysis. This can be done using the "filterLargeProgenesisPeptideFile.pl" Perl script (C:\Users\ahrnee-adm\Downloads\SafeQuant\exec\filterLargeProgenesisPeptideFile.pl).

A) Install Perl (or ActivePerl for Windows, http://www.activestate.com/activeperl)

B) Open a terminal

> perl "C:\Program Files\SafeQuant\filterLargeProgenesisPeptideFile.pl" "C:\Program Files\SafeQuant\testData\features.csv"

This creates a new version of the feature file with "_FILTERED" appended to its name, e.g. features.csv -> features_FILTERED.csv
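
The effect of this filtering step can be illustrated in R on a toy table. This is only a sketch of the idea, not a replacement for the Perl script: the column names (`FeatureId`, `Sequence`, `Intensity`) are hypothetical and do not reflect the actual layout of a Progenesis feature export.

```r
# Toy feature table; the column names are hypothetical and do not
# reflect the actual layout of a Progenesis feature export.
features <- data.frame(
  FeatureId = 1:5,
  Sequence  = c("PEPTIDEK", "", NA, "ELVISLIVESK", ""),
  Intensity = c(1200, 300, 450, 9800, 75),
  stringsAsFactors = FALSE
)

# Keep only the rows that carry a peptide annotation.
annotated <- !is.na(features$Sequence) & nzchar(features$Sequence)
filtered  <- features[annotated, ]

# Derive the output name following the naming convention described above:
# features.csv -> features_FILTERED.csv
filtered_name <- function(path) sub("\\.csv$", "_FILTERED.csv", path)

print(filtered)
print(filtered_name("features.csv"))
```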

Basic functionality of the safeQuant.R script

  1. Data Normalization
    • LFQ
      • Global data normalization by equalizing the total MS1 peak areas across all LC/MS runs.
    • Isobaric Labeling experiments (TMT or iTRAQ)
      • Global data normalization by equalizing the total reporter ion intensities across all reporter ion channels.
  2. Ratio Calculation
    • LFQ
      • Summation of MS1 peak areas per peptide/protein and LC-MS/MS run, followed by calculation of peptide/protein abundance ratios.
    • Isobaric Labeling experiments (TMT or iTRAQ)
      • Summation of reporter ion intensities per peptide/protein and LC-MS/MS run, followed by calculation of peptide/protein abundance ratios.
  3. Statistical testing for differential abundances
    • The summarized peptide/protein expression values are used to test for peptides/proteins that are differentially abundant between conditions. Empirical Bayes moderated t-tests are applied, as implemented in the R/Bioconductor limma package (Smyth, 2004). The resulting p-values, one per protein and condition comparison, are then adjusted for multiple testing using the Benjamini-Hochberg method.
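
Steps 1 and 2 can be sketched in base R on a toy LFQ-style intensity matrix. This is an illustration under stated assumptions, not the SafeQuant implementation: all values are made up, the first run is used as the normalization reference, and replicates are averaged with the mean before the ratio is taken.

```r
# Toy peptide-level intensities: 4 peptides x 4 runs
# (runs 1-2 = control, runs 3-4 = case). All values are made up.
intensities <- matrix(
  c(100, 110, 220, 200,
    200, 190, 180, 210,
     50,  60, 150, 140,
    400, 420, 380, 400),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("pep", 1:4), paste0("run", 1:4))
)
protein   <- c("protA", "protA", "protB", "protB")  # peptide -> protein map
condition <- c("control", "control", "case", "case")

# 1. Global normalization: scale every run so that its total signal
#    equals the total signal of the first run.
norm_factors <- colSums(intensities) / colSums(intensities)[1]
normalized   <- sweep(intensities, 2, norm_factors, "/")

# 2. Roll-up: sum peptide signals per protein and run.
protein_signal <- rowsum(normalized, group = protein)

# Ratio per protein: mean case signal over mean control signal (log2).
# Averaging replicates with the mean is an assumption of this sketch.
case_mean    <- rowMeans(protein_signal[, condition == "case", drop = FALSE])
control_mean <- rowMeans(protein_signal[, condition == "control", drop = FALSE])
log2_ratio   <- log2(case_mean / control_mean)

print(round(colSums(normalized), 6))
print(log2_ratio)
```

After normalization every run has the same total signal, so the ratios reflect relative changes between conditions rather than differences in the amount of sample loaded.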

Smyth, G. K. (2004). Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology, 3, Article 3. http://www.ncbi.nlm.nih.gov/pubmed/16646809
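
The statistical testing step can be sketched directly with limma. The example assumes log2-transformed protein abundances and a simple two-condition design; the data are simulated and the object names are illustrative, so this shows the general limma workflow rather than the exact code path used inside safeQuant.R.

```r
library(limma)

set.seed(1)
# Simulated log2 protein abundances: 100 proteins x 6 runs
# (3 control, 3 case). The first 10 proteins are shifted upward in "case".
expr <- matrix(rnorm(600, mean = 20, sd = 0.5), nrow = 100,
               dimnames = list(paste0("prot", 1:100), paste0("run", 1:6)))
expr[1:10, 4:6] <- expr[1:10, 4:6] + 2

condition <- factor(rep(c("control", "case"), each = 3),
                    levels = c("control", "case"))
design <- model.matrix(~ condition)

# Linear model per protein, then empirical Bayes moderation of the variances.
fit <- eBayes(lmFit(expr, design))

# Moderated t-test p-values for the case-vs-control coefficient,
# adjusted with the Benjamini-Hochberg method.
p_values <- fit$p.value[, "conditioncase"]
q_values <- p.adjust(p_values, method = "BH")

print(head(sort(q_values)))
```

A similar table, including log fold changes, can be obtained in one call with `topTable(fit, coef = "conditioncase", adjust.method = "BH", number = Inf)`.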

Use Case Manual

https://raw.githubusercontent.com/eahrne/SafeQuant/master/inst/manuals/SafeQuant_UseCases.txt

.tsv export help

https://github.com/eahrne/SafeQuant/blob/master/inst/manuals/tsv_spreadsheet_help.pdf

Package Documentation

https://github.com/eahrne/SafeQuant/blob/master/inst/manuals/SafeQuant-man.pdf

Publications

  • Ahrne, E. et al. (2016). Evaluation and Improvement of Quantification Accuracy in Isobaric Mass Tag-Based Protein Quantification Experiments. Journal of Proteome Research, 15, 2537–2547. https://www.ncbi.nlm.nih.gov/pubmed/27345528
  • Ahrne, E., Molzahn, L., Glatter, T., & Schmidt, A. (2013). Critical assessment of proteome-wide label-free absolute abundance estimation strategies. Proteomics. https://www.ncbi.nlm.nih.gov/pubmed/23794183
  • Glatter, T., Ludwig, C., Ahrne, E., Aebersold, R., Heck, A. J. R., & Schmidt, A. (2012). Large-scale quantitative assessment of different in-solution protein digestion protocols reveals superior cleavage efficiency of tandem Lys-C/trypsin proteolysis over trypsin digestion. Journal of Proteome Research. https://www.ncbi.nlm.nih.gov/pubmed/23017020

questions?

Version: 2.3
License: GPL-3
Maintainer: Erik Ahrne
Last Published: October 4th, 2016

Functions in SafeQuant (2.3)

plotAbsEstCalibrationCurve

Plot absolute estimation calibration curve
plotAdjustedVsNonAdjustedRatio

Plot adjusted TMT ratios vs. original ratios
plotVolcano

Plots volcano, data points colored by max cv of the 2 compared conditions
plotXYDensity

Scatter plot with density coloring
safeQuantAnalysis

safeQuant S3 class
setNbPeptidesPerProtein

Set nbPeptides column of featureData
createExpDesign

Create Experimental Design
COLORS

color vector
pairsAnnot

Pairs plot: lower triangle shows Pearson's R^2, diagonal shows text, upper triangle shows all-against-all scatter plots with an lm abline
getExpDesignProgenesisCsv

Parse Experimental Design from Progenesis Csv Export
getGlobalNormFactors

Get normalization factors, calculated as the summed/median signal per run (column) over the summed/median signal of the first run.
getTopX

Calculate Mean of X most intense features
getUserOptions

Read User Specified Command Line Options
parseMaxQuantProteinGroupTxt

Parse MaxQuant Protein Group Txt
addScaffoldPTMFAnnotations

Add Scaffold PTM annotations to TMT experiment
addIdQvalues

Add identification-level q-values to ExpressionSet (calculated based on target-decoy score distribution)
export

Export content of safeQuantAnalysis object
getAAProteinCoordinates

Get amino acid coordinates on protein
getMeanCenteredRange

Get modification coordinates on protein
getModifProteinCoordinates

Get modification coordinates on protein
getScoreCutOff

Get score cut-off for a given FDR cut-off
getSignalPerCondition

Summarize replicate signal per condition (min)
isStrippedACs

Check if ACs are in "non-stripped" UniProt format, e.g. "sp|Q8CHJ2|AQP12_MOUSE"
maPlotSQ

MA plot
plotExpDesign

Display experimental design, highlighting the control condition
plotIdScoreVsFDR

Plot FDR vs. identification score
plotPrecMassErrorVsScore

Plot precursor mass error vs. score, highlighting decoys and displaying the user-specified precursor mass filter
plotQValueVsPValue

Plot qValue vs pValue
purityCorrectTMT

Correct channel intensities based on Reporter ion Isotopic Distributions
removeOutliers

Set value to NA if it deviates by more than 1.5 * IQR from the lower/upper quantile
expDesignTagToExpDesign

Create experimental design data.frame from user input string
cvBoxplot

C.V. boxplot
getNbMisCleavages

Get number of mis-cleavages per peptide
getNbPeptidesPerProtein

Get number of peptides per protein
getNbSpectraPerProtein

Get number of spectra per protein
getPeptides

Digest protein
isCon

Check if protein is a contaminant entry
parseScaffoldRawFile

Parse scaffold output .xls file (RAW export)
perFeatureNormalization

Per Feature Normalization
isDecoy

Check if protein is a decoy entry
standardise

Standardise data
stripACs

Strip UniProt format, e.g. "sp|Q8CHJ2|AQP12_MOUSE" -> Q8CHJ2
getBaselineIntensity

Get signal at zscore x (x standard deviations below mean)
getCV

Calculate Coefficient of Variation per feature (Relative Standard Deviation)
getMotifX

Create motif-x peptide annotation
getNbDetectablePeptides

Get number of peptides passing defined length criteria
getRatios

Calculate ratios, comparing all case to control
getRTNormFactors

Get retention time based normalization factors
parseProgenesisProteinCsv

Parse Progenesis Protein Csv
parseScaffoldPTMReport

Parse scaffold PTM Spectrum Report
plotNbValidDeFeaturesPerFDR

Plot Total Number of Differentially Abundant Features (applying ratio cutoff) vs. qValue/pValue for all conditions
plotPrecMassErrorDistrib

Plot Precursor Mass Error Distribution
rollUp

Roll up feature intensities per unique column combination
rtNormalize

Normalize data per retention time bin
barplotMSSignal

Barplot of ms-signal per column
calibrationCurve

S3 class object describing a calibration curve and storing some figures of merit
getIBAQEset

Calculate intensity-based absolute-protein-quantification (iBAQ) metric per protein
getIdLevelQvals

Calculates identification level q-values based on target-decoy score distributions
getImpuritiesMatrix

Get Thermo TMT impurity matrix
createExpressionDataset

Create ExpressionSet object
getIntSumPerProtein

Sum up raw intensities per protein and channel. Keep track of number of summed spectra and unique peptides
globalNormalize

Normalize; normalization factors are calculated as the median signal per run (column) over the median of the first run.
hClustHeatMap

Hierarchical clustering heat map, cluster by runs intensity, features by ratio and display log2 ratios to control median
missinValueBarplot

Plot Percentage of Features with missing values
option_list

Command Line Option List
plotRTNorm

Plot all retention time profiles overlaying ratios
plotROC

Plot Number of Identifications vs. FDR
setNbSpectraPerProtein

Set nbPeptides column of featureData
sqNormalize

Normalize
createPairedExpDesign

Create Paired Expdesign
getAllCV

Calculate Coefficient of Variation per feature (Relative Standard Deviation) per Condition
getAllEBayes

Perform statistical test (moderated t-test), comparing all case to control
getLoocvFoldError

Leave-One-Out Cross Validate Quantification Model
getMaxIndex

Get index of max in vector of numeric values
parseProgenesisPeptideMeasurementCsv

Parse Progenesis Peptide Measurement Csv Export
parseProgenesisFeatureCsv

Parse Progenesis Feature Csv Export
plotMSSignalDistributions

Plot MS signal distributions
plotNbIdentificationsVsRT

Plot the number of identified Features per Retention Time minute.
plotRTNormSummary

Plot all retention time normalization profiles
plotScoreDistrib

Plot identifications target decoy distribution
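
Several entries above describe simple computations. The sketch below re-creates three of them (accession stripping, coefficient of variation, and the 1.5 * IQR outlier rule) in base R for illustration only. These are not the SafeQuant implementations: the helper names ending in `_sketch` are hypothetical, and the package functions may take different arguments and return different objects.

```r
# Strip a UniProt-style accession: "sp|Q8CHJ2|AQP12_MOUSE" -> "Q8CHJ2".
# Accessions without the "db|AC|name" pattern are returned unchanged.
strip_ac_sketch <- function(acs) sub("^[^|]*\\|([^|]+)\\|.*$", "\\1", acs)

# Coefficient of variation (relative standard deviation) per row.
cv_sketch <- function(x) {
  apply(x, 1, sd, na.rm = TRUE) / rowMeans(x, na.rm = TRUE)
}

# Set values to NA when they lie more than 1.5 * IQR below the lower
# or above the upper quartile.
remove_outliers_sketch <- function(x) {
  q     <- quantile(x, c(0.25, 0.75), na.rm = TRUE)
  fence <- 1.5 * (q[2] - q[1])
  x[x < q[1] - fence | x > q[2] + fence] <- NA
  x
}

print(strip_ac_sketch("sp|Q8CHJ2|AQP12_MOUSE"))
print(cv_sketch(rbind(c(10, 10, 10), c(8, 10, 12))))
print(remove_outliers_sketch(c(1, 2, 3, 4, 100)))
```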