The SafeQuant package includes methods for the analysis of quantitative (LFQ, TMT, HRM) proteomics data.

Installation

1) Install Dependencies

A) Install CRAN library dependencies (open R)

R> install.packages(c("seqinr","gplots","corrplot","optparse","data.table","epiR"))

B) Install BioConductor library dependencies (open R)

R> if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
R> BiocManager::install(c("limma","affy"))
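Before moving on, it can help to confirm that every dependency listed above can actually be loaded. The following is a minimal sketch; the helper name `missing_packages` is hypothetical and not part of SafeQuant.

```r
# Packages named in the installation steps above.
cran_deps <- c("seqinr", "gplots", "corrplot", "optparse", "data.table", "epiR")
bioc_deps <- c("limma", "affy")

# Hypothetical helper: return the packages that cannot be loaded.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Empty character vectors mean that everything is in place.
print(missing_packages(cran_deps))
print(missing_packages(bioc_deps))
```

Any package names printed here still need to be installed before SafeQuant itself.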

2) Install SafeQuant from sources

Option 1, install "master branch" using "devtools"

Make sure you have a working development environment.

Windows: Install Rtools.

Mac: Install Xcode from the Mac App Store.

Linux: Install a compiler and various development libraries (details vary across different flavors of Linux).

R> install.packages("devtools")
R> library(devtools)
R> install_github("eahrne/SafeQuant")

Option 2, install latest CRAN version

R> install.packages("SafeQuant")

3) Run safeQuant.R (post-process Progenesis LFQ datasets or Scaffold TMT datasets)

A) Locate the file safeQuant.R (e.g. C:\Users\ahrnee-adm\Downloads\SafeQuant\exec\safeQuant.R). This is the SafeQuant main script. Copy it to an appropriate directory, e.g. C:\Program Files\SafeQuant\

B) Open a terminal

To display help options

> Rscript "c:\Program Files\SafeQuant\safeQuant.R" -h

To run (with minimal arguments)

> Rscript "c:\Program Files\SafeQuant\safeQuant.R" -i "c:\Program Files\SafeQuant\testData\peptide_measurement.csv" -o "c:\Program Files\SafeQuant\out"
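The same call can also be assembled from within R, which makes it easier to quote paths that contain spaces. This is only a sketch: the paths are the example locations used above, and only the -i and -o options shown in this document are used.

```r
# Example paths from the text above; adjust to where safeQuant.R and the data live.
script <- file.path("C:", "Program Files", "SafeQuant", "safeQuant.R")
input  <- file.path("C:", "Program Files", "SafeQuant", "testData", "peptide_measurement.csv")
outdir <- file.path("C:", "Program Files", "SafeQuant", "out")

# Quote every path so that the space in "Program Files" survives the shell.
args <- c(shQuote(script), "-i", shQuote(input), "-o", shQuote(outdir))

# Show the command line that would be executed.
cat("Rscript", args, "\n")

# Uncomment to launch the script from within R:
# status <- system2("Rscript", args)
```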

Tips

I) If using Progenesis QI we advise running SafeQuant on "Peptide Measurement" Exports.

  • File -> Export Peptide Measurements. This option is available once you have reached the "Resolve Conflicts" Step in Progenesis QI
  • When choosing properties to be included in the exported file check the "Grouped accessions (for this sequence)" check box.

II) When working with Progenesis "Feature Exports" it is advisable to discard all features (rows) not annotated with a peptide, to speed up SafeQuant analysis. This can be done using the "filterLargeProgenesisPeptideFile.pl" perl script. (C:\Users\ahrnee-adm\Downloads\SafeQuant\exec\filterLargeProgenesisPeptideFile.pl)

A) Install Perl (or ActivePerl for Windows, http://www.activestate.com/activeperl)

B) Open a terminal

> perl "C:\Program Files\SafeQuant\filterLargeProgenesisPeptideFile.pl" "C:\Program Files\SafeQuant\testData\features.csv"

This will create a new version of the feature file with the suffix "_FILTERED" appended to its name: features.csv -> features_FILTERED.csv
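The idea behind the filter can be sketched in R on a toy table. This is not the Perl script itself: the column name "Sequence" and the flat single-row header are assumptions made for illustration, and real Progenesis exports have a more complex header layout.

```r
# Toy feature table; the "Sequence" column name is an assumption for illustration.
features <- data.frame(
  FeatureId = 1:4,
  Sequence  = c("PEPTIDEK", "", NA, "ANOTHERPEPR"),
  Intensity = c(1200, 300, 450, 980),
  stringsAsFactors = FALSE
)

# Keep only the rows annotated with a peptide sequence.
annotated <- !is.na(features$Sequence) & nzchar(features$Sequence)
filtered  <- features[annotated, ]

# Derive the output name as described above: features.csv -> features_FILTERED.csv
filtered_name <- function(path) sub("\\.csv$", "_FILTERED.csv", path)

# Write the reduced table to a temporary directory.
out_file <- file.path(tempdir(), filtered_name("features.csv"))
write.csv(filtered, out_file, row.names = FALSE)
```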

Basic functionality of the safeQuant.R script

  1. Data Normalization
    • LFQ
      • Global data normalization by equalizing the total MS1 peak areas across all LC/MS runs.
    • Isobaric Labeling experiments (TMT or iTRAQ)
      • Global data normalization by equalizing the total reporter ion intensities across all reporter ion channels.
  2. Ratio Calculation
    • LFQ
      • Summation of MS1 peak areas per peptide/protein and LC-MS/MS run, followed by calculation of peptide/protein abundance ratios.
    • Isobaric Labeling experiments (TMT or iTRAQ)
      • Summation of reporter ion intensities per peptide/protein and LC-MS/MS run, followed by calculation of peptide/protein abundance ratios.
  3. Statistical testing for differential abundances
    • The summarized peptide/protein expression values are used to test for peptides/proteins that are differentially abundant between conditions. Here, empirical Bayes moderated t-tests are applied, as implemented in the R/Bioconductor limma package (Smyth, 2004). The resulting p-values, one per protein and condition comparison, are subsequently adjusted for multiple testing using the Benjamini-Hochberg method.
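Steps 1 and 2 above can be sketched in base R on a toy intensity matrix. This illustrates the idea and is not the SafeQuant implementation: the data are made up, using the first run as the normalization reference is an assumption, and so is summarizing replicates by their median.

```r
# Toy peptide-level intensities: rows are peptides, columns are LC-MS runs.
# Protein assignments, run names and values are made up for illustration.
intensities <- matrix(
  c(100, 200, 300, 400,
    110, 190, 330, 420,
    220, 380, 580, 820,
    180, 420, 620, 780),
  nrow = 4,
  dimnames = list(paste0("pep", 1:4), c("ctrl_1", "ctrl_2", "case_1", "case_2"))
)
protein   <- c("P1", "P1", "P2", "P2")
condition <- c("ctrl", "ctrl", "case", "case")

# 1. Global normalization: scale each run so that all column totals equal
#    the total of the first run (reference run chosen by assumption).
norm_factors <- sum(intensities[, 1]) / colSums(intensities)
normalized   <- sweep(intensities, 2, norm_factors, `*`)

# 2a. Roll up: sum the normalized peptide intensities per protein and run.
protein_signal <- rowsum(normalized, group = protein)

# 2b. Ratios: median signal per condition, then log2(case / control).
median_per_condition <- sapply(unique(condition), function(cond) {
  apply(protein_signal[, condition == cond, drop = FALSE], 1, median)
})
log2_ratio <- log2(median_per_condition[, "case"] / median_per_condition[, "ctrl"])
print(round(log2_ratio, 3))
```

After normalization every run has the same total signal, so the remaining differences between conditions reflect relative changes rather than differences in the amount of sample loaded.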

Smyth, G. K. (2004). Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology, 3, Article 3. http://www.ncbi.nlm.nih.gov/pubmed/16646809
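The statistical testing step can be illustrated directly with limma on simulated data. This is a minimal sketch under stated assumptions (simulated log2 abundances, one control and one case condition with three replicates each) and does not reproduce how safeQuant.R calls limma internally.

```r
library(limma)

set.seed(1)
# Simulated log2 protein abundances: 50 proteins, 3 control and 3 case runs.
expr <- matrix(rnorm(50 * 6, mean = 20, sd = 0.3), nrow = 50,
               dimnames = list(paste0("prot", 1:50),
                               c(paste0("ctrl_", 1:3), paste0("case_", 1:3))))
expr[1:5, 4:6] <- expr[1:5, 4:6] + 2   # spike in five true changes

# Design matrix: the second coefficient is the case vs. ctrl difference.
condition <- factor(rep(c("ctrl", "case"), each = 3), levels = c("ctrl", "case"))
design <- model.matrix(~ condition)

# Linear model per protein, followed by empirical Bayes moderation.
fit <- eBayes(lmFit(expr, design))

# Benjamini-Hochberg adjusted p-values for the case vs. ctrl comparison.
results <- topTable(fit, coef = 2, number = Inf, adjust.method = "BH", sort.by = "none")
head(results[order(results$adj.P.Val), c("logFC", "P.Value", "adj.P.Val")])
```

The moderated t-test borrows variance information across all proteins, which stabilizes the test when each condition has only a few replicates.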

Use Case Manual

https://raw.githubusercontent.com/eahrne/SafeQuant/master/inst/manuals/SafeQuant_UseCases.txt

.tsv export help

https://github.com/eahrne/SafeQuant/blob/master/inst/manuals/tsv_spreadsheet_help.pdf

Package Documentation

https://github.com/eahrne/SafeQuant/blob/master/inst/manuals/SafeQuant-man.pdf

Publications

  • Ahrne, E. et al. Evaluation and Improvement of Quantification Accuracy in Isobaric Mass Tag-Based Protein Quantification Experiments. J Proteome Res 15, 2537–2547 (2016). https://www.ncbi.nlm.nih.gov/pubmed/27345528
  • Ahrne, E., Molzahn, L., Glatter, T., & Schmidt, A. (2013). Critical assessment of proteome-wide label-free absolute abundance estimation strategies. Proteomics. https://www.ncbi.nlm.nih.gov/pubmed/23794183
  • Glatter, T., Ludwig, C., Ahrne, E., Aebersold, R., Heck, A. J. R., & Schmidt, A. (2012). Large-scale quantitative assessment of different in-solution protein digestion protocols reveals superior cleavage efficiency of tandem Lys-C/trypsin proteolysis over trypsin digestion. Journal of Proteome Research. https://www.ncbi.nlm.nih.gov/pubmed/23017020

questions?

Version: 2.3.1

License: GPL-3

Maintainer: Erik Ahrne

Last Published: December 6th, 2016

Functions in SafeQuant (2.3.1)

getCV

Calculate Coefficient of Variation per feature (Relative Standard Deviation)
parseMaxQuantProteinGroupTxt

Parse MaxQuant Protein Group Txt
parseProgenesisFeatureCsv

Parse Progenesis Feature Csv Export
plotAdjustedVsNonAdjustedRatio

Plot adjusted tmt ratios vs original ratios
plotExpDesign

Display experimental design, high-lighting the control condition
plotPrecMassErrorDistrib

Plot Precursor Mass Error Distribution
plotPrecMassErrorVsScore

Plot precursor mass error vs. score, highlighting decoys and displaying the user-specified precursor mass filter
stripACs

strip uniprot format e.g. "sp|Q8CHJ2|AQP12_MOUSE" -> Q8CHJ2
getExpDesignProgenesisCsv

Parse Experimental Design from Progenesis Csv Export
getIntSumPerProtein

Sum up raw intensities per protein and channel. keep track of number of summed spectra and unique peptides
hClustHeatMap

Hierarchical clustering heat map, cluster by runs intensity, features by ratio and display log2 ratios to control median
getLoocvFoldError

Leave-One-Out Cross Validate Quantification Model
isCon

Check if protein is a contaminant entry
expDesignTagToExpDesign

Create experimental design data.frame from user input string
export

Export content of safeQuantAnalysis object
getBaselineIntensity

Get signal at zscore x (x standard deviations below mean)
getAllEBayes

Perform statistical test (moderated t-test), comparing all cases to control
getUserOptions

Read User Specified Command Line Options
globalNormalize

Normalize, Norm factors calculated as median signal per run (column) over median of first run.
option_list

Command Line Option List
pairsAnnot

Plot lower triangle Pearsons R^2. Diagonal text, upper triangle all against all scatter plots with lm abline
parseScaffoldRawFile

Parse scaffold output .xls file (RAW export)
parseScaffoldPTMReport

Parse scaffold PTM Spectrum Report
setNbSpectraPerProtein

Set nbPeptides column of featureData
setNbPeptidesPerProtein

Set nbPeptides column of featureData
createExpDesign

Create Experimental Design
createExpressionDataset

Create ExpressionSet object
getAAProteinCoordinates

Get amino acid coordinates on protein
getAllCV

Calculate Coefficient of Variation per feature (Relative Standard Deviation) per Condition
getMaxIndex

Get index of max in vector of numeric values
getMeanCenteredRange

Get modification coordinates on protein
parseProgenesisPeptideMeasurementCsv

Parse Progenesis Peptide Measurement Csv Export
parseProgenesisProteinCsv

Parse Progenesis Protein Csv
plotIdScoreVsFDR

Plot FDR vs. identification score
plotMSSignalDistributions

Plot ms.signal distributions
plotRTNorm

Plot all retention time profiles overlaying ratios
plotRTNormSummary

Plot all retention time normalization profiles
rtNormalize

Normalize data per retention time bin
safeQuantAnalysis

safeQuant S3 class
createPairedExpDesign

Create Paired Expdesign
addIdQvalues

Add identification level q-values to ExpressionSet (calculated based on target-decoy score distribution)
addScaffoldPTMFAnnotations

Add Scaffold PTM annotations to TMT experiment
getIBAQEset

Calculate intensity-based absolute-protein-quantification (iBAQ) metric per protein
getGlobalNormFactors

Get normalization factors. calculated as summed/median signal per run (column) over summed/median of first run.
getModifProteinCoordinates

Get modification coordinates on protein
getSignalPerCondition

Summarize replicate signal per condition (min)
getMotifX

Create motif-x peptide annotation
cvBoxplot

C.V. boxplot
getIdLevelQvals

Calculates identification level q-values based on target-decoy score distributions
getImpuritiesMatrix

Get Thermo TMT impurity matrix
getNbSpectraPerProtein

Get number of spectra per protein
getPeptides

Digest protein
getNbPeptidesPerProtein

Get number of peptides per protein
maPlotSQ

ma-plot
getRatios

Calculate ratios, comparing all cases to control
missinValueBarplot

Plot Percentage of Features with missing values
plotQValueVsPValue

Plot qValue vs pValue
plotROC

Plot Number of Identifications vs. FDR
getTopX

Calculate Mean of X most intense features
isDecoy

Check if protein is a decoy entry
isStrippedACs

Check if ACs are in "non-stripped" uniprot format e.g. "sp|Q8CHJ2|AQP12_MOUSE"
perFeatureNormalization

Per Feature Normalization
plotAbsEstCalibrationCurve

Plot absolute estimation calibration curve
plotScoreDistrib

Plot identifications target decoy distribution
plotVolcano

Plots volcano, data points colored by max cv of the 2 compared conditions
sqNormalize

Normalize
standardise

Standardise data
purityCorrectTMT

Correct channel intensities based on Reporter ion Isotopic Distributions
plotXYDensity

Scatter plot with density coloring
plotNbIdentificationsVsRT

Plot the number of identified Features per Retention Time minute.
plotNbValidDeFeaturesPerFDR

Plot Total Number of differentially Abundant Features (applying ratio cutoff) vs. qValue/pValue for all conditions
getRTNormFactors

Get retention time based normalization factors
getScoreCutOff

Get score cutoff for a given fdr cut-off
barplotMSSignal

Barplot of ms-signal per column
COLORS

color vector
removeOutliers

Set value to NA if it deviates by more than 1.5 * IQR from lower/upper quantile
rollUp

Roll up feature intensities per unique column combination
getNbDetectablePeptides

Get number peptides passing defined length criteria
getNbMisCleavages

Get number of mis-cleavages per peptide
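
Several of the one-line descriptions in the function list above correspond to simple calculations. The base R sketch below illustrates three of them (the coefficient of variation per feature, the 1.5 * IQR outlier rule, and stripping UniProt-style accessions). The function names `cv_per_feature`, `mask_outliers` and `strip_accession` are hypothetical stand-ins written for this example. They are not the package functions and may differ from them in detail.

```r
# Coefficient of variation (relative standard deviation) per feature (row).
cv_per_feature <- function(x) apply(x, 1, sd, na.rm = TRUE) / rowMeans(x, na.rm = TRUE)

# Set values further than 1.5 * IQR from the lower/upper quartile to NA.
mask_outliers <- function(v) {
  q   <- quantile(v, c(0.25, 0.75), na.rm = TRUE)
  iqr <- q[2] - q[1]
  v[v < q[1] - 1.5 * iqr | v > q[2] + 1.5 * iqr] <- NA
  v
}

# Extract the accession from UniProt-style identifiers such as
# "sp|Q8CHJ2|AQP12_MOUSE"; identifiers in other formats are returned unchanged.
strip_accession <- function(ids) sub("^[a-z]{2}\\|([^|]+)\\|.*$", "\\1", ids)

# Toy inputs, made up for illustration.
x <- matrix(c(10, 12, 11, 100, 90, 110), nrow = 2, byrow = TRUE)
print(cv_per_feature(x))
print(mask_outliers(c(1, 2, 3, 4, 100)))
print(strip_accession("sp|Q8CHJ2|AQP12_MOUSE"))
```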