PTXQC

This package allows users of MaxQuant to generate quality control reports in HTML/PDF format.

Latest changes / Change log

Latest:

  • v0.82.06 - Jun 2017: Special contaminants fix (issue #34)
  • v0.82.05 - May 2017: More robust installation of GitHub package
  • v0.82.04 - Apr 2017: Fix crash with MaxQuant 1.2 input (issue #33)

See NEWS file for a version history.

Platform support

  • Windows (recommended, because the provided drag'n'drop batch file makes usage convenient)
  • Linux
  • Mac OS X

Features

  • plethora of quality metrics
    • intensity distributions
    • digestion efficiency
    • contaminant visualizations
    • identification performance
    • Match-between-runs performance
  • easy usage ([Windows OS only] drag'n'drop your txt output folder onto a batch file)
  • HTML/PDF report will be generated within your MaxQuant-txt folder
  • optional configuration file in YAML format for generation of shorter/customized reports

Target audience

  • MaxQuant users (no knowledge of R required)
  • bioinformaticians (who want to contribute or customize)

Documentation

Besides this documentation on GitHub, the package vignettes of PTXQC will give you valuable information. After the package is installed (see below), you can browse the vignettes using either of these commands within R:

help(package="PTXQC")
browseVignettes(package = 'PTXQC')

If you do not want to wait until the package is installed, have a look at the 'vignettes' subfolder. The top part of each file contains a small table with technical gibberish, but the rest is identical to the vignettes you would see in R.

You will find documentation on

  • Input and Output
  • Report customization
  • (for MaxQuant users) Usage of Drag'n'drop
  • (for R users) code examples in R

Installation

If you want to generate QC reports without actually getting involved in R:

We offer a batch-file based drag'n'drop mechanism to trigger PTXQC on any MaxQuant output folder. This only works on Windows (not Linux or Mac OS X) at the moment -- but you have a Windows machine anyway to run MaxQuant, right?! See drag'n'drop for details. It takes 10 minutes and you are done!

If you just want the package to use (and maybe even modify) it:

First, install pandoc (see bottom of linked page). Pandoc is required in order to locally build the package vignettes (documentation), but you can also read the vignettes online from the PTXQC GitHub page. More importantly, Pandoc enables PTXQC to write QC reports in HTML format (which come with a help text for each plot and are interactive). PDF reports only contain plots! The reports are printed as PDF by default and additionally as HTML if Pandoc is found. If you install Pandoc later while your R session is already open, you need to close and re-open R in order to make R aware of Pandoc!
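Before installing PTXQC, it can help to check whether your current R session can see Pandoc at all. The following is a minimal sketch using base R only. It merely looks for a `pandoc` executable on the system PATH; how PTXQC itself detects Pandoc is not reproduced here and may differ (this is an assumption, and the variable names are made up for illustration).

```r
# Look up the 'pandoc' executable on the system PATH (base R only).
# Sys.which() returns an empty string if nothing is found.
pandoc_path <- unname(Sys.which("pandoc"))
pandoc_found <- nzchar(pandoc_path)

if (pandoc_found) {
  message("Pandoc found at: ", pandoc_path)
} else {
  message("Pandoc not found on the PATH. If you just installed it, restart R and try again.")
}
```

If this reports that Pandoc is missing right after you installed it, closing and re-opening R (as described above) is the first thing to try.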

You can grab PTXQC from either CRAN or GitHub. GitHub installation will give you the latest package; the CRAN version might be a little older, but is faster to install. Check the NEWS file for CRAN submissions and version.

For the code blocks below: Run each line separately in your R console, i.e. do not copy and paste the whole block. If an error occurs, this makes it easier to track down. See FAQ - Installation for how to resolve installation errors.

## CRAN
install.packages("PTXQC")

or

## GitHub
if (!require(devtools, quietly = TRUE)) install.packages("devtools")
library("devtools")             ## this might give a warning like 'WARNING: Rtools is required ...'. Ignore it.

## use build_vignettes = FALSE if you did not install pandoc or if you encounter errors when building vignettes (e.g. PRIDE ftp unavailable)!
install_github("cbielow/PTXQC", build_vignettes = TRUE, dependencies = TRUE)

To get started, see the help and/or vignettes:

help(package="PTXQC")
browseVignettes(package = 'PTXQC')
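
For R users, creating a report could look roughly like the sketch below. It relies on `createReport()`, which is listed among the package functions, and assumes that passing only the txt folder is enough to get a report with default settings. The folder path is hypothetical, and the guard makes the snippet do nothing if PTXQC is not installed or the folder does not exist. Please check the vignettes for the authoritative usage.

```r
# Hypothetical location of a MaxQuant txt output folder; replace with your own.
txt_folder <- "C:/MaxQuant/my_project/combined/txt"

if (requireNamespace("PTXQC", quietly = TRUE) && dir.exists(txt_folder)) {
  # Assumption: the txt folder is the first argument and all other
  # settings can be left at their defaults.
  PTXQC::createReport(txt_folder)
  report_attempted <- TRUE
} else {
  message("PTXQC is not installed or the txt folder does not exist; nothing was done.")
  report_attempted <- FALSE
}
```

The report should then show up inside the txt folder itself, as described in the Features section.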

Please feel free to report bugs (see below), or issue pull requests!

Report Examples

An overview chart at the beginning of the report will give you a first impression. Detailed plots can be found in the remainder of each report.

For example input data and full reports, see the 'inst/examples' subfolder.

Bug reporting / Feature requests

If you encounter a bug, e.g. error message, wrong figures, missing axis annotation or anything which looks suspicious, please use the GitHub issue tracker and file a report.

You should include

  • the stage at which you encounter the bug, e.g. during installation, report creation, or after report creation (i.e. a bug in the report itself).
  • the PDF/HTML report itself (if one was generated).
  • the version of PTXQC, e.g. see the report_XXX.pdf/html (where XXX will be the version) or see the DESCRIPTION file of the PTXQC package or call help(package="PTXQC") within R
  • the error message (very important!). Either copy it or provide a screenshot.
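
One possible way to collect the version information for a bug report from within R is sketched below. It uses only `packageVersion()` and `R.version.string`, both of which ship with R; the variable names are made up for illustration, and the snippet also works if PTXQC is not installed yet.

```r
# Collect version information to paste into a bug report.
ptxqc_version <- if (requireNamespace("PTXQC", quietly = TRUE)) {
  as.character(utils::packageVersion("PTXQC"))
} else {
  "PTXQC is not installed"
}
r_version <- R.version.string

cat("PTXQC version:", ptxqc_version, "\n")
cat("R version:    ", r_version, "\n")
```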

Please be as precise as possible when providing the bug report - just imagine what kind of information you would like to have in order to track down the issue. In certain situations, the whole txt-folder or a single MaxQuant file might be helpful to solve the problem.

If you want to see a new metric, or have ideas on how to improve the existing ones, just open an issue ticket and leave a description.

Citation

PTXQC is published at JPR:

Proteomics Quality Control: Quality Control Software for MaxQuant Results. Chris Bielow, Guido Mastrobuoni, and Stefan Kempa. J. Proteome Res., 2016, 15 (3), pp 777–787. DOI: 10.1021/acs.jproteome.5b00780

Use PTXQC v0.69.3 if you want the version which was used in the paper, i.e. use install_github(..., ref="v0.69.3") when following the Installation procedure.
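
Spelled out, installing the paper version might look like the sketch below. It reuses the repository name from the Installation section and the `ref` argument mentioned above. Skipping the vignette build is an assumption made here to keep the installation simple, and the `run_install` switch is only a safety guard so that nothing is downloaded by accident.

```r
# Tag of the PTXQC version used in the paper (from the text above).
paper_ref <- "v0.69.3"

# Safety switch (illustration only): set to TRUE to actually install.
# Installing requires network access.
run_install <- FALSE

if (run_install) {
  if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
  # Assumption: vignettes are skipped here; set build_vignettes = TRUE if you
  # have pandoc installed and want the documentation built locally.
  devtools::install_github("cbielow/PTXQC", ref = paper_ref,
                           build_vignettes = FALSE, dependencies = TRUE)
}
```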

The input data is available in the 'inst/examples' subfolder.

We recommend using the most recent PTXQC for the best user experience.

Package information

  • Monthly downloads: 485
  • Version: 0.82.6
  • License: GPL-2
  • Maintainer: Chris Bielow
  • Last published: June 9th, 2017

Functions in PTXQC (0.82.6)

  • CV: Coefficient of variation (CV)
  • LCS: Compute longest common substring of two strings.
  • RTalignmentTree: Return a tree plot with a possible alignment tree.
  • YAMLClass-class: Query a YAML object for a certain parameter.
  • addGGtitle: Add title and subtitle to a ggplot
  • LCSn: Find longest common substring from 'n' strings.
  • MQDataReader$getInvalidLines: Detect broken lines (e.g. due to Excel import+export)
  • MQDataReader$writeMappingFile: Writes a mapping table of full Raw file names to shortened names.
  • RSD: Relative standard deviation (RSD)
  • byXflex: Same as byX, but with more flexible group size, to avoid that the last group has only a few entries (<50% of desired size).
  • computeMatchRTFractions: Combine several data structs into a final picture for segmentation incurred by 'Match-between-runs'.
  • MQDataReader$readMappingFile: Reads a mapping table of full Raw file names to shortened names.
  • MQDataReader$substitute: Replaces values in the mq.data member with (binary) values.
  • alignmentCheck: Verify an alignment by checking the retention time differences of identical peptides across Raw files
  • brewer.pal.Safe: Return color brew palettes, but fail hard if number of requested colors is larger than the palette is holding.
  • byX: Calls FUN on a subset of data in blocks of size 'subset_size' of unique indices.
  • getFragmentErrors: Extract fragment mass deviation errors from a data.frame from msms.txt
  • MQDataReader$getShortNames: Shorten a set of Raw file names and return a data frame with the mappings.
  • MQDataReader$new: Constructor for class 'MQDataReader'.
  • getAbundanceClass: Assign a relative abundance class to a set of (log10) abundance values
  • getECDF: Estimate the empirical density and return it
  • grepv: Grep with values returned instead of indices.
  • idTransferCheck: Check how close transferred IDs after alignment are to their genuine IDs within one Raw file.
  • longestCommonSuffix: Like longestCommonPrefix(), but on the suffix.
  • mosaicize: Prepare a Mosaic plot of two columns in long format.
  • getMQPARValue: Retrieve a parameter value from a mqpar.xml file
  • ggText: Plot a text as graphic using ggplot2.
  • %+%: A string concatenation function, more readable than 'paste()'.
  • pasten: paste with newline as separator
  • getPCA: Create a principal component analysis (PCA) plot for the first two dimensions.
  • getPeptideCounts: Extract the number of peptides observed per Raw file from an evidence table.
  • getReportFilenames: Assembles a list of output file names, which will be created during reporting.
  • ggAxisLabels: Function to thin out the number of labels shown on an axis in ggplot
  • appendEnv: Add the value of a variable to an environment (fast append)
  • del0: Replace 0 with NA in a vector
  • delLCP: Removes the longest common prefix (LCP) from a vector of strings.
  • getMaxima: Find the local maxima in a vector of numbers.
  • plot_IDsOverRT: Plot IDs over time for each Raw file.
  • plot_IonInjectionTimeOverRT: Plot line graph of TopN over Retention time.
  • plot_MBRAlign: Plot MaxQuant Match-between-runs alignment performance.
  • MQDataReader$plotNameMapping: Plots the current mapping of Raw file names to their shortened version.
  • MQDataReader$readMQ: Wrapper to read a MQ txt file (e.g. proteinGroups.txt).
  • assignBlocks: Assign set numbers to a vector of values.
  • boxplotCompare: Boxplots - one for each condition (=column) in a data frame.
  • pastet: paste with tab as separator
  • plotTable: Plot a table with row names and title
  • plotTableRaw: Colored table plot.
  • plot_ContUserScore: Plot Andromeda score distribution of contaminant peptide vs. matrix peptides.
  • plot_ContsPG: Plot contaminants from proteinGroups.txt
  • delLCS: Removes the longest common suffix (LCS) from a vector of strings.
  • findAlignReference: Return list of raw file names which were reported by MaxQuant as reference point for alignment.
  • fixCalibration: Detect (and fix) MaxQuant mass recalibration columns, since they sometimes report wrong values.
  • flattenList: Flatten lists of lists with irregular depths to just a list of items, i.e. a list of the leaves (if you consider the input as a tree).
  • inMatchWindow: For grouped peaks: separate them into in-width vs. out-width class.
  • lcpCount: Count the number of chars of the longest common prefix
  • plot_ContEVD: Plot contaminants from evidence.txt, broken down into top5-proteins.
  • plot_ContUser: Plot user-defined contaminants from evidence.txt
  • plot_CalibratedMSErr: Plot bargraph of uncalibrated mass errors for each Raw file.
  • plot_Charge: Plot MaxQuant Match-between-runs id transfer performance.
  • plot_RTPeakWidth: Plot RT peak width over time
  • plot_RatiosPG: Plot ratios of labeled data (e.g. SILAC) from proteinGroups.txt
  • plot_MBRgain: Plot MaxQuant Match-between-runs id transfer performance.
  • plot_MS2Decal: Plot bargraph of oversampled 3D-peaks.
  • qualCentered: Quality metric for 'centeredness' of a distribution around zero.
  • qualCenteredRef: Quality metric for 'centeredness' of a distribution around zero with a user-supplied range threshold.
  • qualLinThresh: Quality metric with linear response to input, reaching the maximum score at the given threshold.
  • qualMedianDist: Quality metric which measures the absolute distance from median.
  • plot_MBRIDtransfer: Plot MaxQuant Match-between-runs id transfer performance.
  • plot_TopNoverRT: Plot line graph of TopN over Retention time.
  • plot_UncalibratedMSErr: A boxplot of uncalibrated mass errors for each Raw file.
  • qualUniform: Compute deviation from uniform distribution
  • pointsPutX: Distribute a set of points with fixed y-values on a stretch of the x-axis.
  • print.PTXQC_table: helper S3 class, enabling print(some-plot_Table-object)
  • qualGaussDev: Compute probability of Gaussian (mu=m, sd=s) at a position 0, with reference to the max obtainable probability of that Gaussian at its center.
  • qualHighest: Score an empirical density distribution of values, where the best possible distribution is right-skewed.
  • shortenStrings: Shorten a string to a maximum length and indicate shortening by appending '..'
  • read.MQ: Convenience wrapper for MQDataReader when only a single MQ file should be read and file mapping need not be stored.
  • supCount: Compute shortest prefix length which makes all strings in a vector uniquely identifiable.
  • theme_blank: A blank theme (similar to the deprecated theme_blank())
  • simplifyNames: Removes common substrings (infixes) in a set of strings.
  • thinOut: Thin out a data.frame by removing rows with similar numerical values in a certain column.
  • thinOutBatch: Apply 'thinOut' on all subsets of a data.frame, split by a batch column
  • getMetaData: Extract meta information (orderNr, metric name, category) from a list of Qc metric objects
  • lcsCount: Count the number of chars of the longest common suffix
  • longestCommonPrefix: Get the longest common prefix from a set of strings.
  • plot_ScanIDRate: Plot line graph of TopN over Retention time.
  • plot_TopN: Plot line graph of TopN over Retention time.
  • qcMetric_MSMSScans_TopNoverRT-class: Metric for msmsscans.txt, showing TopN over RT.
  • qualBestKS: From a list of vectors, compute all vs. all Kolmogorov-Smirnov distance statistics (D)
  • scale_x_discrete_reverse: Reverse the order of items on the x-axis (for discrete scales)
  • scale_y_discrete_reverse: Reverse the order of items on the y-axis (for discrete scales)
  • ScoreInAlignWindow: Compute the fraction of features per Raw file which have an acceptable RT difference after alignment
  • correctSetSize: Re-estimate a new set size to split a number of items into equally sized sets.
  • createReport: Create a quality control report (in PDF format).
  • getProteinCounts: Extract the number of protein groups observed per Raw file from an evidence table.
  • getQCHeatMap: Generate a Heatmap from a list of QC measurements.
  • peakSegmentation: Determine fraction of evidence which causes segmentation, i.e. sibling peaks at different RTs confirmed either by genuine or transferred MS/MS.
  • peakWidthOverTime: Discretize RT peak widths by averaging values per time bin.
  • wait_for_writable: Check whether a file is writable and block an interactive session while waiting for user input.
  • plot_CountData: Plot Protein groups per Raw file
  • plot_IDRate: Plot percent of identified MS/MS for each Raw file.
  • plot_MS2Oversampling: Plot bargraph of oversampled 3D-peaks.
  • plot_MissedCleavages: Plot bargraph of missed cleavages.
  • printWithFooter: Augment a ggplot with footer text
  • qcMetric-class: Class which can compute plots (usually for a single metric).
  • renameFile: Given a vector of (short/long) filenames, translate to the (long/short) version
  • repEach: Repeat each element x_i in X, n_i times.
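
Several of the helpers listed above deal with shortening Raw file names by stripping shared prefixes or suffixes. To give an idea of what removing a longest common prefix means, here is a small base-R sketch. It is an illustration only and not the PTXQC implementation; the function names and the example file names are made up.

```r
# Illustrative base-R sketch; NOT the PTXQC implementation.

# Return the longest prefix shared by all strings in the vector.
longest_common_prefix_sketch <- function(strings) {
  if (length(strings) == 0) return("")
  prefix_len <- 0
  # Compare the strings character by character, up to the shortest one.
  for (i in seq_len(min(nchar(strings)))) {
    if (length(unique(substr(strings, i, i))) > 1) break
    prefix_len <- i
  }
  substr(strings[1], 1, prefix_len)
}

# Strip the shared prefix from every string.
remove_common_prefix_sketch <- function(strings) {
  prefix <- longest_common_prefix_sketch(strings)
  substring(strings, nchar(prefix) + 1)
}

# Hypothetical Raw file names.
raw_files <- c("20170601_QC_HeLa_01", "20170601_QC_HeLa_02", "20170601_QC_Yeast_01")
short_names <- remove_common_prefix_sketch(raw_files)
print(short_names)
```

Here the shared prefix is "20170601_QC_", so only the distinguishing part of each name remains. Note that a single input string would be reduced to an empty string by this simple sketch.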