PTXQC

This package allows users of MaxQuant to generate quality control reports in HTML/PDF format.

Latest changes / Change log

  • v0.92.06 - Apr 2019: Bug Fixes
  • v0.92.05 - Mar 2019: Raw name simplification fix
  • v0.92.04 - Feb 2019: More robust package vignette builds
  • v0.92.03 - Feb 2018: Full List of Metrics added as vignette
  • v0.92.02 - Jan 2018: plots and metrics of reporter intensity (iTRAQ, TMT, ...) for labeled MSn experiments
  • v0.92.01 - Oct 2017: fix issue #41 (partial data error)
  • v0.92.00 - Oct 2017: cleaner R interface; log file for drag'n'drop; fix boxPlots issue (usually for large experiments only)
  • v0.90.00 - Aug 2017: Tables are shown in Html format

See NEWS file for a version history.

Platform support

  • Windows (recommended for convenience to make use of the drag'n'drop batch file provided)
  • Linux
  • MacOSX

Features

  • plethora of quality metrics
    • intensity distributions
    • digestion efficiency
    • contaminant visualizations
    • identification performance
    • Match-between-runs performance
  • easy usage ([Windows OS only] drag'n'drop your txt output folder onto a batch file)
  • Html/PDF report will be generated within your MaxQuant-txt folder
  • optional configuration file in YAML format for generation of shorter/customized reports

Target audience

  • MaxQuant users (no knowledge of R required)
  • bioinformaticians (who want to contribute or customize)

Documentation

Besides this documentation on GitHub, the package vignettes of PTXQC will give you valuable information. After the package is installed (see below), you can browse the vignettes using either of these commands within R:

help(package="PTXQC")
browseVignettes(package = 'PTXQC')

If you do not want to wait that long, you can look at the latest online vignette at CRAN.

You will find documentation on

  • Full List of Quality Metrics with help text
  • Input and Output
  • Report customization
  • (for MaxQuant users) Usage of Drag'n'drop
  • (for R users) Code examples in R

The 'List of Metrics' vignette contains a full description for each metric (as seen in the Help section of a Html report).

Installation

If you want to generate QC reports without actually getting involved in R:

We offer a batch-file-based drag'n'drop mechanism to trigger PTXQC on any MaxQuant output folder. This currently works on Windows only (not Linux or MacOS) -- but you have a Windows machine anyway to run MaxQuant, right?! See drag'n'drop for details. It takes 10 minutes and you are done!

If you just want the package to use (and maybe even modify) it:

First, install pandoc (see bottom of linked page). Pandoc is required in order to locally build the package vignettes (documentation), but you can also read the vignettes online from the PTXQC GitHub page. More importantly, Pandoc enables PTXQC to write QC reports in HTML format (which come with a help text for each plot and are interactive). PDF reports only contain plots! The reports are printed as PDF by default and additionally as HTML if Pandoc is found. If you install Pandoc later while your R session is already open, you need to close and re-open R in order to make R aware of Pandoc!
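Before generating reports, it can help to check whether your current R session can see Pandoc at all. The following is a minimal sketch in base R. It assumes that a `pandoc` executable on the PATH (or the copy that `rmarkdown` detects) is what matters; PTXQC itself may detect Pandoc differently.

```r
# Look for a pandoc executable on the PATH.
# Sys.which() returns "" when nothing is found.
pandoc_path <- unname(Sys.which("pandoc"))
has_pandoc <- nzchar(pandoc_path)

# Second opinion (only if rmarkdown is installed): rmarkdown also finds
# the Pandoc copy that ships with RStudio, which may not be on the PATH.
if (!has_pandoc && requireNamespace("rmarkdown", quietly = TRUE)) {
  has_pandoc <- rmarkdown::pandoc_available()
}

if (has_pandoc) {
  message("Pandoc is visible to this R session.")
} else {
  message("Pandoc not found. Install it, then close and re-open R.")
}
```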

You can grab PTXQC from either CRAN or GitHub. GitHub installation will give you the latest package; the CRAN version might be a little older, but is faster to install. Check the NEWS file for CRAN submissions and version.

For the code blocks below: run each line separately in your R console, i.e. do not copy and paste the whole block. If an error occurs, this makes it easier to track down. See FAQ - Installation for how to resolve errors.

## CRAN
install.packages("PTXQC")

or

## GitHub
if (!require(devtools, quietly = TRUE)) install.packages("devtools")
library("devtools")             ## this might give a warning like 'WARNING: Rtools is required ...'. Ignore it.

## use build_vignettes = FALSE if you did not install pandoc or if you encounter errors when building vignettes (e.g. PRIDE ftp unavailable)!
install_github("cbielow/PTXQC", build_vignettes = TRUE, dependencies = TRUE)

To get started, see the help and/or vignettes:

help(package="PTXQC")
browseVignettes(package = 'PTXQC')
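For R users, creating a report from within R can be sketched as follows. This assumes that `createReport()` takes the path to a MaxQuant txt folder as its first argument (check the vignettes for the exact interface of your installed version). The wrapper name `make_qc_report` and the folder path are hypothetical.

```r
# Hypothetical wrapper: returns NULL (with a message) instead of failing
# when PTXQC is not installed or the folder does not exist.
make_qc_report <- function(txt_folder) {
  if (!requireNamespace("PTXQC", quietly = TRUE)) {
    message("PTXQC is not installed; see the Installation section.")
    return(invisible(NULL))
  }
  if (!dir.exists(txt_folder)) {
    message("Folder not found: ", txt_folder)
    return(invisible(NULL))
  }
  # Assumption: the first argument of createReport() is the txt folder.
  PTXQC::createReport(txt_folder)
}

# Placeholder path; replace it with your own MaxQuant txt folder.
result <- make_qc_report("path/to/combined/txt")
```

Checking for the folder up front keeps a mistyped path from surfacing as a less obvious error further down.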

Please feel free to report bugs (see below), or issue pull requests!

Report Examples

An overview chart at the beginning of the report will give you a first impression. Detailed plots can be found in the remainder of each report.

For example input data and full reports, see the 'inst/examples' subfolder.

Bug reporting / Feature requests

If you encounter a bug, e.g. error message, wrong figures, missing axis annotation or anything which looks suspicious, please use the GitHub issue tracker and file a report.

You should include

  • the stage at which you encounter the bug, e.g. during installation, report creation, or after report creation (i.e. a bug in the report itself).
  • the PDF/Html report itself (if one was generated).
  • the version of PTXQC, e.g. see the report_XXX.pdf/html (where XXX is the version), see the DESCRIPTION file of the PTXQC package, or call help(package="PTXQC") within R.
  • the error message (very important!). Either copy it or provide a screenshot.
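The version details for a bug report can also be collected directly in R. This is a small sketch using only base R and `utils::packageVersion()`; the helper name `bug_report_info` is hypothetical and not part of PTXQC.

```r
# Hypothetical helper: gathers version details worth pasting into a bug report.
bug_report_info <- function(pkg = "PTXQC") {
  installed <- requireNamespace(pkg, quietly = TRUE)
  pkg_version <- if (installed) {
    as.character(utils::packageVersion(pkg))
  } else {
    "not installed"
  }
  c(package  = pkg,
    version  = pkg_version,
    R        = R.version.string,
    platform = R.version$platform)
}

print(bug_report_info())
```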

Please be as precise as possible when providing the bug report - just imagine what kind of information you would like to have in order to track down the issue. In certain situations, the whole txt-folder or a single MaxQuant file might be helpful to solve the problem.

If you want to see a new metric, or have ideas on how to improve the existing ones, just open an issue ticket and leave a description.

Citation

PTXQC is published at JPR:

Chris Bielow, Guido Mastrobuoni, and Stefan Kempa. Proteomics Quality Control: Quality Control Software for MaxQuant Results. J. Proteome Res., 2016, 15 (3), pp 777-787. DOI: 10.1021/acs.jproteome.5b00780

Use PTXQC v0.69.3 if you want the version which was used in the paper, i.e. use install_github(..., ref="v0.69.3") when following the Installation procedure.

The input data is available in the 'inst/examples' subfolder.

We recommend using the most recent version of PTXQC for the best user experience.

Version

0.92.6

License

GPL-2

Maintainer

Chris Bielow

Last Published

April 16th, 2019

Functions in PTXQC (0.92.6)

MQDataReader$readMappingFile

Reads a mapping table of full Raw file names to shortened names.
MQDataReader$substitute

Replaces values in the mq.data member with (binary) values.
delLCS

Removes the longest common suffix (LCS) from a vector of strings.
findAlignReference

Return list of raw file names which were reported by MaxQuant as reference point for alignment.
fixCalibration

Detect (and fix) MaxQuant mass recalibration columns, since they sometimes report wrong values.
flattenList

Flatten lists of lists with irregular depths to just a list of items, i.e. a list of the leaves (if you consider the input as a tree).
LCSn

Find longest common substring from 'n' strings.
ggText

Plot a text as graphic using ggplot2.
MQDataReader$getInvalidLines

Detect broken lines (e.g. due to Excel import+export)
%+%

A string concatenation function, more readable than 'paste()'.
lcsCount

Count the number of chars of the longest common suffix
longestCommonPrefix

Get the longest common prefix from a set of strings.
MQDataReader$plotNameMapping

Plots the current mapping of Raw file names to their shortened version.
MQDataReader$readMQ

Wrapper to read a MQ txt file (e.g. proteinGroups.txt).
plot_CountData

Plot Protein groups per Raw file
RTalignmentTree

Return a tree plot with a possible alignment tree.
alignmentCheck

Verify an alignment by checking the retention time differences of identical peptides across Raw files
MQDataReader$getShortNames

Shorten a set of Raw file names and return a data frame with the mappings.
plot_IDRate

Plot percent of identified MS/MS for each Raw file.
MQDataReader$new

Constructor for class 'MQDataReader'.
ScoreInAlignWindow

Compute the fraction of features per Raw file which have an acceptable RT difference after alignment
byXflex

Same as byX, but with more flexible group size, to avoid that the last group has only a few entries (<50% of desired size).
appendEnv

Add the value of a variable to an environment (fast append)
brewer.pal.Safe

Return color brew palettes, but fail hard if number of requested colors is larger than the palette is holding.
plot_MBRgain

Plot MaxQuant Match-between-runs id transfer performance.
plot_MS2Decal

Plot bargraph of oversampled 3D-peaks.
computeMatchRTFractions

Combine several data structs into a final picture for segmentation incurred by 'Match-between-runs'.
byX

Calls FUN on a subset of data in blocks of size 'subset_size' of unique indices.
getAbundanceClass

Assign a relative abundance class to a set of (log10) abundance values
getMaxima

Find the local maxima in a vector of numbers.
getMQPARValue

Retrieve a parameter value from a mqpar.xml file
qcMetric_MSMSScans_TopNoverRT-class

Metric for msmsscans.txt, showing TopN over RT.
MQDataReader$writeMappingFile

Writes a mapping table of full Raw file names to shortened names.
qualBestKS

From a list of vectors, compute all vs. all Kolmogorov-Smirnov distance statistics (D)
getECDF

Estimate the empirical density and return it
getProteinCounts

Extract the number of protein groups observed per Raw file from an evidence table.
assignBlocks

Assign set numbers to a vector of values.
getQCHeatMap

Generate a Heatmap from a list of QC measurements.
RSD

Relative standard deviation (RSD)
boxplotCompare

Boxplots - one for each condition (=column) in a data frame.
peakSegmentation

Determine fraction of evidence which causes segmentation, i.e. sibling peaks at different RTs confirmed either by genuine or transferred MS/MS.
getReportFilenames

Assembles a list of output file names, which will be created during reporting.
scale01linear

Scales a vector of values linearly to [0, 1]. If all input values are equal, the returned values are all 0.
ggAxisLabels

Function to thin out the number of labels shown on an axis in ggplot2
scale_x_discrete_reverse

Reverse the order of items on the x-axis (for discrete scales)
peakWidthOverTime

Discretize RT peak widths by averaging values per time bin.
longestCommonSuffix

Like longestCommonPrefix(), but on the suffix.
mosaicize

Prepare a Mosaic plot of two columns in long format.
plotTable

Plot a table with row names and title
getMetaData

Extract meta information (orderNr, metric name, category) from a list of Qc metric objects
getMetricsObjects

Get all currently available metrics
plot_MBRAlign

Plot MaxQuant Match-between-runs alignment performance.
plotTableRaw

Colored table plot.
plot_MBRIDtransfer

Plot MaxQuant Match-between-runs id transfer performance.
plot_RTPeakWidth

Plot RT peak width over time
del0

Replace 0 with NA in a vector
printWithFooter

Augment a ggplot with footer text
inMatchWindow

For grouped peaks: separate them into in-width vs. out-width class.
plot_RatiosPG

Plot ratios of labeled data (e.g. SILAC) from proteinGroups.txt
qcMetric-class

Class which can compute plots (usually for a single metric).
lcpCount

Count the number of chars of the longest common prefix
plot_ContUserScore

Plot Andromeda score distribution of contaminant peptide vs. matrix peptides.
qualLinThresh

Quality metric with linear response to input, reaching the maximum score at the given threshold.
delLCP

Removes the longest common prefix (LCP) from a vector of strings.
qualMedianDist

Quality metric which measures the absolute distance from median.
qualUniform

Compute deviation from uniform distribution
read.MQ

Convenience wrapper for MQDataReader when only a single MQ file should be read and file mapping need not be stored.
getFragmentErrors

Extract fragment mass deviation errors from a data.frame from msms.txt
plot_ContsPG

Plot contaminants from proteinGroups.txt
getHTMLTable

Create an HTML table with an extra header row
thinOutBatch

Apply 'thinOut' on all subsets of a data.frame, split by a batch column
wait_for_writable

Checks if a file is writable and, if it is not, blocks an interactive session while waiting for user input.
getPCA

Create a principal component analysis (PCA) plot for the first two dimensions.
plot_ScanIDRate

Plot line graph of TopN over Retention time.
getPeptideCounts

Extract the number of peptides observed per Raw file from an evidence table.
plot_TopN

Plot line graph of TopN over Retention time.
qualGaussDev

Compute probability of Gaussian (mu=m, sd=s) at a position 0, with reference to the max obtainable probability of that Gaussian at its center.
qualHighest

Score an empirical density distribution of values, where the best possible distribution is right-skewed.
pasten

paste with newline as separator
theme_blank

A blank theme (similar to the deprecated theme_blank())
pastet

paste with tab as separator
plot_CalibratedMSErr

Plot bargraph of uncalibrated mass errors for each Raw file.
plot_Charge

The plot shows the charge distribution per Raw file. The output of 'mosaicize()' can be used directly.
plot_MS2Oversampling

Plot bargraph of oversampled 3D-peaks.
plot_MissedCleavages

Plot bargraph of missed cleavages.
pointsPutX

Distribute a set of points with fixed y-values on a stretch of the x-axis.
print.PTXQC_table

helper S3 class, enabling print(some-plot_Table-object)
thinOut

Thin out a data.frame by removing rows with similar numerical values in a certain column.
qualCentered

Quality metric for 'centeredness' of a distribution around zero.
qualCenteredRef

Quality metric for 'centeredness' of a distribution around zero with a user-supplied range threshold.
scale_y_discrete_reverse

Reverse the order of items on the y-axis (for discrete scales)
shortenStrings

Shorten a string to a maximum length and indicate shortening by appending '..'
CV

Coefficient of variation (CV)
LCS

Compute longest common substring of two strings.
YAMLClass-class

Query a YAML object for a certain parameter.
addGGtitle

Add title and subtitle to a ggplot
correctSetSize

Re-estimate a new set size to split a number of items into equally sized sets.
createReport

Create a quality control report (in PDF format).
grepv

Grep with values returned instead of indices.
idTransferCheck

Check how close transferred IDs after alignment are to their genuine IDs within one Raw file.
plot_ContEVD

Plot contaminants from evidence.txt, broken down into top5-proteins.
plot_IDsOverRT

Plot IDs over time for each Raw file.
plot_ContUser

Plot user-defined contaminants from evidence.txt
plot_IonInjectionTimeOverRT

Plot line graph of TopN over Retention time.
plot_TopNoverRT

Plot line graph of TopN over Retention time.
plot_UncalibratedMSErr

A boxplot of uncalibrated mass errors for each Raw file.
renameFile

Given a vector of (short/long) filenames, translate to the (long/short) version
repEach

Repeat each element x_i in X, n_i times.
simplifyNames

Removes common substrings (infixes) in a set of strings.
supCount

Compute shortest prefix length which makes all strings in a vector uniquely identifiable.
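
To make a few of the helper descriptions above more concrete, here is an independent base-R sketch of three of the ideas: finding a longest common prefix, stripping it from a set of Raw file names, and scaling values linearly to [0, 1]. These are illustrations written from the one-line descriptions only, not the PTXQC implementations; the `sketch_` names and the example file names are hypothetical, and the real functions may take additional arguments or handle edge cases differently.

```r
# Longest common prefix of a character vector (idea behind 'longestCommonPrefix').
sketch_lcp <- function(x) {
  if (length(x) == 0) return("")
  n <- min(nchar(x))
  if (n == 0) return("")
  # Matrix with one row per character position and one column per string.
  chars <- do.call(cbind, strsplit(substr(x, 1, n), ""))
  # TRUE for every position where all strings share the same character.
  same <- apply(chars, 1, function(row) all(row == row[1]))
  k <- if (all(same)) n else which(!same)[1] - 1
  substr(x[1], 1, k)
}

# Remove the longest common prefix from every string (idea behind 'delLCP').
sketch_del_lcp <- function(x) {
  substring(x, nchar(sketch_lcp(x)) + 1)
}

# Scale linearly to [0, 1]; all-equal input yields all zeros
# (behaviour as stated in the 'scale01linear' description).
sketch_scale01 <- function(v) {
  r <- range(v, na.rm = TRUE)
  if (r[1] == r[2]) return(rep(0, length(v)))
  (v - r[1]) / (r[2] - r[1])
}

raw_names <- c("QC_run_A.raw", "QC_run_B.raw")
sketch_lcp(raw_names)       # "QC_run_"
sketch_del_lcp(raw_names)   # "A.raw" "B.raw"
sketch_scale01(c(2, 4, 6))  # 0.0 0.5 1.0
sketch_scale01(c(3, 3))     # 0 0
```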