Quantification results from MaxQuant can be read using this function and the relevant information extracted. Input files compressed as .gz can be read as well. Besides protein abundance values (XIC), peptide counting information like the number of unique razor-peptides or PSM values can be extracted, too. The protein abundance values may be normalized using multiple methods (median normalization is the default), and the determination of normalization values can be restricted to specific proteins (normalization to bait protein(s), or to the matrix in UPS1 spike-in experiments). In addition, a graphical display of the distribution of protein abundance values may be generated.
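The idea of median normalization restricted to a subset of reference proteins can be sketched in base R. This is only an illustration and not the code used by the package (which relies on normalizeThis): the helper name medianNormSketch and its argument refLines are hypothetical, and the toy values are assumed to be on a log scale so that an additive shift per sample is appropriate.

```r
# Toy abundance matrix: 4 proteins (rows) x 3 samples (columns), log scale assumed
abund <- matrix(c(10, 12, 14, 16,
                  11, 13, 15, 17,
                   9, 11, 13, 15),
                nrow = 4,
                dimnames = list(paste0("prot", 1:4), paste0("sample", 1:3)))

# Hypothetical helper (not part of wrProteo): shift each column so that the
# median of the reference rows becomes identical in all samples
medianNormSketch <- function(x, refLines = seq_len(nrow(x))) {
  colMed <- apply(x[refLines, , drop = FALSE], 2, median, na.rm = TRUE)
  target <- mean(colMed)                    # common level all samples are aligned to
  sweep(x, 2, colMed - target, FUN = "-")   # subtract the per-sample shift
}

normAll <- medianNormSketch(abund)                 # normalization values from all proteins
normRef <- medianNormSketch(abund, refLines = 1:2) # normalization values from rows 1-2 only
```

Restricting refLines mimics the case where only bait proteins or the constant matrix of a spike-in experiment should determine the normalization values, while the resulting shift is still applied to all proteins.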
readMaxQuantFile(
path,
fileName = "proteinGroups.txt",
normalizeMeth = "median",
quantCol = "LFQ.intensity",
contamCol = "Potential.contaminant",
pepCountCol = c("Razor + unique peptides", "Unique peptides", "MS.MS.count"),
uniqPepPat = NULL,
refLi = NULL,
extrColNames = c("Majority.protein.IDs", "Fasta.headers", "Number.of.proteins"),
specPref = c(conta = "conta|CON_|LYSC_CHICK", mainSpecies = "OS=Homo sapiens"),
remRev = TRUE,
separateAnnot = TRUE,
tit = NULL,
wex = 1.6,
plotGraph = TRUE,
silent = FALSE,
callFrom = NULL
)
path: (character) path of the file to be read
fileName: (character) name of the file to be read (default 'proteinGroups.txt', as typically generated by MaxQuant in the txt folder). Gz-compressed files can be read, too.
normalizeMeth: (character) normalization method (for details see normalizeThis)
quantCol: (character or integer) exact column names; if of length 1, the content of quantCol will be used as a pattern to search among the column names for $quant using grep
contamCol: (character or integer, length=1) which column should be used for contaminants as marked by MaxQuant
pepCountCol: (character) patterns to search among the column names for count data (1st entry for 'Razor + unique peptides', 2nd for 'Unique peptides', 3rd for 'MS.MS.count' (PSM))
uniqPepPat: (character, length=1) deprecated, please use pepCountCol instead
refLi: (character or integer) custom specification of which lines of data are the main species; if character (eg 'mainSpe'), the column 'SpecType' in $annot will be searched for an exact match of the (single) term given
extrColNames: (character) column names to be read (1: prefix for LFQ quantitation, default 'LFQ.intensity'; 2: column name for protein-IDs, default 'Majority.protein.IDs'; 3: column name of fasta-headers, default 'Fasta.headers'; 4: column name for the number of protein IDs matching, default 'Number.of.proteins')
specPref: (character) prefix to identifiers allowing to recognize i) the contamination database, ii) the species of main identifications and iii) the spike-in species
remRev: (logical) option to remove all protein-identifications based on reverse-peptides
separateAnnot: (logical) if TRUE, output will be organized as a list with $annot, $abund for initial/raw abundance values and $quant with final normalized quantitations
tit: (character) custom title for the plot
wex: (numeric) relative expansion factor of the violin in the plot
plotGraph: (logical) optionally plot a vioplot of initial and normalized data (using normalizeMeth); alternatively the argument may contain numeric details that will be passed to layout when plotting
silent: (logical) suppress messages
callFrom: (character) allows easier tracking of messages produced
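How a single quantCol entry acts as a search pattern, and how the patterns in specPref can separate contaminants from the main species, may be illustrated with base R. This is a minimal sketch on made-up column names and fasta headers; which field the function actually searches with specPref, and in which order matches are resolved, are assumptions here.

```r
# Made-up column names resembling a 'proteinGroups.txt' after import
colNa <- c("Majority.protein.IDs", "Fasta.headers", "Intensity.A",
           "LFQ.intensity.A", "LFQ.intensity.B", "Potential.contaminant")

# A quantCol of length 1 is used as a pattern for grep()
quantCol <- "LFQ.intensity"
quantInd <- grep(quantCol, colNa)
colNa[quantInd]

# Patterns as in the default of specPref
specPref <- c(conta = "conta|CON_|LYSC_CHICK", mainSpecies = "OS=Homo sapiens")

# Made-up fasta headers for three protein groups
fasta <- c("sp|P02768|ALBU_HUMAN Serum albumin OS=Homo sapiens",
           "CON__P00761 trypsin",
           "sp|P00925|ENO2_YEAST Enolase 2 OS=Saccharomyces cerevisiae")

# Assign a type per line; looping in reverse lets 'conta' overrule later entries
# (this priority is an assumption of the sketch)
specType <- rep(NA_character_, length(fasta))
for (i in rev(seq_along(specPref))) {
  specType[grepl(specPref[i], fasta)] <- names(specPref)[i]
}
specType
```

Lines matching none of the patterns remain NA in this sketch, which makes it easy to spot proteins from an unexpected species.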
list with $raw (initial/raw abundance values), $quant with final normalized quantitations, $annot (columns ), $counts an array with 'PSM' and 'NoOfRazorPeptides', $quantNotes and $notes; or a data.frame with quantitation and annotation if separateAnnot=FALSE
This function has been developed using MaxQuant versions 1.6.10.x to 1.6.17.x; the format of the resulting file 'proteinGroups.txt' is typically well conserved.
The final output is a list containing these elements: $raw, $quant, $annot, $counts, $quantNotes, $notes, or (if separateAnnot=FALSE) a data.frame with annotation- and main quantification-content.
# NOT RUN {
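library(wrProteo)
# Locate the example data shipped with the package
path1 <- system.file("extdata", package="wrProteo")
# Here we'll load a short/trimmed example file (thus not the MaxQuant default name)
fiNa <- "proteinGroupsMaxQuant1.txt.gz"
# Patterns to recognize contaminants, the main species and the spike-in species
specPr <- c(conta="conta|CON_|LYSC_CHICK", mainSpecies="YEAST", spike="HUMAN_UPS")
dataMQ <- readMaxQuantFile(path1, fileName=fiNa, specPref=specPr, tit="tiny MaxQuant")
summary(dataMQ$quant)
# Inspect the distribution of NA values, with 3 groups of 3 replicates
matrixNAinspect(dataMQ$quant, gr=gl(3,3))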
# }
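That gz-compressed, tab-separated files can be read directly may also be checked with base R alone. The following sketch writes a tiny made-up table (the column names and values are invented for illustration) to a temporary .gz file and reads it back with read.delim; it does not use readMaxQuantFile and only illustrates the import step.

```r
# Write a tiny, made-up tab-separated table to a gz-compressed temporary file
tmp <- tempfile(fileext = ".txt.gz")
con <- gzfile(tmp, "w")
writeLines(c("Majority.protein.IDs\tLFQ intensity A\tLFQ intensity B",
             "P02768\t1000\t2000",
             "P00761\t0\t500"), con)
close(con)

# read.delim() opens the compressed file transparently; with the default
# check.names=TRUE spaces in column names are converted to dots
tab <- read.delim(tmp, stringsAsFactors = FALSE)
colnames(tab)

unlink(tmp)   # remove the temporary file
```

The conversion of spaces to dots during import is the reason why column names such as 'LFQ.intensity' are written with dots when given as patterns.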