Prepares a long-format input table for quantification, including removal of low-intensity ions and median normalization.
preprocess(quant_table,
primary_id = "PG.ProteinGroups",
secondary_id = c("EG.ModifiedSequence", "FG.Charge", "F.FrgIon", "F.Charge"),
sample_id = "R.Condition",
intensity_col = "F.PeakArea",
median_normalization = TRUE,
log2_intensity_cutoff = 0,
pdf_out = "qc-plots.pdf",
pdf_width = 12,
pdf_height = 8,
intensity_col_sep = NULL,
intensity_col_id = NULL,
na_string = "0",
show_boxplot = TRUE)
A data frame is returned with the following components:
A vector of proteins.
A vector of samples.
A vector of fragment ions to be used for quantification.
A vector of log2 intensities.
quant_table: A long-format table with a primary column of protein identifications, secondary columns of fragment ions, a column of sample names, and a column of quantitative intensities.
primary_id: Unique values in this column form the list of proteins to be quantified.
secondary_id: A concatenation of these columns determines the fragment ions used for quantification.
sample_id: Unique values in this column form the list of samples.
intensity_col: The column for intensities.
median_normalization: A logical value. If TRUE (the default), median normalization is performed.
log2_intensity_cutoff: Entries lower than this value in log2 space are ignored. Plot a histogram of all intensities to set this parameter.
pdf_out: A character string specifying the name of the PDF output. A NULL value will suppress the PDF output.
pdf_width: Width of the PDF output in inches.
pdf_height: Height of the PDF output in inches.
intensity_col_sep: A separator character used when entries in the intensity column contain multiple values.
intensity_col_id: The column for identities of multiple quantitative values.
na_string: The value considered as NA.
show_boxplot: A logical value. If TRUE (the default), boxplots of fragment intensities are created for each sample.
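To make the effect of the log2 cutoff and of median normalization concrete, here is a minimal base R sketch on a toy table. It does not call the package. The column names, the values, and the particular normalization used (shifting each sample so that all medians equal the mean of the per-sample medians) are assumptions for illustration and may differ from what preprocess does internally.

```r
# Toy long-format table; column names and values are made up for illustration.
quant <- data.frame(
  protein   = c("P1", "P1", "P2", "P2", "P1", "P1", "P2", "P2"),
  fragment  = c("a", "b", "c", "d", "a", "b", "c", "d"),
  sample    = rep(c("S1", "S2"), each = 4),
  intensity = c(1024, 2048, 0.5, 4096, 2048, 4096, 8192, 16384),
  stringsAsFactors = FALSE
)

# The cutoff is defined in log2 space, so transform first.
quant$log2_int <- log2(quant$intensity)

# Interactively, a histogram helps to choose the cutoff:
# hist(quant$log2_int, breaks = 50)

# Ignore entries lower than the cutoff (0 in log2 space is a raw intensity of 1).
log2_cutoff <- 0
kept <- quant[quant$log2_int >= log2_cutoff, ]

# Assumed form of median normalization: add a per-sample offset so that
# every sample ends up with the same median log2 intensity.
sample_medians <- tapply(kept$log2_int, kept$sample, median)
shift <- mean(sample_medians) - sample_medians
kept$log2_norm <- kept$log2_int + as.numeric(shift[as.character(kept$sample)])

# Per-sample medians after normalization.
normalized_medians <- tapply(kept$log2_norm, kept$sample, median)
```

In this toy table the entry with raw intensity 0.5 falls below the cutoff and is dropped, and the two samples share one median after the shift.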
Thang V. Pham
When entries in the intensity column contain multiple values, this function replicates the entries in the other columns and appends to secondary_id the corresponding entries in intensity_col_id, when that column is provided. Otherwise, integer values 1, 2, 3, etc. are used.
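The expansion of multi-valued intensity entries can be sketched in base R as follows. This is an illustration of the idea only, not the package's code: the toy table, its column names, the ";" separator, and the use of "_" to join the identifiers are all assumptions.

```r
# Toy table in which each intensity entry holds several ';'-separated values,
# with a parallel column naming each value. All names here are hypothetical.
tab <- data.frame(
  protein   = c("P1", "P2"),
  ion       = c("pepA_2", "pepB_3"),
  sample    = c("S1", "S1"),
  intensity = c("100;200;300", "50;60"),
  frag_id   = c("y4;y5;y6", "b3;y7"),
  stringsAsFactors = FALSE
)

# Split the multi-valued columns on the separator.
values <- strsplit(tab$intensity, ";", fixed = TRUE)
ids    <- strsplit(tab$frag_id, ";", fixed = TRUE)
n      <- lengths(values)

# Replicate the other columns once per value.
long <- tab[rep(seq_len(nrow(tab)), times = n), c("protein", "ion", "sample")]
rownames(long) <- NULL

# Append the per-value identifier to the ion identifier, and store the
# individual intensities as numbers.
long$ion       <- paste(long$ion, unlist(ids), sep = "_")
long$intensity <- as.numeric(unlist(values))

# Without an identifier column, running integers 1, 2, 3, ... within each
# original row serve as the appended identifier instead.
fallback_ids <- unlist(lapply(n, seq_len))
```

Here the two input rows expand to five rows, one per individual intensity value.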
Pham TV, Henneman AA, Jimenez CR. iq: an R package to estimate relative protein abundances from ion quantification in DIA-MS-based proteomics. Bioinformatics 2020 Apr 15;36(8):2611-2613.
data("spikeins", package = "iq")
head(spikeins)
# This example set of spike-in proteins has already been median-normalized,
# so normalization is switched off; pdf_out = NULL suppresses the PDF output.
norm_data <- iq::preprocess(spikeins, median_normalization = FALSE, pdf_out = NULL)