Highly efficient reading of a tab-separated text file for iq processing.
fast_read(filename,
sample_id = "R.Condition",
primary_id = "PG.ProteinGroups",
secondary_id = c("EG.ModifiedSequence", "FG.Charge", "F.FrgIon", "F.Charge"),
intensity_col = "F.PeakArea",
annotation_col = c("PG.Genes", "PG.ProteinNames"),
filter_string_equal = c("F.ExcludedFromQuantification" = "False"),
filter_string_not_equal = NULL,
filter_double_less = c("PG.Qvalue" = "0.01", "EG.Qvalue" = "0.01"),
filter_double_greater = NULL,
intensity_col_sep = NULL,
intensity_col_id = NULL,
na_string = "0")
A list is returned with the following components:
A table of proteins in the first column followed by annotation columns.
A vector of samples.
A vector of fragment ions to be used for quantification.
A list of four components: protein_list (index pointing to protein), sample_list (index pointing to sample), id (index pointing to ion), and quant (intensities).
filename: A long-format tab-separated text file with a primary column of protein identification, secondary columns of fragment ions, a column of sample names, a column of quantitative intensities, and extra columns for annotation.
primary_id: Unique values in this column form the list of proteins to be quantified.
secondary_id: A concatenation of these columns determines the fragment ions used for quantification.
sample_id: Unique values in this column form the list of samples.
intensity_col: The column for intensities.
annotation_col: Annotation columns.
filter_string_equal: A named vector of strings, with column names as names. Only rows whose value in each named column equals the given string are kept.
filter_string_not_equal: A named vector of strings, with column names as names. Only rows whose value in each named column differs from the given string are kept.
filter_double_less: A named vector of strings, with column names as names. Only rows whose numeric value in each named column is less than the given value are kept. Default: PG.Qvalue < 0.01 and EG.Qvalue < 0.01.
filter_double_greater: A named vector of strings, with column names as names. Only rows whose numeric value in each named column is greater than the given value are kept.
intensity_col_sep: A separator character when entries in the intensity column contain multiple values.
intensity_col_id: The column for identities of multiple quantitative values.
na_string: The value considered as NA.
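The row-keeping rule behind the four filter arguments can be illustrated in base R. The helper apply_filters below is hypothetical and is not part of iq; it is only a sketch of the semantics described above, assuming the names of each vector are column names and that the comparisons are strict, as in the stated default PG.Qvalue < 0.01.

```r
# Hypothetical helper (not part of iq): approximates the four filter
# arguments on an in-memory data frame. Names of each vector are column
# names; values are the strings or thresholds to compare against.
apply_filters <- function(df,
                          filter_string_equal = NULL,
                          filter_string_not_equal = NULL,
                          filter_double_less = NULL,
                          filter_double_greater = NULL) {
  keep <- rep(TRUE, nrow(df))
  for (col in names(filter_string_equal))
    keep <- keep & as.character(df[[col]]) == filter_string_equal[[col]]
  for (col in names(filter_string_not_equal))
    keep <- keep & as.character(df[[col]]) != filter_string_not_equal[[col]]
  for (col in names(filter_double_less))
    keep <- keep & as.numeric(df[[col]]) < as.numeric(filter_double_less[[col]])
  for (col in names(filter_double_greater))
    keep <- keep & as.numeric(df[[col]]) > as.numeric(filter_double_greater[[col]])
  df[keep, , drop = FALSE]
}

# Toy rows, invented for illustration only.
toy <- data.frame(
  PG.ProteinGroups = c("P1", "P1", "P2", "P2"),
  F.ExcludedFromQuantification = c("False", "True", "False", "False"),
  PG.Qvalue = c(0.001, 0.001, 0.02, 0.005),
  EG.Qvalue = c(0.002, 0.002, 0.001, 0.003),
  stringsAsFactors = FALSE
)

# Same filter values as the defaults in the call signature.
kept <- apply_filters(
  toy,
  filter_string_equal = c("F.ExcludedFromQuantification" = "False"),
  filter_double_less = c("PG.Qvalue" = "0.01", "EG.Qvalue" = "0.01")
)
print(kept)
```

In this toy table the second row is dropped by the string filter and the third by the PG.Qvalue threshold, so two rows remain.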
Thang V. Pham
When entries in the intensity column contain multiple values, this function replicates the entries in the other columns, and the secondary_id is appended with the corresponding entries in intensity_col_id when it is provided. Otherwise, integer values 1, 2, 3, etc. are used.
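The expansion of multi-value intensity entries can be sketched in base R as follows. The helper expand_multi_intensity is hypothetical and is not the package's implementation. It assumes a single secondary id column, assumes the id column uses the same separator as the intensity column, and joins the suffix with an underscore, which is an arbitrary choice made for this illustration.

```r
# Hypothetical helper (not part of iq): expand rows whose intensity entry
# holds several separated values into one row per value.
expand_multi_intensity <- function(df, intensity_col, secondary_id,
                                   intensity_col_sep, intensity_col_id = NULL) {
  values <- strsplit(as.character(df[[intensity_col]]),
                     intensity_col_sep, fixed = TRUE)
  n <- lengths(values)
  # Replicate the entries of all other columns once per intensity value.
  out <- df[rep(seq_len(nrow(df)), times = n), , drop = FALSE]
  out[[intensity_col]] <- as.numeric(unlist(values))
  suffix <- if (is.null(intensity_col_id)) {
    as.character(sequence(n))  # 1, 2, 3, ... within each original row
  } else {
    unlist(strsplit(as.character(df[[intensity_col_id]]),
                    intensity_col_sep, fixed = TRUE))
  }
  # The "_" joiner is an assumption made for this sketch.
  out[[secondary_id]] <- paste(out[[secondary_id]], suffix, sep = "_")
  rownames(out) <- NULL
  out
}

# Toy table, invented for illustration only.
toy <- data.frame(
  protein = c("P1", "P2"),
  ion = c("PEPTIDEK2", "ELVISK3"),
  sample = c("S1", "S1"),
  intensity = c("10;20;30", "5;7"),
  frag = c("y3;y4;y5", "b2;y6"),
  stringsAsFactors = FALSE
)

by_index <- expand_multi_intensity(toy, "intensity", "ion", ";")
by_id <- expand_multi_intensity(toy, "intensity", "ion", ";",
                                intensity_col_id = "frag")
print(by_index[, c("protein", "ion", "intensity")])
print(by_id[, c("protein", "ion", "intensity")])
```

Without an id column the ions are numbered within each original row; with one, the ion names carry the corresponding id entries instead.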
Pham TV, Henneman AA, Jimenez CR. iq: an R package to estimate relative protein abundances from ion quantification in DIA-MS-based proteomics. Bioinformatics 2020 Apr 15;36(8):2611-2613.
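A minimal usage sketch, assuming the iq package is installed. The toy table below is invented for illustration and contains every column that the default arguments in the call signature refer to. The call is guarded so that the script still runs when iq is not available, and the result is only inspected with str(), without assuming anything about its contents.

```r
# Toy long-format table with all columns named by the default arguments.
# The values are invented and carry no biological meaning.
toy <- data.frame(
  R.Condition = rep(c("A", "B"), each = 2),
  PG.ProteinGroups = "P1",
  PG.Genes = "GENE1",
  PG.ProteinNames = "PROT1_HUMAN",
  PG.Qvalue = 0.001,
  EG.Qvalue = 0.001,
  EG.ModifiedSequence = "_PEPTIDEK_",
  FG.Charge = 2,
  F.FrgIon = rep(c("y3", "y4"), times = 2),
  F.Charge = 1,
  F.PeakArea = c(1000, 2000, 1500, 2500),
  F.ExcludedFromQuantification = "False",
  stringsAsFactors = FALSE
)

# Write the table as a tab-separated text file.
path <- tempfile(fileext = ".txt")
write.table(toy, path, sep = "\t", quote = FALSE, row.names = FALSE)

if (requireNamespace("iq", quietly = TRUE)) {
  raw <- iq::fast_read(path)  # all other arguments left at their defaults
  str(raw)                    # inspect the returned list
}
```

For exports with different column names, the corresponding arguments (sample_id, primary_id, secondary_id, intensity_col, annotation_col, and the filters) would be set to match the file.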