Collapses rows of a table that have identical values in a given column. When the values in each row are proportional, such as the intensities of multiple fragments of a protein, the MaxLFQ algorithm is recommended.
process_wide_format(input_filename,
output_filename,
id_column,
quant_columns,
data_in_log_space = FALSE,
annotation_columns = NULL,
method = "maxLFQ")
The result table is written to output_filename. A NULL value is returned.
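A minimal usage sketch follows. It assumes the iq package is installed and that id_column and quant_columns accept column names, which this page does not state explicitly. The file contents and the column names protein, s1 and s2 are made up for illustration.

```r
# Hypothetical toy input: two fragments per protein, two samples (raw intensities).
input_file <- tempfile(fileext = ".tsv")
output_file <- tempfile(fileext = ".tsv")
toy <- data.frame(protein = c("P1", "P1", "P2", "P2"),
                  s1 = c(100, 200, 50, 400),
                  s2 = c(200, 400, 100, 800))
write.table(toy, input_file, sep = "\t", quote = FALSE, row.names = FALSE)

# Run only when iq is available; passing column names is an assumption here.
if (requireNamespace("iq", quietly = TRUE)) {
  iq::process_wide_format(input_file, output_file,
                          id_column = "protein",
                          quant_columns = c("s1", "s2"),
                          data_in_log_space = FALSE,
                          method = "maxLFQ")
  # The function returns NULL, so the merged table is read back from disk.
  merged <- read.delim(output_file)
  print(merged)
}
```

Because the return value is NULL, any downstream step has to read output_filename rather than capture the result of the call.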
input_filename: Input filename of a tab-separated value text file.
output_filename: Output filename.
id_column: The column where unique values will be kept. Rows with identical values in this column are merged. Rows with empty values here are removed.
quant_columns: Columns containing numerical data to be merged.
data_in_log_space: A logical value. If FALSE, the numerical data will be log2-transformed.
annotation_columns: Columns in the input file apart from id_column and quant_columns that will be kept in the output.
method: Method for merging. Default value is "maxLFQ". Possible values are "maxLFQ", "maxLFQ_R", "median_polish", "top3", "top5", "meanInt", "maxInt", "sum", "least_na" and any function for collapsing a numerical matrix to a row vector.
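The effect of id_column, quant_columns and data_in_log_space can be pictured with a few lines of base R. This is an illustration of the behaviour described in the arguments, not the package's internal code, and the data and column names are hypothetical.

```r
# Illustration only, not iq's internal code. Column names are hypothetical.
toy <- data.frame(protein = c("P1", "P1", "", "P2"),
                  s1 = c(4, 8, 16, 32),
                  s2 = c(2, NA, 16, 64),
                  stringsAsFactors = FALSE)
id_column <- "protein"
quant_columns <- c("s1", "s2")
data_in_log_space <- FALSE

# Rows with an empty id are removed.
kept <- toy[!is.na(toy[[id_column]]) & toy[[id_column]] != "", ]

# Raw intensities are log2-transformed when data_in_log_space is FALSE.
quant <- as.matrix(kept[, quant_columns])
if (!data_in_log_space) quant <- log2(quant)

# Rows sharing an id form one group, which is later collapsed to a single row.
groups <- split(as.data.frame(quant), kept[[id_column]])
```

Each element of groups is the small matrix of log2 values that the chosen method then reduces to one row per id.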
Thang V. Pham
Method "maxLFQ_R" implements the MaxLFQ algorithm in pure R. It is slower than "maxLFQ".
Method "maxInt" selects the row with maximum intensity (top 1).
Method "sum" sums all intensities.
Method "least_na" selects the row with the fewest missing values.
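The row-selection methods can be sketched in base R on the log2 matrix of a single protein. This is a hedged illustration rather than the package's implementation: the tie-breaking rule and the way "maximum intensity" is ranked across samples are assumptions of this sketch, and the matrix is invented.

```r
# Log2 intensities for one protein: rows are fragments, columns are samples.
x <- rbind(f1 = c(10, NA, NA),
           f2 = c(8, 9, 8.5),
           f3 = c(12, 11, NA))

# "least_na"-style choice: the row with the fewest missing values.
# Tie-breaking (first row wins) is an assumption of this sketch.
least_na_row <- x[which.min(rowSums(is.na(x))), ]

# "maxInt"-style choice, ASSUMING rows are ranked by total intensity
# in the original space with missing values ignored.
max_int_row <- x[which.max(rowSums(2^x, na.rm = TRUE)), ]
```

In this toy matrix the two rules pick different fragments (f2 and f3), which shows why the choice of method matters when missing values are common.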
The value of method can be a function such as function(x) log2(colSums(2^x, na.rm = TRUE)) for summing all intensities in the original space.
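To see what such a function does, it can be applied by hand to a small log2 matrix. The matrix below is a made-up example, and the name sum_original_space is introduced only for this sketch.

```r
# The custom collapsing function quoted in the text.
sum_original_space <- function(x) log2(colSums(2^x, na.rm = TRUE))

# Toy log2 matrix for one protein: rows are fragments, columns are samples.
x <- rbind(c(2, 3),
           c(2, NA))
collapsed <- sum_original_space(x)
# Sample 1: log2(4 + 4) = 3; sample 2: log2(8) = 3 (the NA is ignored).
```

The function takes a matrix with one row per merged input row and returns one value per sample, which is the shape a custom method is expected to produce.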
Pham TV, Henneman AA, Jimenez CR. iq: an R package to estimate relative protein abundances from ion quantification in DIA-MS-based proteomics. Bioinformatics 2020 Apr 15;36(8):2611-2613.