ume (version 1.5.2)

calc_norm_int: Calculate Normalized Peak Intensities

Description

Computes normalized peak intensities for a molecular formula dataset and adds the results as additional columns to the input data.table (mfd). It also calculates:

  • the number of molecular formula assignments per peak (n_assignments)

  • the total occurrences of each formula across the dataset (n_occurrence)

Normalized intensities are stored in a new column norm_int, and the reference intensity used for normalization is stored in int_ref.
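The two count columns can be illustrated on a toy table. This is only a sketch of what the description says, not the package's internal code; the formula column name mf and the toy values are assumptions made for illustration, and peak_id is assumed to be unique across the whole dataset.

```r
library(data.table)

# Toy data (hypothetical): peak 1 has two candidate formulas
toy <- data.table(
  file_id = c("A", "A", "A", "B", "B"),
  peak_id = c(1, 1, 2, 3, 4),
  mf      = c("C6H12O6", "C5H8O7", "C7H6O2", "C6H12O6", "C7H6O2")
)

# Number of formula assignments per peak
toy[, n_assignments := .N, by = peak_id]

# Total occurrences of each formula across the dataset
toy[, n_occurrence := .N, by = mf]

print(toy)
```

Both columns are added by reference, so toy keeps its original row order.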

Supported normalization methods:

  • "none" – no normalization; raw peak intensities are copied to norm_int

  • "bp" – normalized to the base peak intensity per spectrum

  • "sum" – normalized by the total sum of intensities per spectrum

  • "sum_ubiq" – normalized by the sum of intensities of ubiquitous peaks across the dataset

  • "sum_rank" – normalized by the sum of the top n_rank most intense peaks per spectrum

  • "euc" – Euclidean normalization; not implemented in the current version and therefore not among the values accepted by normalization (see Usage)
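The reference intensities behind the four implemented methods can be sketched with plain data.table operations. This is a minimal illustration of the definitions listed above, not the implementation in ume: the toy data, the column names mf, bp, sum_norm, rank_norm and ubiq_norm, and the reading of "ubiquitous" as "present in every spectrum" are assumptions, and the package may apply additional scaling.

```r
library(data.table)

# Toy data (hypothetical): two spectra with three peaks each
toy <- data.table(
  file_id     = rep(c("A", "B"), each = 3),
  mf          = c("f1", "f2", "f3", "f1", "f2", "f4"),
  i_magnitude = c(10, 30, 60, 20, 20, 40)
)
n_rank <- 2  # small value chosen for the toy data

# "bp": divide by the most intense peak of each spectrum
toy[, bp := i_magnitude / max(i_magnitude), by = file_id]

# "sum": divide by the summed intensity of each spectrum
toy[, sum_norm := i_magnitude / sum(i_magnitude), by = file_id]

# "sum_rank": divide by the sum of the n_rank most intense peaks per spectrum
toy[, rank_norm := i_magnitude /
      sum(head(sort(i_magnitude, decreasing = TRUE), n_rank)),
    by = file_id]

# "sum_ubiq": divide by the summed intensity of formulas found in
# every spectrum (assumed meaning of "ubiquitous")
n_files <- uniqueN(toy$file_id)
toy[, ubiq := uniqueN(file_id) == n_files, by = mf]
toy[, ubiq_norm := i_magnitude / sum(i_magnitude[ubiq]), by = file_id]

print(toy)
```

Under this reading, "sum_ubiq" and "sum_rank" can yield values above 1 for peaks that are not part of the reference set, whereas "bp" and "sum" are bounded by 1.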

Usage

calc_norm_int(
  mfd,
  ms_id = "file_id",
  peak_id = "peak_id",
  peak_magnitude = "i_magnitude",
  normalization = c("bp", "sum", "sum_ubiq", "sum_rank", "none"),
  n_rank = 200,
  verbose = FALSE,
  ...
)

Value

A data.table identical to mfd but with additional columns:

norm_int

Normalized peak intensity based on the selected method.

int_ref

Reference intensity used for normalization (e.g., sum, base peak).

n_assignments

Number of formula assignments per peak (calculated internally).

n_occurrence

Number of occurrences of each formula across all spectra (calculated internally).

Arguments

mfd

data.table with molecular formula data as derived from ume::assign_formulas. Column names of elements/isotopes must match names in the isotope column of ume::masses; values are integers representing counts per formula.

ms_id

Character; name of the column identifying individual spectra (default: "file_id").

peak_id

Character; name of the column identifying unique peaks (default: "peak_id").

peak_magnitude

Character; name of the column containing peak intensity values (default: "i_magnitude").

normalization

Character; normalization method to apply. One of "bp", "sum", "sum_ubiq", "sum_rank", "none". Default is "bp".

n_rank

Integer; number of top-ranked peaks to use for "sum_rank" normalization (default: 200).

verbose

Logical; if TRUE, show progress messages (default: FALSE).

...

Additional arguments (currently unused).

See Also

Other calculations: calc_data_summary(), calc_dbe(), calc_eval_params(), calc_exact_mass(), calc_ideg(), calc_ma(), calc_neutral_mass(), calc_nm(), calc_number_assignment(), calc_number_occurrence(), calc_recalibrate_ms()

Examples

mfd_norm <- calc_norm_int(
  mfd = mf_data_demo,
  normalization = "sum_ubiq"
)