Internal function used in the simulation of theoretical spectra. It calculates the isotopic distribution of an undeuterated peptide that is required to get an empirical distribution.
get_approx_isotopic_distribution(sequence, min_probability = 0.001)
sequence: character vector of the amino acid sequence of a peptide
min_probability: minimum isotopic probability that will be considered
list of elements: the mass of the peptide (peptide_mass), the final distribution of the isotopes (isotopic_distribution), the number of significant probabilities minus one (max_ND), and the number of exchangeable amino acids (n_exchangeable).
The additional file sysdata.RDA contains the maximal possible occurrence of the isotopes C13, N15, O18, and S34 (carbon, nitrogen, oxygen, and sulfur, respectively) in the respective amino acids, and their masses. Based on that, the maximal possible number of atoms of each isotope in the sequence is calculated. The peptide mass is the sum of the masses of the amino acids plus the mass of H2O, which accounts for the N-terminal group (H) and the C-terminal group (OH).
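The mass calculation described above can be sketched as follows. This is a minimal illustration only: the table residue_mass and the helper peptide_mass_sketch are hypothetical names, and the values are commonly tabulated monoisotopic residue masses for a handful of amino acids, not the contents of sysdata.RDA.

```r
# Illustrative monoisotopic residue masses (Da) for a few amino acids.
# Hypothetical stand-in for the table kept in the package's internal data.
residue_mass <- c(G = 57.02146, A = 71.03711, S = 87.03203, P = 97.05276,
                  V = 99.06841, L = 113.08406, K = 128.09496, M = 131.04049)

# Monoisotopic mass of water: H for the N terminus plus OH for the C terminus.
water_mass <- 18.01056

# Hypothetical helper: sum the residue masses and add one water molecule.
peptide_mass_sketch <- function(sequence) {
  stopifnot(all(sequence %in% names(residue_mass)))
  sum(residue_mass[sequence]) + water_mass
}

sequence <- c("G", "A", "M", "K")
peptide_mass_sketch(sequence)
```

The sequence is passed as a character vector with one amino acid per element, matching the sequence argument of the documented function.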
Next, the distributions of these isotopes are calculated under the assumption that the occurrence of the i-th considered isotope follows a binomial distribution B(n_i, p_i) with parameters n_i (maximal possible occurrence in the sequence) and p_i (natural abundance, i.e., the probability of occurrence in nature). For oxygen, the calculation takes into account that oxygen occurs in a diatomic molecule. The calculation of the sulfur distribution takes into account its rare occurrence.
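The binomial assumption can be illustrated with dbinom from the stats package. The sketch below covers only carbon and nitrogen; the abundances are approximate literature values and the atom counts are invented for the example, so neither is taken from the package. The helper isotope_distribution is a hypothetical name, and the special handling of oxygen and sulfur mentioned above is not reproduced here.

```r
# Approximate natural abundances of the heavy isotopes (assumed values,
# not read from the package's internal data).
abundance <- c(C13 = 0.0107, N15 = 0.00364)

# Hypothetical maximal atom counts for an example peptide.
n_atoms <- c(C13 = 16, N15 = 5)

# Probability of observing 0, 1, ..., n heavy atoms of one isotope,
# under the B(n, p) assumption.
isotope_distribution <- function(n, p) {
  dbinom(0:n, size = n, prob = p)
}

dist_C <- isotope_distribution(n_atoms[["C13"]], abundance[["C13"]])
dist_N <- isotope_distribution(n_atoms[["N15"]], abundance[["N15"]])

# First few probabilities of the carbon distribution.
round(dist_C[1:4], 4)
```

Each vector has n_i + 1 entries, where entry k + 1 is the probability that exactly k atoms of the element are the heavy isotope.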
The final isotopic distribution is computed as a convolution of the obtained distributions with probabilities greater than min_probability. It is a vector of probabilities of possible monoisotopic masses. The number of exchangeable amides is computed as the length of the sequence, reduced by the number of prolines located at the third or further position.
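The last two steps can be sketched as below. The helpers convolve_probs and n_exchangeable_sketch are hypothetical names, the input distributions are toy values, and the sketch applies the min_probability threshold to the convolved result, which is an assumption; the package may apply the threshold at a different stage.

```r
# Direct discrete convolution of two probability vectors.
# Entry k of the result is the probability that the two counts sum to k - 1.
convolve_probs <- function(x, y) {
  out <- numeric(length(x) + length(y) - 1)
  for (i in seq_along(x)) {
    idx <- i:(i + length(y) - 1)
    out[idx] <- out[idx] + x[i] * y
  }
  out
}

# Toy element-wise distributions (illustrative values only).
dist_a <- dbinom(0:3, size = 3, prob = 0.1)
dist_b <- dbinom(0:2, size = 2, prob = 0.05)

full <- convolve_probs(dist_a, dist_b)

# Keep only probabilities above the threshold (assumed to sit in the tail).
min_probability <- 0.001
isotopic_distribution <- full[full > min_probability]
max_ND <- length(isotopic_distribution) - 1

# Exchangeable amides: sequence length minus the prolines found at the
# third or a later position.
n_exchangeable_sketch <- function(sequence) {
  n <- length(sequence)
  if (n < 3) return(n)
  n - sum(sequence[3:n] == "P")
}

n_exchangeable_sketch(c("P", "P", "A", "P", "G"))
```

A plain loop is used instead of stats::convolve to avoid the small floating-point errors that an FFT-based convolution can introduce near the threshold.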