isopattern: Isotope pattern calculation

Description

Calculates the isotopic pattern (fine structure) of a given chemical formula or of a set of chemical formulas (batch calculation), based on three fast and memory-efficient algorithms. The first algorithm can handle very large molecules and combinations of elements with many isotopes. The function returns accurate masses, abundances and isotopic compositions of the individual isotopologues. The isotopes of elements can be defined by the user.

Usage

isopattern(isotopes, chemforms, threshold = 0.001, charge = FALSE,
  emass = 0.00054858, plotit = FALSE, algo = 2, rel_to_mono = FALSE)

Arguments

isotopes

Data frame listing all relevant isotopes, such as the data set isotopes.

chemforms

Vector of character strings with chemical formulas, such as the data set chemforms or the second column of the value returned by check_chemform.

threshold

Abundance below which isotope peaks can be omitted, given as percentage of the most abundant isotope peak of the molecule. Set to 0 if all peaks shall be calculated.

charge

z in m/z. Either a single integer or a vector of integers with length equal to that of argument chemforms. Set to FALSE for omitting any charge calculations.

emass

Electron mass; only relevant if charge is not set to FALSE.

plotit

Should results be plotted, TRUE/FALSE?

algo

Which algorithm to use? Type 1 or 2. See details.

rel_to_mono

Should abundances be normalized relative to the monoisotopic instead of the most abundant peak, TRUE/FALSE?
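
The role of charge and emass can be illustrated with a small sketch. It assumes the conventional conversion, in which each unit of positive charge corresponds to one removed electron and each unit of negative charge to one added electron; the exact convention used internally is not stated on this page. The helper mass_to_mz is hypothetical and not part of the package.

```r
# Hypothetical helper (not part of the package): convert a neutral
# isotopologue mass to m/z for a given charge z.
# Assumption: z > 0 removes z electrons, z < 0 adds |z| electrons.
mass_to_mz <- function(mass, z, emass = 0.00054858) {
  if (identical(z, FALSE)) return(mass)  # no charge calculation requested
  stopifnot(z != 0)
  (mass - z * emass) / abs(z)
}

mass_to_mz(1000, FALSE)  # neutral mass is returned unchanged
mass_to_mz(1000, 1)      # singly charged cation
mass_to_mz(1000, -2)     # doubly charged anion
```

With charge = FALSE the masses are left untouched, which matches the default described above.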

Value

List with length equal to the length of vector chemforms; the names of the list entries are the chemical formulas in chemforms. Each entry in that list contains information on the individual isotopologues (rows) with the following columns:

m/z

First column; m/z of an isotope peak.

abundance

Second column; abundance of an isotope peak. Abundances are set relative to the most abundant peak of the isotope pattern, or to the monoisotopic peak if rel_to_mono is set to TRUE.

12C, 13C, 1H, 2H, ...

Third to all other columns; atom counts of the individual isotopes for an isotope peak.

Warning

Values of threshold that are too low may lead to the unnecessary calculation of low-abundance peaks, to the extent that not enough memory is available for any of the algorithms. This is especially critical if rel_to_mono is set to TRUE.
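
Why rel_to_mono = TRUE makes a given threshold more demanding can be seen from a small sketch. It uses a hypothetical molecule fragment of 20 bromine atoms and an approximate natural abundance for 81Br; both are assumptions chosen for illustration, and the code does not call the package.

```r
# Hypothetical example: 20 bromine atoms; approximate abundance of 81Br assumed.
n_atoms <- 20
p_81Br  <- 0.4931

# Probability of each isotopologue, indexed by its number of 81Br atoms (0..20).
prob <- dbinom(0:n_atoms, size = n_atoms, prob = p_81Br)

threshold <- 1  # percent

# Abundances as percentage of the most abundant peak (default behaviour) ...
pct_of_max  <- 100 * prob / max(prob)
# ... and as percentage of the monoisotopic peak (all atoms 79Br).
pct_of_mono <- 100 * prob / prob[1]

# Number of peaks that survive the same percentage threshold.
kept_max  <- sum(pct_of_max  >= threshold)
kept_mono <- sum(pct_of_mono >= threshold)
c(kept_max = kept_max, kept_mono = kept_mono)
```

Because the monoisotopic peak is far smaller than the most abundant peak here, the same percentage threshold retains more peaks when it is applied relative to the monoisotopic peak.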

Details

Isotope pattern calculation can be done by choosing one of three algorithms, set by argument algo. All algorithms use hierarchical updates to derive the mass and abundance of a new isotopologue from an existing one, by steps of single isotope replacements. The first two algorithms use less memory and are in most cases faster than the third, which allows for the calculation of very large molecules or the inclusion of elements with many isotopes. While comparable in memory allocation, the second algorithm is faster than the first for very small molecules but much slower for larger ones.

The first algorithm (algo=1) uses tree-like combinatorial transitions to calculate daughter isotopologues from their parent node isotopologues, with the monoisotopic composition as root node. This approach first searches for branches of increasing abundance to find the isotopologue of maximum abundance, with transitions ordered so as to minimize the occurrence of decreasing branches. The remaining branches are subsequently omitted if they (a) fall below a threshold relative to this most abundant isotopologue and (b) only contain branches of decreasing abundance. Furthermore, to avoid redundant calculations for transitions of the same isotope (but not the same isotopologues!), this global search is conducted in elementwise subtrees that are then combined.

The second algorithm (algo=2) does not use elementwise subtree searches for the maximum abundance, but is otherwise identical to the first algorithm.

The third algorithm (algo=3) is similar to the one proposed by Li et al. (2010). Here, mass states and abundances are calculated individually within separate blocks for each of the elements present in a molecule, without (!) abundance thresholds. These building blocks are then combined to individual isotopologues, with peaks below the threshold abundance eventually omitted. In the presented version, a fast calculation of elementwise building blocks and their combination to isotopologues is implemented so as to avoid redundant calculations from either different updates or different combinations leading to the same isotopologue.

Note that when rel_to_mono is set to TRUE, the abundance threshold is specified relative to the monoisotopic instead of the most abundant peak.
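
The update by single isotope replacement can be sketched for one element using the ratio of multinomial probabilities between a daughter and its parent isotopologue. This is an illustration under assumptions, not the package's implementation: the helper replace_one is hypothetical, and the chlorine masses and abundances are approximate values.

```r
# Approximate isotope data for chlorine (assumed values for illustration).
mass  <- c("35Cl" = 34.96885, "37Cl" = 36.96590)
abund <- c("35Cl" = 0.7576,   "37Cl" = 0.2424)

# Hypothetical helper: derive a daughter isotopologue from its parent by
# replacing one atom of isotope `from` with one atom of isotope `to`.
replace_one <- function(parent, from, to) {
  n <- parent$counts
  stopifnot(n[[from]] > 0)
  # Ratio of the multinomial probabilities of daughter and parent.
  ratio <- (n[[from]] / (n[[to]] + 1)) * (abund[[to]] / abund[[from]])
  n[[from]] <- n[[from]] - 1
  n[[to]]   <- n[[to]] + 1
  list(counts    = n,
       mass      = parent$mass + mass[[to]] - mass[[from]],
       abundance = parent$abundance * ratio)
}

# Root node: monoisotopic composition of Cl5.
root <- list(counts    = c("35Cl" = 5, "37Cl" = 0),
             mass      = 5 * mass[["35Cl"]],
             abundance = abund[["35Cl"]]^5)

d1 <- replace_one(root, "35Cl", "37Cl")  # composition 35Cl4 37Cl1
d2 <- replace_one(d1,   "35Cl", "37Cl")  # composition 35Cl3 37Cl2

# Cross-check against the directly computed multinomial probability.
direct <- dmultinom(c(3, 2), prob = abund)
all.equal(d2$abundance, direct)
```

Each step needs only the parent's mass, abundance and atom counts, so no factorials have to be evaluated for the daughter. A branch has increasing abundance whenever the ratio is greater than 1.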

References

Loos, M. & Gerber, C., 201X. Tree-like hierarchy for the calculation of very large isotope patterns. To be submitted.

Li, L., Karabacak, N., Cobb, J., Wang, Q., Hong, P., Agar, J., 2010. Memory-efficient calculation of the isotopic mass states of a molecule. Rapid Communications in Mass Spectrometry, 24: 2689-2696.

Examples

############################
# batch of chemforms #######
data(isotopes)
data(chemforms)
pattern<-isopattern(
  isotopes,
  chemforms,
  threshold=0.1,
  plotit=TRUE,
  charge=FALSE,
  emass=0.00054858,
  algo=2
)
############################
# Single chemical formula ##
data(isotopes) 
pattern<-isopattern(
  isotopes,
  "C100H200S2Cl5",
  threshold=0.1,
  plotit=TRUE,
  charge=FALSE,
  emass=0.00054858,
  algo=2
)
############################
