The files in the subdirectories of extdata
provide additional thermodynamic data and other data to support the examples in the package documentation and vignettes.
See thermo
for a description of the files in extdata/OBIGT
, which are used to generate the thermodynamic database.
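As a minimal sketch of how these subdirectories might be located in an installed copy of the package: base R's system.file() returns the full path of a file or directory inside a package, or an empty string if it is not found. The variable name extdata is only illustrative.

```r
# Locate the extdata directory of an installed CHNOSZ package.
# system.file() returns "" when the package or path cannot be found.
extdata <- system.file("extdata", package = "CHNOSZ")

if (nzchar(extdata)) {
  # List the immediate subdirectories (Berman, bison, cpetc, ...)
  print(list.dirs(extdata, full.names = FALSE, recursive = FALSE))
} else {
  message("CHNOSZ does not appear to be installed")
}
```

The same call with a longer relative path, such as "extdata/cpetc/PM90.csv", gives the location of an individual file.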
Files in Berman
contain thermodynamic data for minerals using the Berman formulation:
Ber88_1988.csv
contains thermodynamic data for minerals taken from Berman (1988).
Other files with names like xxx_yyyy.csv
contain thermodynamic data from other sources; xxx in the filename corresponds to the reference in thermo$OBIGT
and yyyy gives the year of publication.
berman
uses these data for the calculation of thermodynamic properties at specified T and P, which are then available for use in subcrt
.
If there are any duplicated mineral names in the files, only the most recent data are used, as determined by the year in the file name.
Following conventions used in other data files, the names of sanidine and microcline were changed to K-feldspar,high and K-feldspar,low.
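The rule for duplicated mineral names can be illustrated with a short base R sketch. It is not the code used by berman; the file name XYZ_2005.csv, the column names and the numeric values are all made-up placeholders, and the only thing taken from the text above is the xxx_yyyy.csv naming pattern and the "most recent year wins" rule.

```r
# Hypothetical file names following the xxx_yyyy.csv pattern
files <- c("Ber88_1988.csv", "XYZ_2005.csv")

# Hypothetical contents, one data frame per file (placeholder values, not real data)
dat <- list(
  data.frame(name = c("quartz", "albite"), value = c(-1, -2), stringsAsFactors = FALSE),
  data.frame(name = c("albite", "jadeite"), value = c(-20, -30), stringsAsFactors = FALSE)
)

# Extract the four-digit year from each file name
year <- as.integer(sub("^.*_([0-9]{4})\\.csv$", "\\1", files))

# Stack the tables with the most recent file first
ord <- order(year, decreasing = TRUE)
stacked <- do.call(rbind, Map(function(d, y) cbind(d, year = y), dat[ord], year[ord]))

# Keep the first occurrence of each name, which is the most recent entry
latest <- stacked[!duplicated(stacked$name), ]
print(latest)
```

Here the albite entry from the 2005 file replaces the one from the 1988 file, while quartz and jadeite are kept unchanged.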
sympy.R
is an R script that uses rSymPy to symbolically integrate Berman's equations for heat capacity and volume to write expressions for enthalpy, entropy and Gibbs energy.
The testing
directory contains data files based on Berman and Aranovich (1996). These are used to demonstrate the addition of data from a user-supplied file (see berman
).
Files in bison
contain BLAST results and taxonomic information for an environmental metagenome from the Bison Pool hot spring in Yellowstone National Park:
bisonN_vs_refseq57.blast.xz
, bisonS...
, bisonR...
, bisonQ...
, bisonP...
are partial tabular BLAST results for proteins in the Bison Pool Environmental Genome. Protein sequences predicted in the metagenome were downloaded from the Joint Genome Institute's IMG/M system on 2009-05-13. The target database for the searches was constructed from microbial protein sequences in the National Center for Biotechnology Information (NCBI) RefSeq database version 57, representing 7415 microbial genomes. The ‘blastall’ command was used with the default setting for E value cutoff (10.0) and options to make a tabular output file consisting of the top 20 hits for each query sequence. The function read.blast
was used to extract only those hits with E values less than or equal to 1e-5 and with sequence similarity (percent identity) at least 30 percent, and to keep only the first hit for each query sequence. The function write.blast
was used to save partial BLAST files (only selected columns). The files provided with CHNOSZ contain the first 1000 hits for each sampling site at Bison Pool, representing between about 1.5 and 3 percent of the first BLAST hits after similarity and E value filtering.
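The filtering described above (E value at most 1e-5, percent identity at least 30, then only the first hit per query) can be sketched in base R as follows. This is an illustration of the criteria, not the implementation of read.blast; the data frame, its column names and its values are made up, and applying the thresholds before selecting the first hit is an assumption.

```r
# Made-up tabular BLAST hits (not real data); column names are assumptions
hits <- data.frame(
  query    = c("q1", "q1", "q2", "q2", "q3"),
  subject  = c("s1", "s2", "s3", "s4", "s5"),
  identity = c(25, 45, 80, 90, 35),
  evalue   = c(1e-10, 1e-8, 1e-3, 1e-20, 1e-6),
  stringsAsFactors = FALSE
)

# Keep hits that pass both the E value and the percent identity thresholds
keep <- hits$evalue <= 1e-5 & hits$identity >= 30
filtered <- hits[keep, ]

# Keep only the first remaining hit for each query sequence
first <- filtered[!duplicated(filtered$query), ]
print(first)
```

In this toy table, the first hit for q1 fails the identity threshold and the first hit for q2 fails the E value threshold, so the second hit is retained for each.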
gi.taxid.txt.xz
is a table that lists the sequence identifiers (gi numbers) that appear in the example BLAST files (see above), together with the corresponding taxon ids used in the NCBI databases.
taxid_names.csv.xz
A table of scientific names for the taxids in gi.taxid.txt.xz
.
See id.blast
for examples that use these files.
Files in cpetc
contain experimental and calculated thermodynamic and environmental data:
PM90.csv
Heat capacities of four unfolded aqueous proteins taken from Privalov and Makhatadze, 1990. Temperature in °C is in the first column, and heat capacities of the proteins in J mol\(^{-1}\) K\(^{-1}\) in the remaining columns. See ionize.aa
and the vignette for examples that use this file.
RH95.csv
Heat capacity data for iron taken from Robie and Hemingway, 1995. Temperature in Kelvin is in the first column, heat capacity in J K\(^{-1}\) mol\(^{-1}\) in the second. See subcrt
for an example that uses this file.
SOJSH.csv
Experimental equilibrium constants for the reaction NaCl(aq) = Na+ + Cl- as a function of temperature and pressure taken from Fig. 1 of Shock et al., 1992. See demo("NaCl")
for an example that uses this file.
Cp.CH4.HW97.csv
, V.CH4.HWM96.csv
Apparent molar heat capacities and volumes of CH4 in dilute aqueous solutions reported by Hnědkovský and Wood, 1997 and Hnědkovský et al., 1996. See EOSregress
and the vignette for examples that use these files.
SC10_Rainbow.csv
Values of temperature (°C), pH and logarithms of activity of CO2, H2, NH4+, H2S and CH4 for mixing of seawater and hydrothermal fluid at Rainbow field (Mid-Atlantic Ridge), taken from Shock and Canovas, 2010. See the vignette for an example that uses this file.
SS98_Fig5a.csv
, SS98_Fig5b.csv
Values of logarithm of fugacity of O2 and pH as a function of temperature for mixing of seawater and hydrothermal fluid, digitized from Figs. 5a and b of Shock and Schulte, 1998. See the vignette for an example that uses these files.
rubisco.csv
UniProt IDs for Rubisco, ranges of optimal growth temperature of organisms, domain and name of organisms, and URL of reference for growth temperature, from Dick, 2014. See the vignette for an example that uses this file.
bluered.txt
Blue - light grey - red color palette, computed using colorspace::diverge_hcl(1000,
c = 100, l = c(50, 90), power = 1)
. This is used by ZC.col
.
AD03_Fig1?.csv
Experimental data points digitized from Figure 1 of Akinfiev and Diamond, 2003, used in demo("AkDi")
.
TKSS14_Fig2.csv
Experimental data points digitized from Figure 2 of Tutolo et al., 2014, used in demo("aluminum")
.
Mer75_Table4.csv
Values of log(aK+/aH+) and log(aNa+/aH+) from Table 4 of Merino, 1975, used in demo("aluminum")
.
Files in protein
contain protein sequences and amino acid compositions for proteins.
EF-Tu.aln
consists of aligned sequences (394 amino acids) of elongation factor Tu (EF-Tu). The sequences correspond to those taken from UniProtKB for ECOLI (Escherichia coli), THETH (Thermus thermophilus) and THEMA (Thermotoga maritima), and reconstructed ancestral sequences taken from Gaucher et al., 2003 (maximum likelihood bacterial stem and mesophilic bacterial stem, and alternative bacterial stem). See read.fasta
for an example that uses this file.
rubisco.fasta
Sequences of Rubisco obtained from UniProt (see Dick, 2014). See the vignette for an example that uses this file.
POLG.csv
Amino acid compositions of a few proteins used for some tests and examples.
These are various subunits of the Poliovirus type 1 polyprotein (POLG_POL1M in UniProt).
Files in taxonomy
contain taxonomic data files:
names.dmp
and nodes.dmp
are excerpts of the taxonomy files available on the NCBI ftp site (ftp://ftp.ncbi.nih.gov/pub/taxonomy/taxdump.tar.gz, accessed 2010-02-15). These files contain only the entries for Escherichia coli K-12, Saccharomyces cerevisiae, Homo sapiens, Pyrococcus furiosus and Methanocaldococcus jannaschii (taxids 83333, 4932, 9606, 186497, 243232) and the higher-ranking nodes (genus, family, etc.) in the respective lineages. See taxonomy
for examples that use these files.
Files in adds
contain additional thermodynamic data and group additivity definitions:
BZA10.csv
contains supplementary thermodynamic data taken from Bazarkina et al. (2010). The data can be added to the database in the current session using add.OBIGT
. See add.OBIGT
for an example that uses this file.
OBIGT_check.csv
contains the results of running check.OBIGT
to check the internal consistency of entries in the default and optional datafiles.
RH98_Table15.csv
Group stoichiometries for high molecular weight crystalline and liquid organic compounds taken from Table 15 of Richard and Helgeson, 1998. The first three columns have the compound
name, formula
and physical state
(cr or liq). The remaining columns have the numbers of each group in the compound; the names of the groups (columns) correspond to species in thermo$OBIGT
. The compound named 5a(H),14a(H)-cholestane in the paper has been changed to 5a(H),14b(H)-cholestane here to match the group stoichiometry given in the table. See RH2OBIGT
for a function that uses this file.
SK95.csv
contains thermodynamic data for alanate, glycinate, and their complexes with metals, taken from Shock and Koretsky (1995) as corrected in slop98.dat. The data are used in the package tests (test-recalculate.R
) to check the recalculated values of G, H, and S in thermo()$OBIGT
using properties for alanate and glycinate from Amend and Helgeson (1997).
LA19_test.csv
contains thermodynamic data for dimethylamine and trimethylamine from LaRowe and Amend (2019) in energy units of both J and cal. This file is used in the package tests (test-util.data.R
) to check the messages produced by checkGHS
and checkEOS
.
Akinfiev, N. N. and Diamond, L. W. (2003) Thermodynamic description of aqueous nonelectrolytes at infinite dilution over a wide range of state parameters. Geochim. Cosmochim. Acta 67, 613--629. 10.1016/S0016-7037(02)01141-9
Amend, J. P. and Helgeson, H. C. (1997) Calculation of the standard molal thermodynamic properties of aqueous biomolecules at elevated temperatures and pressures. Part 1. L--amino acids. J. Chem. Soc., Faraday Trans. 93, 1927--1941. 10.1039/A608126F
Bazarkina, E. F., Zotov, A. V. and Akinfiev, N. N. (2010) Pressure-dependent stability of cadmium chloride complexes: Potentiometric measurements at 1–1000 bar and 25°C. Geol. Ore Deposits 52, 167--178. 10.1134/S1075701510020054
Berman, R. G. (1988) Internally-consistent thermodynamic data for minerals in the system Na2O-K2O-CaO-MgO-FeO-Fe2O3-Al2O3-SiO2-TiO2-H2O-CO2. J. Petrol. 29, 445--522. 10.1093/petrology/29.2.445
Berman, R. G. and Aranovich, L. Ya. (1996) Optimized standard state and solution properties of minerals. I. Model calibration for olivine, orthopyroxene, cordierite, garnet, and ilmenite in the system FeO-MgO-CaO-Al2O3-TiO2-SiO2. Contrib. Mineral. Petrol. 126, 1--24. 10.1007/s004100050233
Dick, J. M. (2014) Average oxidation state of carbon in proteins. J. R. Soc. Interface 11, 20131095. 10.1098/rsif.2013.1095
Gattiker, A., Michoud, K., Rivoire, C., Auchincloss, A. H., Coudert, E., Lima, T., Kersey, P., Pagni, M., Sigrist, C. J. A., Lachaize, C., Veuthey, A.-L., Gasteiger, E. and Bairoch, A. (2003) Automatic annotation of microbial proteomes in Swiss-Prot. Comput. Biol. Chem. 27, 49--58. 10.1016/S1476-9271(02)00094-4
Gaucher, E. A., Thomson, J. M., Burgan, M. F. and Benner, S. A. (2003) Inferring the palaeoenvironment of ancient bacteria on the basis of resurrected proteins. Nature 425(6955), 285--288. 10.1038/nature01977
Hnědkovský, L., Wood, R. H. and Majer, V. (1996) Volumes of aqueous solutions of CH4, CO2, H2S, and NH3 at temperatures from 298.15 K to 705 K and pressures to 35 MPa. J. Chem. Thermodyn. 28, 125--142. 10.1006/jcht.1996.0011
Hnědkovský, L. and Wood, R. H. (1997) Apparent molar heat capacities of aqueous solutions of CH4, CO2, H2S, and NH3 at temperatures from 304 K to 704 K at a pressure of 28 MPa. J. Chem. Thermodyn. 29, 731--747. 10.1006/jcht.1997.0192
Joint Genome Institute (2007) Bison Pool Environmental Genome. Protein sequence files downloaded from IMG/M (https://img.jgi.doe.gov/)
LaRowe, D. E. and Amend, J. P. (2019) The energetics of fermentation in natural settings. Geomicrobiol. J. 36, 492--505. 10.1080/01490451.2019.1573278
Merino, E. (1975) Diagenesis in Tertiary sandstones from Kettleman North Dome, California. II. Interstitial solutions: distribution of aqueous species at 100°C and chemical relation to diagenetic mineralogy. Geochim. Cosmochim. Acta 39, 1629--1645. 10.1016/0016-7037(75)90085-X
Privalov, P. L. and Makhatadze, G. I. (1990) Heat capacity of proteins. II. Partial molar heat capacity of the unfolded polypeptide chain of proteins: Protein unfolding effects. J. Mol. Biol. 213, 385--391. 10.1016/S0022-2836(05)80198-6
Richard, L. and Helgeson, H. C. (1998) Calculation of the thermodynamic properties at elevated temperatures and pressures of saturated and aromatic high molecular weight solid and liquid hydrocarbons in kerogen, bitumen, petroleum, and other organic matter of biogeochemical interest. Geochim. Cosmochim. Acta 62, 3591--3636. 10.1016/S0016-7037(97)00345-1
Robie, R. A. and Hemingway, B. S. (1995) Thermodynamic Properties of Minerals and Related Substances at 298.15 K and 1 Bar (\(10^5\) Pascals) Pressure and at Higher Temperatures. U. S. Geol. Surv., Bull. 2131, 461 p. https://www.worldcat.org/oclc/32590140
Shock, E. and Canovas, P. (2010) The potential for abiotic organic synthesis and biosynthesis at seafloor hydrothermal systems. Geofluids 10, 161--192. 10.1111/j.1468-8123.2010.00277.x
Shock, E. L. and Koretsky, C. M. (1995) Metal-organic complexes in geochemical processes: Estimation of standard partial molal thermodynamic properties of aqueous complexes between metal cations and monovalent organic acid ligands at high pressures and temperatures. Geochim. Cosmochim. Acta 59, 1497--1532. 10.1016/0016-7037(95)00058-8
Shock, E. L., Oelkers, E. H., Johnson, J. W., Sverjensky, D. A. and Helgeson, H. C. (1992) Calculation of the thermodynamic properties of aqueous species at high pressures and temperatures: Effective electrostatic radii, dissociation constants and standard partial molal properties to 1000 °C and 5 kbar. J. Chem. Soc. Faraday Trans. 88, 803--826. 10.1039/FT9928800803
Shock, E. L. and Schulte, M. D. (1998) Organic synthesis during fluid mixing in hydrothermal systems. J. Geophys. Res. 103, 28513--28527. 10.1029/98JE02142
Tutolo, B. M., Kong, X.-Z., Seyfried, W. E., Jr. and Saar, M. O. (2014) Internal consistency in aqueous geochemical data revisited: Applications to the aluminum system. Geochim. Cosmochim. Acta 133, 216--234. 10.1016/j.gca.2014.02.036