
vanddraabe: Identification and Statistical Analysis of Conserved Waters Near Proteins

vanddraabe identifies and analyzes conserved waters within crystallographic protein structures and molecular dynamics simulation trajectories. It returns statistical parameters for each water cluster, informative graphs, and a PyMOL session file for visually exploring the conserved waters and protein. Hydrophilicity is the propensity of waters to congregate near specific protein atoms and is related to conserved waters. An informatics-derived set of hydrophilicity values is provided, based on a large, high-quality X-ray protein structure dataset.
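The idea of a conserved water can be illustrated with a minimal sketch in base R. This is not vanddraabe's implementation: it only assumes that water oxygens from several already-aligned structures are pooled, grouped by complete-linkage hierarchical clustering, and scored by the percentage of structures contributing to each cluster. The coordinates, the structure labels, and the 2.4 Angstrom cutoff are hypothetical values chosen for illustration.

```r
# Hypothetical water-oxygen coordinates (Angstroms) pooled from three
# aligned structures labelled A, B, and C; all values are invented.
waters <- data.frame(
  pdb = c("A", "B", "C", "A", "B", "C"),
  x   = c(10.0, 10.3,  9.8, 25.0, 25.4, 40.0),
  y   = c( 5.0,  5.2,  4.9, 12.0, 12.1, 30.0),
  z   = c( 1.0,  1.1,  0.8,  7.0,  7.3, 15.0)
)

# Complete-linkage hierarchical clustering on pairwise Euclidean distances.
tree <- hclust(dist(waters[, c("x", "y", "z")]), method = "complete")

# Cut the tree at an assumed cutoff so that no two waters in the same
# cluster are farther apart than the cutoff.
cutoff <- 2.4
waters$cluster <- cutree(tree, h = cutoff)

# Conservation: percentage of structures that contribute a water to
# each cluster.
n.structures  <- length(unique(waters$pdb))
n.per.cluster <- tapply(waters$pdb, waters$cluster,
                        function(ids) length(unique(ids)))
conservation  <- 100 * n.per.cluster / n.structures

print(data.frame(cluster = names(conservation),
                 conservation = as.numeric(conservation)))
```

In this toy data set the first cluster contains a water from every structure, so it would be the strongest candidate for a conserved water site.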

This package is a reimplementation and expansion of the WatCH [1] and PyWATER [2] applications and was created to provide the following abilities:

  • Perform conserved water analysis on RCSB files and molecular dynamics simulations
  • Provide access to data at each step of the analysis
  • Provide detailed statistical summaries for all waters being analyzed
  • Analyze more than 65,000 waters
  • Create preformatted analysis plots
  • Create PyMOL session scripts to visualize the conserved waters
  • Write out an Excel workbook of initial, intermediate, and final results
  • Perform protein alignment using bio3d (CRAN, website, and BitBucket), an open-source application

  1. Paul C. Sanschagrin and Leslie A. Kuhn. Cluster analysis of consensus water sites in thrombin and trypsin shows conservation between serine proteases and contributions to ligand specificity. Protein Science, 1998, 7 (10), pp 2054-2064.
     DOI: 10.1002/pro.5560071002
     PMID: 9792092
     WatCH webpage

  2. Hitesh Patel, Bjorn A. Gruning, Stefan Gunther, and Irmgard Merfort. PyWATER: a PyMOL plug-in to find conserved water molecules in proteins by clustering. Bioinformatics, 2014, 30 (20), pp 2978-2980.
     DOI: 10.1093/bioinformatics/btu424
     PMID: 24990608
     PyWATER on GitHub

Installing vanddraabe

vanddraabe is available on GitHub and on CRAN. To install it:

# The easiest way to get vanddraabe is:
install.packages("vanddraabe")

# Or get the development version from GitHub:
# install.packages("devtools")
devtools::install_github("exeResearch/vanddraabe")

How to use vanddraabe

The package vignette is a detailed example of using vanddraabe to identify the conserved waters of ten thrombin structures.

Have a suggestion? Need help? Found a bug?

Code of conduct

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

Version: 1.1.1
License: MIT + file LICENSE
Maintainer: Emilio Xavier Esposito
Last Published: June 7th, 2019
Monthly Downloads: 11

Functions in vanddraabe (1.1.1)

  • BoundWaterEnvironment.quality: Bound Water Environment (atomic quality)
  • ConservedWaters: Conserved Crystallographic Waters
  • ConservedWaterStats: Conserved Water Statistics
  • ConservedWaters.MDS: Conserved Molecular Dynamics Simulation Waters
  • ConservationSet: Conservation Set
  • DetermineChainsOfInterest: Determine Chains Of Interest
  • CreatePyMOLscript: Create PyMOL Script File
  • ConservationPlot: Conservation Plot (Number of Waters Per Cluster Histogram)
  • ClusterWaters: Cluster Conserved Waters
  • HydrophilicityEvaluation: Hydrophilicity Evaluation
  • HydrophilicityTable: Residue Atom Type Hydrophilicity Values
  • Mobility: Water Molecule Mobility
  • MobNormBvalEvalPlots: Mobility and Normalized B-values Evaluation Plots
  • ExtractFileTimeStamp: Extract Filename Time Stamp
  • OccupancyBarplot: Occupancy Barplots
  • ExtractPDBids: Extract PDB IDs
  • OccupancyBarplot.summ: Occupancy Summary Barplots
  • MobilityBarplot.summ: Mobility Summary Barplots
  • MobilityBarplot: Mobility Barplots
  • ClusterSummaryPlots: Cluster Summary Plots
  • FreeSASAcheck: FreeSASA Check
  • HasXWaters: Has "X" Waters
  • ClusterWaters.MDS: Cluster Conserved Waters (MDS)
  • StandardizeGlutamicAcidNames: Standardize Glutamic Acid Names
  • ProtHetWatIndices: Protein, HET, and Water Atom Indices
  • TimeSpan: Time Span
  • StandardizeCysteineNames: Standardize Cysteine Names
  • RemoveHydrogenAtoms: Remove Hydrogen and Deuterium Atoms
  • ReturnPDBfullPath: Return PDB Full Path
  • getResTypeCounts: Get ResType Counts
  • names.sidechain.atoms: Sidechain Atom Names
  • names.residues: Residue Names
  • getRCSBdata: Clean RCSB Dataset
  • calcNearbyHydrationFraction: Calculate Nearby Atom Hydration Fraction
  • Nearby: Nearby
  • RescaleValues: Rescale Values
  • RetainChainsOfInterest: Retain Chains Of Interest
  • RemoveOoR.o: Remove Occupancy Out of Range Atoms
  • NormalizedBvalue: B-value Normalization
  • UniqueAtomHashes: Create Unique Atom Hashes
  • RetainWatersWithinX: Retain Waters Within X Angstroms of Protein
  • StandardizeHistidineNames: Standardize Histidine Names
  • getProtAtomsNearWater: Number of Solvent Accessible/Exposed Protein Atoms Near a Water
  • getAtomTypeCounts: Get AtomType Counts
  • names.waters: Water Residue Names
  • StandardizeLysineNames: Standardize Lysine Names
  • normBvalueBarplot.summ: Normalized B-value Summary Barplots
  • StandardizeAsparticAcidNames: Standardize Aspartic Acid Names
  • FileTimeStamp: Filename Time Stamp
  • FreeSASA.diff: Atomic SASA difference of hydrated PDB via FreeSASA
  • oxInitWaterDataSheet: Initial Water Data Sheet
  • aaStandardizeNames: Standardize Amino Acid Names
  • calcNumHydrogenBonds: Calculate Number of Hydrogen Bonds
  • PDB.1ecd: PDB Structure of Erythrocruorin
  • oxPDBcleanedSummarySheet: Cleaned PDB Structures Data Sheet
  • nBvalueBarplot: Normalized B-value Barplots
  • getResidueData: Number of Residues and Solvent Accessible/Exposed Residues
  • names.backbone.atoms: Backbone Atom Names
  • calcAtomClassHydrophilicity: Atom Class Hydration Fraction
  • names.polar.atoms: Polar Atom Names
  • PDB.5rxn: PDB Structure of Rubredoxin
  • names.res.AtomTypes: Residue and AtomType Names
  • colorPalettes: Color Values for Plots
  • check.cluster.method: Check Clustering Method
  • names.resATs.carb.sulf: Carbon and Sulfur Residue-AtomType Names
  • res2xyz: Residue Indices to Coordinate Indices
  • RemoveOoR.b: Remove B-value Out of Range Atoms
  • RemoveModeledAtoms: Remove Modeled Atoms
  • calcBvalue: Calculate B-value
  • oxWaterOccurrenceSheet: openxlsx Water Occurrence Summary
  • calcAtomHydrationEstimate: Estimated Atomic Hydration Fraction
  • names.resATs.nitro.neut: Neutral Nitrogen Residue-AtomType Names
  • oxPlainDataSheet: Plain Data Sheet
  • write.basic.pdb: Write Basic PDB File
  • oxRCSBinfoSheet: openxlsx PDB/RCSB Summary Sheet
  • write.conservedWaters.pdb: Write Conserved Waters to PDB File
  • resAtomType2AtomClass: Convert Residue-AtomType to AtomType Class
  • oxClusterStatsSheet: openxlsx Water Cluster Statistics
  • oxClusterSummarySheet: openxlsx Cluster Summary Sheet
  • names.resATs.nitro.pos: Positive Nitrogen Residue-AtomType Names
  • thrombin.1hai: PDB Structure of Thrombin
  • names.resATs.oxy.neg: Negative Oxygen Residue-AtomType Names
  • thrombin10.PDBs.align: Thrombin10 Vignette's Primary Sequence Alignment
  • openxlsxCellStyles: openxlsx Cell Style
  • oxAlignOverlapSheet: Align Overlap Data Sheet
  • names.resATs.oxy.neut: Neutral Oxygen Residue-AtomType Names
  • vanddraabe: vanddraabe: Identification and Statistical Analysis of Conserved Waters in Proteins
  • AlignOverlap: Alignment Overlap Check
  • BvalueBarplot.summ: B-value Summary Barplots
  • BoundWaterEnvSummaryPlot: Bound Water Environment Summary Plot
  • BvalueBarplot: B-value Barplots
  • BoundWaterEnvPlots: Bound Water Environment Barplots
  • CleanProteinStructures: Clean Protein Structures
  • CalcAlignOverlap: Calculate Alignment Overlap
  • BoundWaterEnvironment.interact: Bound Water Environment (interactions)
  • BoundWaterEnvironment: Bound Water Environment
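
The function index above includes entries for water mobility and B-value normalization. As a rough illustration of what such per-water statistics can look like, the base R sketch below uses definitions common in the conserved-water literature: mobility as the B-value relative to the mean B-value, divided by the occupancy relative to the mean occupancy, and the normalized B-value as the number of standard deviations from the mean B-value. These formulas, the helper names `mobility_sketch` and `normalized_b_sketch`, the example values, and the filtering thresholds are all assumptions for illustration and may differ from what the package's Mobility and NormalizedBvalue functions actually do.

```r
# Hypothetical B-values (Angstrom^2) and occupancies for five waters.
waters <- data.frame(
  b   = c(12.0, 18.0, 25.0, 40.0, 30.0),
  occ = c(1.0, 1.0, 0.8, 0.5, 1.0)
)

# Assumed definition: B-value relative to its mean, divided by the
# occupancy relative to its mean.
mobility_sketch <- function(b, occ) {
  (b / mean(b)) / (occ / mean(occ))
}

# Assumed definition: standard deviations from the mean B-value.
normalized_b_sketch <- function(b) {
  (b - mean(b)) / sd(b)
}

waters$mobility <- mobility_sketch(waters$b, waters$occ)
waters$nBvalue  <- normalized_b_sketch(waters$b)

# Keep only well-ordered waters; both thresholds are assumed values.
keep <- waters$mobility < 2.0 & waters$nBvalue < 1.0
print(waters[keep, ])
```

Filtering on both statistics removes waters that are poorly ordered or only partially occupied before any clustering is attempted.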