Identifies conserved molecular dynamics simulation (MDS) waters from a collection of PDBs.
ConservedWaters.MDS(prefix = "", cluster = 2.4, chain = "all",
prot.h2o.dist.min = 5.1, cluster.method = "complete",
filename = "ProteinSystem")
prefix: Directory of aligned structures; string.
cluster: Distance cutoff for clustering; oxygen atoms within this distance (in Angstroms) of each other are considered a cluster; numeric. Default value is 2.4 Angstroms.
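As a rough illustration of the cutoff, the sketch below clusters a few made-up water oxygen coordinates with stats::hclust() and cuts the tree at 2.4 Angstroms. The coordinates and variable names are invented for the example, and this is only an assumption about how such a cutoff can be applied, not a description of the function's internals.

```r
# Toy water oxygen coordinates in Angstroms; all values are invented.
oxygens <- rbind(
  c(0.0, 0.0, 0.0),
  c(1.0, 0.0, 0.0),
  c(0.0, 1.5, 0.0),
  c(10.0, 10.0, 10.0),
  c(10.5, 10.0, 10.0),
  c(30.0, 0.0, 0.0)
)
colnames(oxygens) <- c("x", "y", "z")

cluster.cutoff <- 2.4

# With complete linkage, cutting the tree at height h yields clusters in
# which every pair of members is at most h apart.
tree <- stats::hclust(stats::dist(oxygens), method = "complete")
cluster.ids <- stats::cutree(tree, h = cluster.cutoff)

# Number of waters assigned to each cluster
cluster.sizes <- table(cluster.ids)
print(cluster.ids)
print(cluster.sizes)
```

Here the first three oxygens fall into one cluster, the next two into a second, and the isolated oxygen forms a cluster of its own.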
chain: The chain to examine. Defining "first" selects the first chain
alphabetically. Defining "all" results in all chains being explored; this is
the default. Alternatively, the user can define the individual chains to
include in the analysis; for example, c("A", "B", "C"). When defining chains,
the chain designations must be character strings.
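The three ways of specifying chains can be sketched with a small helper. resolve.chains() is hypothetical and not part of the package; it only illustrates how "all", "first", and an explicit character vector might be interpreted against the chain IDs found in a structure.

```r
# Hypothetical helper (not part of the package): convert a chain argument
# into the chain IDs to keep.
resolve.chains <- function(chain, available) {
  stopifnot(is.character(chain), is.character(available))
  available <- sort(unique(available))
  if (identical(chain, "all")) {
    return(available)
  }
  if (identical(chain, "first")) {
    return(available[1])
  }
  # Explicit selection: every requested chain must exist in the structure.
  missing.chains <- setdiff(chain, available)
  if (length(missing.chains) > 0) {
    stop("Chains not found: ", paste(missing.chains, collapse = ", "))
  }
  chain
}

chains.in.pdb <- c("B", "A", "C", "A")  # invented chain IDs
resolve.chains("all", chains.in.pdb)        # every chain
resolve.chains("first", chains.in.pdb)      # first chain alphabetically
resolve.chains(c("A", "C"), chains.in.pdb)  # explicit selection
```

Passing a non-character value such as a number fails the stopifnot() check, which mirrors the requirement that chain designations be character strings.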
prot.h2o.dist.min: The distance cutoff (in Angstroms) between the protein and a water for the water to be considered for the conserved water clusters. Water oxygen atoms farther than this distance from the protein are removed from the analysis. Default value is 5.10 Angstroms.
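A minimal sketch of this kind of distance filter, using base R only, is shown below. The coordinates are invented, and measuring from each water oxygen to its nearest protein atom is an assumption made for illustration rather than a statement about the package's implementation.

```r
# Toy coordinates in Angstroms; all values are invented.
protein.atoms <- rbind(c(0, 0, 0), c(3, 0, 0), c(0, 3, 0))
water.oxygens <- rbind(
  c(1, 1, 1),   # close to the protein
  c(3, 0, 5),   # 5.0 Angstroms from the second protein atom
  c(0, 0, 12)   # far from every protein atom
)

prot.h2o.dist.min <- 5.1

# Distance from each water oxygen to its nearest protein atom
nearest.protein.dist <- apply(water.oxygens, 1, function(w) {
  diffs <- sweep(protein.atoms, 2, w)  # subtract w from every protein atom row
  min(sqrt(rowSums(diffs^2)))
})

# Keep waters at or inside the cutoff; drop the rest
keep <- nearest.protein.dist <= prot.h2o.dist.min
waters.kept <- water.oxygens[keep, , drop = FALSE]
print(nearest.protein.dist)
print(waters.kept)
```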
cluster.method: Method of clustering the waters; default is "complete".
Any other method accepted by the stats::hclust() or fastcluster::hclust()
functions is appropriate. The original method used by Sanschagrin and Kuhn
is complete linkage clustering. Other options include "ward.D" (equivalent
to the only Ward option in R versions 3.0.3 and earlier), "ward.D2"
(implements Ward's 1963 criterion; see Murtagh and Legendre 2014), or
"single" (related to the minimal spanning tree method; adopts a "friend of
friends" clustering strategy). Please see fastcluster::hclust() for complete
information regarding the clustering methods.
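To show why the choice of linkage matters, the sketch below applies several methods to the same toy data and counts the clusters obtained at a 2.4 Angstrom cut. The evenly spaced points are invented to make the "friend of friends" behavior of single linkage visible; stats::hclust() is used here, and the tree cut with stats::cutree() is an assumption about how a cutoff could be applied.

```r
# Five toy oxygen positions spaced 2.0 Angstroms apart along a line
# (invented values).
oxygens <- cbind(x = c(0, 2, 4, 6, 8), y = 0, z = 0)
distances <- stats::dist(oxygens)

methods <- c("complete", "single", "ward.D2")

# Number of clusters each linkage method produces when cut at 2.4 Angstroms
n.clusters <- sapply(methods, function(m) {
  tree <- stats::hclust(distances, method = m)
  length(unique(stats::cutree(tree, h = 2.4)))
})
print(n.clusters)
```

Single linkage chains all five points into one cluster because each point has a neighbor within the cutoff, whereas complete linkage keeps clusters compact and splits the line into several groups.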
filename: The filename prefix for the returned results. Default is "ProteinSystem".
This function returns:
h2o.cluster.all: Clusters constructed from all waters present in the aligned PDB structures.
h2o.cluster.passed: Clusters constructed from waters that passed the Mobility() and NormalizedBvalue() evaluations.
h2o.cluster.summary: Summary of water clusters.
Excel workbook: Contains the Cluster Statistics, Cluster Summaries for all and passed waters, Occurrence Summaries for all and passed waters, and the Initial Water Data as individual tabs.
call: The user-provided parameters for the function.
By default, only water oxygen atoms within (less than or equal to) 5.10 Angstroms of the protein structures are included.
Paul C Sanschagrin and Leslie A Kuhn. Cluster analysis of consensus water sites in thrombin and trypsin shows conservation between serine proteases and contributions to ligand specificity. Protein Science, 1998, 7 (10), pp 2054-2064. DOI: 10.1002/pro.5560071002 PMID: 9792092 WatCH webpage
Hitesh Patel, Björn A. Grüning, Stefan Günther, and Irmgard Merfort. PyWATER: a PyMOL plug-in to find conserved water molecules in proteins by clustering. Bioinformatics, 2014, 30 (20), pp 2978-2980. DOI: 10.1093/bioinformatics/btu424 PMID: 24990608 PyWATER on GitHub