In principle, objects of this class can be created by calls of the form new("CCModel"), although it is probably never necessary to create such an object from scratch, nor is it advised. The default model is stored in the object PrOCoilModel. An alternative model, PrOCoilModelBA, that is optimized for balanced accuracy is available too (see below). Custom models can be loaded from files using the function readCCModel.
where $b$ is a constant offset, $N(p,x)$ denotes the number of
occurrences of pattern $p$ in sequence $x$, and
$w(p)$ is the weight assigned to pattern $p$. $P$
is the set of all patterns contained in the model.
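The definitions above suggest a discriminant function of the form $f(x) = b + \sum_{p \in P} N(p,x) \cdot w(p)$. The following base R sketch evaluates such a function on a toy model. It is only an illustration under stated assumptions: the helper names countPattern and discriminant, the weights, and the offset are hypothetical, and the pattern layout (two residues separated by dots, followed by their two heptad positions separated by the same number of dots, as in "N..La..d") is assumed rather than taken from the package internals.

```r
## Toy sketch in base R; not the package's implementation.
## Assumed pattern layout: two residues separated by m dots, followed by
## their two heptad positions separated by m dots, e.g. "N..La..d".
countPattern <- function(pattern, x, reg) {
    half <- nchar(pattern) %/% 2
    dist <- half - 1                          # offset between the two positions
    aa1 <- substr(pattern, 1, 1)              # first residue
    aa2 <- substr(pattern, half, half)        # second residue
    hp1 <- substr(pattern, half + 1, half + 1)  # heptad position of first residue
    hp2 <- substr(pattern, 2 * half, 2 * half)  # heptad position of second residue
    s <- strsplit(x, "")[[1]]
    r <- strsplit(reg, "")[[1]]
    if (length(s) <= dist) return(0L)
    i <- seq_len(length(s) - dist)            # candidate start positions
    sum(s[i] == aa1 & s[i + dist] == aa2 & r[i] == hp1 & r[i + dist] == hp2)
}

## f(x) = b + sum over all patterns p of N(p, x) * w(p)
discriminant <- function(x, reg, w, b) {
    counts <- vapply(names(w), countPattern, integer(1), x = x, reg = reg)
    b + sum(counts * w)
}

## hypothetical weights and offset, chosen only for illustration
w <- c("N..La..d" = 0.4, "L......Ld......d" = -0.2)
b <- -0.1

gcn4    <- "MKQLEDKVEELLSKNYHLENEVARLKKLV"
gcn4reg <- "abcdefgabcdefgabcdefgabcdefga"
f <- discriminant(gcn4, gcn4reg, w, b)
f
```

In a real model, the weights and the offset would come from the CCModel object rather than being set by hand.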
In the models used in the
where $R(x)$ is a normalization value depending on the sample $x$. It is defined as follows:
The
PrOCoilModel
.
The model was created with the
Note that this is not the original model described in [Mahrenholz et al., 2011]. The models have been re-trained for version 2.0.0 of the package using a newer snapshot of PDB and newer methods. The original models are still available for download and can still be used. For detailed instructions, see the package vignette.
PrOCoilModel slightly favors dimers, which may be undesirable for some applications. For such cases, an alternative model, PrOCoilModelBA, is available that is optimized for balanced accuracy, i.e. it tries not to favor the larger class (dimers), but may therefore prefer trimers in borderline cases. The overall misclassification probability is slightly higher for this model than for the default model PrOCoilModel. The model PrOCoilModelBA was created with PSVM [Hochreiter and Obermayer, 2006] using the coiled coil kernel with $m=8$, $C=8$, $\varepsilon=0.8$, class balancing, and kernel normalization on the PDB data set (i.e. without BLAST augmentation). As with PrOCoilModel, this model has been re-trained for package version 2.0.0. For detailed instructions on how to use the original models, see the package vignette.
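Since the two models can disagree in borderline cases, it may help to run the same sequence through both. The sketch below reuses only the calls shown in the examples of this page (predict with a model, a sequence, and a heptad register). The variable names are illustrative, and it assumes the procoil package is installed.

```r
library(procoil)

## GCN4 wildtype sequence and heptad register (taken from the examples)
seqGCN4 <- "MKQLEDKVEELLSKNYHLENEVARLKKLV"
regGCN4 <- "abcdefgabcdefgabcdefgabcdefga"

## prediction with the default model (optimized for accuracy)
predDefault <- predict(PrOCoilModel, seqGCN4, regGCN4)

## prediction with the alternative model (optimized for balanced accuracy)
predBalanced <- predict(PrOCoilModelBA, seqGCN4, regGCN4)

## show both results for comparison
predDefault
predBalanced
```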
Mahrenholz, C.C., Abfalter, I.G., Bodenhofer, U., Volkmer, R., and Hochreiter, S. (2011) Complex networks govern coiled coil oligomerization - predicting and profiling by means of a machine learning approach. Mol. Cell. Proteomics 10(5):M110.004994. DOI: 10.1074/mcp.M110.004994
Palme, J., Hochreiter, S., and Bodenhofer, U. (2015) KeBABS: an R package for kernel-based analysis of biological sequences. Bioinformatics 31(15):2574-2576. DOI: 10.1093/bioinformatics/btv176
Hochreiter, S., and Obermayer, K. (2006) Support vector machines for dyadic data. Neural Computation 18:1472-1510. DOI: 10.1162/neco.2006.18.6.1472
predict-methods
## load the package that provides the class and the models
library(procoil)
showClass("CCModel")
## show summary of default model (optimized for accuracy)
PrOCoilModel
## show weight of pattern "N..La..d"
weights(PrOCoilModel)["N..La..d"]
## show the 10 patterns that are most indicative for trimers
## (as the weights are sorted in descending order in PrOCoilModel)
weights(PrOCoilModel)[1:10]
## predict oligomerization of GCN4 wildtype
GCN4wt <- predict(PrOCoilModel,
                  "MKQLEDKVEELLSKNYHLENEVARLKKLV",
                  "abcdefgabcdefgabcdefgabcdefga")
## show summary of alternative model (optimized for balanced accuracy)
PrOCoilModelBA
## show weight of pattern "N..La..d"
weights(PrOCoilModelBA)["N..La..d"]
## show the 10 patterns that are most indicative for trimers
## (as the weights are sorted in descending order in PrOCoilModelBA)
weights(PrOCoilModelBA)[1:10]