
This function calculates scales-based descriptors derived by Factor Analysis (FA). Users can provide customized amino acid property matrices.
extractFAScales(
  x,
  propmat,
  factors,
  scores = "regression",
  lag,
  scale = TRUE,
  silent = TRUE
)
A named vector of length lag * p^2, where p is the number of scales (factors) selected.
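The stated length can be made concrete with a small sketch. It assumes the descriptors are lagged cross covariances between every ordered pair of the p scales, which is consistent with lag * p^2 but is not confirmed by this page. `acc_sketch` is a hypothetical helper written for illustration, its normalization is an assumption, and the input values are random placeholders.

```r
# Hypothetical helper (not the package's implementation): lagged cross
# covariance between every ordered pair of rows of a p x n matrix.
acc_sketch <- function(mat, lag) {
  p <- nrow(mat)
  n <- ncol(mat)
  stopifnot(lag >= 1, lag < n)
  centered <- mat - rowMeans(mat)  # center each scale across positions
  out <- numeric(0)
  for (l in seq_len(lag)) {
    head_part <- centered[, 1:(n - l), drop = FALSE]
    tail_part <- centered[, (1 + l):n, drop = FALSE]
    block <- head_part %*% t(tail_part) / (n - l)  # p x p block for lag l
    out <- c(out, as.vector(block))
  }
  out
}

set.seed(1)
p <- 5   # number of scales (factors)
n <- 30  # number of sequence positions
scores_by_position <- matrix(rnorm(p * n), nrow = p)  # placeholder values
v <- acc_sketch(scores_by_position, lag = 7)
length(v) == 7 * p^2
```

Each lag contributes one p x p block, so seven lags over five scales give 7 * 25 values.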
x: A character vector containing the input protein sequence.
propmat: A matrix containing the properties of the amino acids. Each row represents one amino acid type and each column represents one property. The one-letter amino acid codes must be supplied as row names, because they are used to look up the properties of each amino acid type.
factors: Integer. The number of factors to be fitted. Must be no greater than the number of amino acid properties provided.
scores: Type of scores to produce. The default is "regression", which gives Thompson's scores; "Bartlett" gives Bartlett's weighted least-squares scores.
lag: The lag parameter. Must be less than the number of amino acids in the protein sequence.
scale: Logical. Whether to auto-scale the property matrix (propmat) before performing factor analysis. Default is TRUE.
silent: Logical. Whether to suppress printing of the SS loadings, proportion of variance, and cumulative proportion of variance of the selected factors. Default is TRUE, which prints nothing; set it to FALSE to print this summary.
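The requirements on the property matrix and the lag can be illustrated with base R alone. The sketch below is a minimal example under stated assumptions: the property values are random placeholders rather than real amino acid properties, the sequence is a toy string, and auto-scaling is assumed to mean column-wise centering and scaling as done by base R's `scale()`.

```r
# The 20 standard amino acids as one-letter codes, used as row names.
aa <- c("A", "R", "N", "D", "C", "E", "Q", "G", "H", "I",
        "L", "K", "M", "F", "P", "S", "T", "W", "Y", "V")

set.seed(42)
# Placeholder numbers only: 20 amino acids (rows) x 6 made-up properties (columns).
propmat <- matrix(rnorm(20 * 6, mean = 10, sd = 3), nrow = 20,
                  dimnames = list(aa, paste0("prop", 1:6)))

# Assumed meaning of auto-scaling: center each column and rescale to unit sd.
scaled <- scale(propmat)
round(colMeans(scaled), 10)
apply(scaled, 2, sd)

# Row names let a sequence be mapped to per-residue property rows.
x <- "MKTAYIAKQR"  # toy sequence, not a real protein
residues <- strsplit(x, "")[[1]]
per_residue <- scaled[residues, ]
dim(per_residue)  # one row per residue, one column per property

# The lag must be smaller than the sequence length.
lag <- 3
lag < nchar(x)
```

Without one-letter row names, the lookup `scaled[residues, ]` would have nothing to match against, which is why they are required.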
Nan Xiao <https://nanx.me>
Atchley, W. R., Zhao, J., Fernandes, A. D., & Druke, T. (2005). Solving the protein sequence metric problem. Proceedings of the National Academy of Sciences of the United States of America, 102(18), 6395-6400.
library(protr)
x <- readFASTA(system.file("protseq/P00750.fasta", package = "protr"))[[1]]
data(AATopo)
tprops <- AATopo[, c(37:41, 43:47)] # select a set of topological descriptors
# silent = FALSE prints the factor analysis summary
fa <- extractFAScales(x, propmat = tprops, factors = 5, lag = 7, silent = FALSE)
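The two `scores` options carry the same names and descriptions as the `scores` argument of base R's `stats::factanal`. Assuming the function relies on `factanal` for the factor analysis step (this page does not say so explicitly), the difference between the options can be explored without protr. The data below are simulated observations with two latent factors, not amino acid properties.

```r
set.seed(123)
n <- 200
# Two simulated latent factors driving six observed variables (made-up data).
f1 <- rnorm(n)
f2 <- rnorm(n)
obs <- cbind(
  v1 = 0.9 * f1 + rnorm(n, sd = 0.4),
  v2 = 0.8 * f1 + rnorm(n, sd = 0.5),
  v3 = 0.7 * f1 + rnorm(n, sd = 0.6),
  v4 = 0.9 * f2 + rnorm(n, sd = 0.4),
  v5 = 0.8 * f2 + rnorm(n, sd = 0.5),
  v6 = 0.7 * f2 + rnorm(n, sd = 0.6)
)

# Same model, two different score estimators.
fit_reg  <- factanal(obs, factors = 2, scores = "regression")
fit_bart <- factanal(obs, factors = 2, scores = "Bartlett")

dim(fit_reg$scores)   # one row per observation, one column per factor
dim(fit_bart$scores)

# Printing the loadings also shows SS loadings, proportion of variance,
# and cumulative proportion of variance for the fitted factors.
print(fit_reg$loadings)
```

The fitted loadings are the same in both calls; only the estimated factor scores differ.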