a "proc" object or a square matrix. If a square matrix is
provided, it is treated as the dissimilarity matrix of a group of response processes.
K_cand
the candidate values for the number of features.
dist_type
a character string specifying the dissimilarity measure for two
response processes. See 'Details'.
n_fold
the number of folds for cross-validation.
max_epoch
the maximum number of epochs for stochastic gradient
descent.
step_size
the step size of stochastic gradient descent.
tot
the accuracy tolerance for determining convergence.
return_dist
logical. If TRUE, the dissimilarity matrix will be
returned. Default is FALSE.
seed
random seed.
L_set
the lengths of n-grams considered.
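As a minimal sketch of the square-matrix input described above, the following builds a symmetric dissimilarity matrix with base R and passes it to chooseK_mds. The toy data, the variable names, and the argument values are illustrative assumptions. The call is guarded so that it runs only when the ProcData package (assumed here to be the package that provides chooseK_mds) is installed.

```r
# Toy data: 20 observations with 4 numeric features (illustrative only).
set.seed(1)
x <- matrix(rnorm(20 * 4), nrow = 20)

# A square, symmetric dissimilarity matrix with a zero diagonal.
dist_mat <- as.matrix(dist(x))

# Candidate numbers of features to compare by cross-validation.
K_cand <- 1:3

# Assumption: chooseK_mds comes from the ProcData package and accepts the
# matrix as its first argument, as the argument description states.
if (requireNamespace("ProcData", quietly = TRUE)) {
  res <- ProcData::chooseK_mds(dist_mat, K_cand = K_cand, n_fold = 5,
                               return_dist = TRUE)
  print(res$K)
}
```

Because a matrix is supplied directly, dist_type should not matter in this sketch; it is relevant when a "proc" object is given instead.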
Value
chooseK_mds returns a list containing
K
the value in K_cand producing the smallest cross-validation loss.
K_cand
the candidate values for the number of features.
cv_loss
the cross-validation loss for each candidate in K_cand.
dist_mat
the dissimilarity matrix. This element exists only if return_dist=TRUE.
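The relationship between the returned elements can be illustrated in base R. The loss values below are made up for illustration and are not output of chooseK_mds; the sketch only shows that K is the entry of K_cand with the smallest cv_loss.

```r
# Hypothetical candidates and cross-validation losses (made-up numbers).
K_cand  <- c(5, 10, 15, 20)
cv_loss <- c(0.42, 0.31, 0.27, 0.29)

# K is the candidate whose cross-validation loss is smallest.
K <- K_cand[which.min(cv_loss)]
print(K)
```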
References
Gomez-Alonso, C. and Valls, A. (2008). A similarity measure for sequences of
categorical data based on the ordering of common elements. In V. Torra & Y. Narukawa (Eds.),
Modeling Decisions for Artificial Intelligence (pp. 134-145). Springer Berlin Heidelberg.
See Also
seq2feature_mds for feature extraction after choosing
the number of features.