- .data
The data to be analyzed. It can be a data frame, possibly with
grouped data passed from dplyr::group_by().
- ...
The variables in .data used to compute the distances. Defaults to
NULL, in which case all the numeric variables in .data are used.
- by
One variable (factor) to compute the function by. It is a shortcut
to dplyr::group_by(). To compute the statistics by more than
one grouping variable, use that function.
- scale
Should the data be scaled before computing the distances? Defaults to
FALSE. If TRUE, each observation is divided by the standard
deviation of its variable: Z_ij = X_ij / sd_j.
- selvar
Logical argument, defaults to FALSE. If TRUE, an
algorithm for selecting variables is implemented. See the section
Details for additional information.
- verbose
Logical argument. If TRUE (default), the results
for variable selection are shown in the console.
- distmethod
The distance measure to be used. This must be one of
'euclidean', 'maximum', 'manhattan', 'canberra', 'binary',
'minkowski', 'pearson', 'spearman', or 'kendall'. The last three are
correlation-based distances.
- clustmethod
The agglomeration method to be used. This should be one of
'ward.D', 'ward.D2', 'single', 'complete', 'average' (= UPGMA),
'mcquitty' (= WPGMA), 'median' (= WPGMC), or 'centroid' (= UPGMC).
- nclust
The number of clusters to be formed. Defaults to NA.
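
To make the roles of scale, distmethod, clustmethod and nclust concrete, the following is a minimal base-R sketch of the steps those arguments describe. It is not the function's own implementation: it only uses sd(), sweep(), dist(), cor(), hclust() and cutree() from base R, the built-in iris data set is an arbitrary choice, and the form used for the correlation-based distance (1 minus the correlation between observations) is an assumption made for illustration.

```r
# Illustrative sketch only; base R, not the documented function's internals.
x <- as.matrix(iris[, 1:4])            # keep the numeric variables only

# scale = TRUE: divide each observation by the standard deviation
# of its variable, Z_ij = X_ij / sd_j
z <- sweep(x, 2, apply(x, 2, sd), "/")

# distmethod = 'euclidean': pairwise distances between observations
d <- dist(z, method = "euclidean")

# Assumed form of a correlation-based distance ('pearson'):
# 1 - correlation between observations (rows)
d_cor <- as.dist(1 - cor(t(z), method = "pearson"))

# clustmethod = 'average' (= UPGMA)
hc <- hclust(d, method = "average")

# nclust = 3: cut the dendrogram into three clusters
groups <- cutree(hc, k = 3)
table(groups)
```

After scaling, every column of z has unit standard deviation, so no single variable dominates the Euclidean distance merely because of its measurement unit.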