ob.stability: Estimate the stability of a clustering based on a non-parametric bootstrap
out-of-bag scheme, with an option for a subsampling scheme
Description
Estimate the stability of a clustering based on a non-parametric bootstrap
out-of-bag scheme, with an option for a subsampling scheme
Usage
ob.stability(x, k, B = 500, r = 5, subsample = FALSE, cut_ratio = 0.5)
Value
membership
vector of cluster memberships, one per observation, from the reference clustering
obs_wise
vector of estimated observation-wise stability
clust_wise
vector of estimated cluster-wise stability
overall
numeric estimated overall stability
Smin
numeric, Smin estimated through the out-of-bag scheme
Arguments
x
data.frame of the data set, with rows as observations and columns as feature dimensions
k
number of clusters for which to estimate the stability
B
number of bootstrap re-samples
r
integer parameter in the kmeansCBI() function
subsample
logical; if TRUE, use the subsampling scheme instead of the bootstrap in the resampling process
cut_ratio
numeric value between 0 and 1 giving the training-set ratio for the subsampling scheme
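The difference between the two resampling options can be illustrated with a minimal base R sketch. The helper resample_split() is hypothetical and not part of the package, and rounding the training-set size with floor() is an assumption; the package may size the subsample differently.

```r
# Hypothetical helper (not part of the package): returns the training indices
# and the held-out ("out-of-bag") indices for a single resample.
resample_split <- function(n, subsample = FALSE, cut_ratio = 0.5) {
  if (subsample) {
    # Subsampling: draw a fraction cut_ratio of the rows WITHOUT replacement.
    # Using floor() for the size is an assumption made for this sketch.
    train <- sample.int(n, size = floor(cut_ratio * n), replace = FALSE)
  } else {
    # Non-parametric bootstrap: draw n rows WITH replacement.
    train <- sample.int(n, size = n, replace = TRUE)
  }
  # Rows that were never drawn form the out-of-bag set.
  list(train = train, oob = setdiff(seq_len(n), train))
}

set.seed(1)
boot <- resample_split(150)
sub  <- resample_split(150, subsample = TRUE, cut_ratio = 0.5)
length(sub$train)  # 75 rows used for training under subsampling
```

With the bootstrap, the size of the out-of-bag set varies from resample to resample; with subsampling it is fixed by cut_ratio.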
Author
Tianmou Liu
Details
This function estimates the stability through out-of-bag observations.
It estimates the stability at the
(1) observation level, (2) cluster level, and (3) overall.
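To make the out-of-bag idea concrete, here is a simplified base R sketch. It is not the package's implementation: it uses stats::kmeans() rather than kmeansCBI(), it measures agreement by pairwise co-membership (an assumption chosen for illustration), and it does not compute Smin. The function name oob_stability_sketch() is hypothetical.

```r
# Simplified illustration only; NOT the package's implementation.
oob_stability_sketch <- function(x, k, B = 50) {
  x <- as.matrix(x)
  n <- nrow(x)
  ref <- kmeans(x, centers = k, nstart = 5)$cluster  # reference clustering
  agree_sum <- numeric(n)  # summed agreement per observation
  oob_count <- numeric(n)  # number of resamples in which it was out-of-bag

  for (b in seq_len(B)) {
    train <- sample.int(n, n, replace = TRUE)  # bootstrap sample
    oob <- setdiff(seq_len(n), train)          # out-of-bag rows
    if (length(oob) < 2) next
    fit <- kmeans(x[train, , drop = FALSE], centers = k, nstart = 5)

    # Assign each out-of-bag row to its nearest fitted centre.
    d <- as.matrix(dist(rbind(fit$centers, x[oob, , drop = FALSE])))
    d <- d[-seq_len(k), seq_len(k), drop = FALSE]
    pred <- max.col(-d, ties.method = "first")

    # Pairwise co-membership under the reference and resampled clusterings.
    same_ref  <- outer(ref[oob], ref[oob], "==")
    same_pred <- outer(pred, pred, "==")

    # Share of other out-of-bag rows on which both clusterings agree
    # (the self-pair always agrees, so it is subtracted).
    agree <- (rowSums(same_ref == same_pred) - 1) / (length(oob) - 1)
    agree_sum[oob] <- agree_sum[oob] + agree
    oob_count[oob] <- oob_count[oob] + 1
  }

  obs_wise <- agree_sum / oob_count  # NaN if a row was never out-of-bag
  list(obs_wise   = obs_wise,
       clust_wise = tapply(obs_wise, ref, mean, na.rm = TRUE),
       overall    = mean(obs_wise, na.rm = TRUE))
}

set.seed(123)
res <- oob_stability_sketch(iris[, 1:4], k = 2, B = 50)
res$overall
```

The three returned elements mirror the three levels described above: one value per observation, one per reference cluster, and a single overall average.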
References
Bootstrapping estimates of stability for clusters, observations and model selection.
Han Yu, Brian Chapman, Arianna DiFlorio, Ellen Eischen, David Gotz, Matthews Jacob and Rachael Hageman Blair.
Examples
# \donttest{
set.seed(123)
data(iris)
df <- data.frame(iris[,1:4])
# You can choose to scale df before clustering by
# df <- scale(df)
ob.stability(df, k = 2, B=500, r=5)
# }