Performs cluster sampling on a data frame: a given percentage of the clusters defined by a grouping column is selected, every row in a selected cluster is labelled "Yes", and all remaining rows are labelled "No".
cluster_labels(df, group_col, yes_percentage)
A data frame with an additional column "Clustered_Yes_No" containing the cluster-sampled "Yes"/"No" labels.
df: A data frame containing the data.
group_col: A character string specifying the column to use for clustering.
yes_percentage: A numeric value between 0 and 100 indicating the percentage of clusters to label as "Yes".
result <- cluster_labels(iris, group_col = "Species", yes_percentage = 50)
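To make the behaviour described here more concrete, the following is a minimal sketch of how such cluster-level labelling could be written in base R. It is not the package's source code: the function name `cluster_labels_sketch` is hypothetical, and both the rounding rule for the number of "Yes" clusters and the use of simple random sampling without replacement are assumptions.

```r
# Hypothetical re-implementation for illustration only; not the package's source.
cluster_labels_sketch <- function(df, group_col, yes_percentage) {
  stopifnot(
    is.data.frame(df),
    group_col %in% names(df),
    is.numeric(yes_percentage), length(yes_percentage) == 1,
    yes_percentage >= 0, yes_percentage <= 100
  )

  # Each distinct value of the grouping column is treated as one cluster.
  clusters <- unique(df[[group_col]])

  # Assumption: the number of "Yes" clusters is rounded to a whole number.
  n_yes <- round(length(clusters) * yes_percentage / 100)

  # Draw whole clusters at random, without replacement.
  yes_clusters <- clusters[sample.int(length(clusters), n_yes)]

  # Every row inherits the label of the cluster it belongs to.
  df$Clustered_Yes_No <- ifelse(df[[group_col]] %in% yes_clusters, "Yes", "No")
  df
}

set.seed(1)  # make the random draw reproducible
sketch <- cluster_labels_sketch(iris, group_col = "Species", yes_percentage = 50)

# Cross-tabulate clusters against labels to inspect the assignment.
table(sketch$Species, sketch$Clustered_Yes_No)
```

Because whole clusters are sampled rather than individual rows, each species in the cross-tabulation should carry a single label. The share of rows labelled "Yes" therefore depends on cluster sizes and need not equal `yes_percentage`.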