
tidylearn (version 0.1.0)

tl_semisupervised: Semi-Supervised Learning via Clustering

Description

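Trains a supervised model when only a subset of observations is labeled. The data are first clustered, the available labels are propagated within clusters to produce pseudo-labels, and the final model is fit on the pseudo-labeled data.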
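The cluster-then-propagate idea can be sketched in base R. This is an illustration only, not tidylearn's actual implementation: the majority-vote rule, the choice of three centers, the feature scaling, and the handling of clusters without any labeled row are all assumptions made for the example.

```r
# Illustrative sketch only; not tidylearn's actual implementation.
set.seed(42)

features <- scale(iris[, 1:4])                 # numeric predictors, standardized
labeled_idx <- sample(nrow(iris), size = 15)   # rows whose labels are treated as known

# Step 1: cluster all observations, labeled or not.
km <- kmeans(features, centers = 3, nstart = 10)

# Step 2: within each cluster, take the majority label among its labeled rows.
pseudo <- rep(NA_character_, nrow(iris))
for (k in sort(unique(km$cluster))) {
  in_cluster <- which(km$cluster == k)
  known <- intersect(in_cluster, labeled_idx)
  if (length(known) > 0) {
    votes <- table(as.character(iris$Species[known]))
    pseudo[in_cluster] <- names(votes)[which.max(votes)]
  }
}

# Step 3: keep the true labels where they are known.
pseudo[labeled_idx] <- as.character(iris$Species[labeled_idx])

# Rows in clusters with no labeled member stay NA and would need to be
# dropped before fitting a supervised model on the pseudo-labels.
table(pseudo, useNA = "ifany")
```

A supervised model fit on `pseudo` then sees many more training rows than the 15 that were labeled, at the cost of some label noise wherever a cluster mixes classes.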

Usage

tl_semisupervised(
  data,
  formula,
  labeled_indices,
  cluster_method = "kmeans",
  supervised_method = "logistic",
  ...
)

Value

A tidylearn model trained on pseudo-labeled data

Arguments

data

A data frame

formula

Model formula

labeled_indices

Row indices of the observations in data whose labels are treated as known

cluster_method

Clustering method for label propagation

supervised_method

Supervised learning method for final model

...

Additional arguments

Examples

# Use only 10% of the labels (15 of 150 rows)
library(tidylearn)
set.seed(123)  # make the random choice of labeled rows reproducible
labeled_idx <- sample(nrow(iris), size = 15)
model <- tl_semisupervised(iris, Species ~ ., labeled_indices = labeled_idx,
                           cluster_method = "kmeans", supervised_method = "logistic")
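Because sample() draws the labeled rows uniformly at random, a class can end up with very few labeled examples. One possible alternative is a stratified draw, sketched below with base R only. The choice of five rows per class is an assumption for illustration; the resulting vector can be passed as labeled_indices in the same way as above.

```r
# Stratified choice of labeled rows: the same number from each class.
set.seed(1)
per_class <- 5  # illustrative value; 3 classes x 5 rows = 15 labeled rows

rows_by_class <- split(seq_len(nrow(iris)), iris$Species)
labeled_idx <- unlist(
  lapply(rows_by_class, function(rows) sample(rows, per_class)),
  use.names = FALSE
)

# Each species now contributes exactly per_class labeled rows.
table(iris$Species[labeled_idx])
```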
