R Semi-Supervised Learning package

This R package provides implementations of several semi-supervised learning methods, in particular our own work on constraint-based semi-supervised learning.

To cite the package, use either of these two references:

  • Krijthe, J. H. (2016). RSSL: R package for Semi-supervised Learning. In B. Kerautret, M. Colom, & P. Monasse (Eds.), Reproducible Research in Pattern Recognition. RRPR 2016. Lecture Notes in Computer Science, vol 10214. (pp. 104–115). Springer International Publishing. https://doi.org/10.1007/978-3-319-56414-2_8. arxiv: https://arxiv.org/abs/1612.07993
  • Krijthe, J. H. & Loog, M. (2015). Implicitly Constrained Semi-Supervised Least Squares Classification. In E. Fromont, T. de Bie, & M. van Leeuwen (Eds.), 14th International Symposium on Advances in Intelligent Data Analysis XIV (Lecture Notes in Computer Science Volume 9385). Saint Etienne, France, pp. 158–169.

Installation Instructions

This package is available on CRAN. The easiest way to install it is:

install.packages("RSSL")

To install the latest version of the package using the devtools package:

library(devtools)
install_github("jkrijthe/RSSL")

Usage

After installation, load the package as usual:

library(RSSL)

The following code generates a simple dataset, trains a supervised classifier and a semi-supervised (self-learning) classifier, and evaluates their performance:

library(dplyr,warn.conflicts = FALSE)
library(ggplot2,warn.conflicts = FALSE)

set.seed(2)
df <- generate2ClassGaussian(200, d=2, var = 0.2, expected=TRUE)

# Randomly remove labels
df <- df %>% add_missinglabels_mar(Class~.,prob=0.98) 

# Train a supervised and a self-learning (semi-supervised) classifier
g_nm <- NearestMeanClassifier(Class~.,df,prior=matrix(0.5,2))
g_self <- SelfLearning(Class~.,df,
                       method=NearestMeanClassifier,
                       prior=matrix(0.5,2))

# Plot dataset
df %>% 
  ggplot(aes(x=X1,y=X2,color=Class,size=Class)) +
  geom_point() +
  coord_equal() +
  scale_size_manual(values=c("-1"=3,"1"=3), na.value=1) +
  geom_linearclassifier("Supervised"=g_nm,
                  "Semi-supervised"=g_self)

# Evaluate performance: squared loss
mean(loss(g_nm,df))
mean(loss(g_self,df))

# Evaluate performance: error rate
mean(predict(g_nm,df)!=df$Class)
mean(predict(g_self,df)!=df$Class)
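
To make the idea behind the example more concrete, the following base-R sketch shows roughly what self-learning around a nearest mean classifier amounts to: fit on the labeled objects, predict labels for the unlabeled objects, refit on everything, and repeat until the imputed labels stop changing. It does not use RSSL and is not the package's implementation. The function names (nearest_mean_fit, nearest_mean_predict, self_learn), the simulated data, and the stopping rule are all assumptions made for illustration.

```r
# Conceptual sketch only: these helpers are hypothetical and not part of RSSL.

# Fit: one mean vector per class, computed from the rows passed in
nearest_mean_fit <- function(X, y) {
  classes <- sort(unique(y))
  means <- do.call(rbind, lapply(classes, function(cl) {
    colMeans(X[y == cl, , drop = FALSE])
  }))
  list(classes = classes, means = means)
}

# Predict: assign each row of X to the class with the closest mean
nearest_mean_predict <- function(model, X) {
  d2 <- sapply(seq_len(nrow(model$means)), function(k) {
    rowSums(sweep(X, 2, model$means[k, ], "-")^2)
  })
  d2 <- matrix(d2, nrow = nrow(X))  # keep matrix shape for a single row
  model$classes[max.col(-d2, ties.method = "first")]
}

# Self-learning: impute labels for unlabeled rows (NA in y) and refit
self_learn <- function(X, y, max_iter = 20) {
  labeled <- !is.na(y)
  model <- nearest_mean_fit(X[labeled, , drop = FALSE], y[labeled])
  y_imputed <- y
  for (i in seq_len(max_iter)) {
    y_new <- y
    y_new[!labeled] <- nearest_mean_predict(model, X[!labeled, , drop = FALSE])
    if (identical(y_new, y_imputed)) break  # imputed labels are stable
    y_imputed <- y_new
    model <- nearest_mean_fit(X, y_imputed)
  }
  model
}

# Simulated two-class problem with only three labels per class (assumed setup)
set.seed(1)
n <- 100
X <- rbind(matrix(rnorm(n * 2, mean = -1), ncol = 2),
           matrix(rnorm(n * 2, mean = 1), ncol = 2))
y_true <- rep(c(-1, 1), each = n)
keep <- c(sample(1:n, 3), sample((n + 1):(2 * n), 3))
y <- y_true
y[-keep] <- NA

supervised <- nearest_mean_fit(X[keep, , drop = FALSE], y[keep])
semi <- self_learn(X, y)

# Error rates measured against the full set of true labels
err_sup <- mean(nearest_mean_predict(supervised, X) != y_true)
err_semi <- mean(nearest_mean_predict(semi, X) != y_true)
```

Note that the error rates here are computed against y_true, which is kept separately from the partially labeled y. Whether self-learning helps or hurts depends on the data and on the few labels that happen to be retained, so the two error rates should be compared over repeated draws rather than a single run.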

Acknowledgement

Work on this package was supported by Project 23 of the Dutch national program COMMIT.

Package Details

Version: 0.9.8
License: GPL (>= 2)
Maintainer: Jesse Krijthe
Last Published: October 21st, 2025
Monthly Downloads: 745

Functions in RSSL (0.9.8)

EMLeastSquaresClassifier

An Expectation Maximization like approach to Semi-Supervised Least Squares Classification
LinearSVM

Linear SVM Classifier
LeastSquaresClassifier

Least Squares Classifier
LinearSVM-class

LinearSVM Class
LinearTSVM

Linear CCCP Transductive SVM classifier
LogisticLossClassifier-class

LogisticLossClassifier
KernelLeastSquaresClassifier

Kernelized Least Squares Classifier
LaplacianKernelLeastSquaresClassifier

Laplacian Regularized Least Squares Classifier
LaplacianSVM

Laplacian SVM classifier
LearningCurveSSL

Compute Semi-Supervised Learning Curve
MCLinearDiscriminantClassifier

Moment Constrained Semi-supervised Linear Discriminant Analysis.
PreProcessingPredict

Preprocess the input for a new set of test objects for classifier
NearestMeanClassifier

Nearest Mean Classifier
MajorityClassClassifier

Majority Class Classifier
PreProcessing

Preprocess the input to a classification function
LogisticRegressionFast

Logistic Regression implementation that uses R's glm
LinearDiscriminantClassifier

Linear Discriminant Classifier
MCPLDA

Maximum Contrastive Pessimistic Likelihood Estimation for Linear Discriminant Analysis
MCNearestMeanClassifier

Moment Constrained Semi-supervised Nearest Mean Classifier
LogisticLossClassifier

Logistic Loss Classifier
QuadraticDiscriminantClassifier

Quadratic Discriminant Classifier
LogisticRegression

(Regularized) Logistic Regression implementation
SVM

SVM Classifier
SSLDataFrameToMatrices

Convert data.frame to matrices for semi-supervised learners
S4VM-class

LinearSVM Class
RSSL-package

RSSL: Implementations of Semi-Supervised Learning Approaches for Classification
SelfLearning

Self-Learning approach to Semi-supervised Learning
clapply

Use mclapply conditional on not being in RStudio
cov_ml

Biased (maximum likelihood) estimate of the covariance matrix
TSVM

Transductive SVM classifier using the convex concave procedure
generate2ClassGaussian

Generate data from 2 Gaussian distributed classes
gaussian_kernel

Calculate the Gaussian kernel matrix
WellSVM_supervised

A degenerated version of WellSVM where the labels are complete, that is, supervised learning
add_missinglabels_mar

Throw out labels at random
find_a_violated_label

Find a violated label
diabetes

diabetes data for unit testing
measure_accuracy

Performance measures used in classifier evaluation
USMLeastSquaresClassifier

Updated Second Moment Least Squares Classifier
USMLeastSquaresClassifier-class

USMLeastSquaresClassifier
adjacency_knn

Calculate knn adjacency matrix
generateABA

Generate data from 2 alternating classes
predict,scaleMatrix-method

Predict for matrix scaling inspired by stdize from the PLS package
geom_classifier

Plot RSSL classifier boundary (deprecated)
missing_labels

Access the true labels for the objects with missing labels when they are stored as an attribute in a data frame
geom_linearclassifier

Plot linear RSSL classifier boundary
plot.CrossValidation

Plot CrossValidation object
S4VM

Safe Semi-supervised Support Vector Machine (S4VM)
rssl-predict

Predict using RSSL classifier
sample_k_per_level

Sample k indices per levels from a factor
print.CrossValidation

Print CrossValidation object
plot.LearningCurve

Plot LearningCurve object
stat_classifier

Plot RSSL classifier boundaries
stderror

Calculate the standard error of the mean from a vector of numbers
loss

Loss of a classifier or regression function
posterior

Class Posteriors of a classifier
losslogsum

LogsumLoss of a classifier or regression function
svdinvsqrtm

Taking the inverse of the square root of the matrix using the singular value decomposition
svdinv

Inverse of a matrix using the singular value decomposition
summary.CrossValidation

Summary of Crossvalidation results
WellSVM

WellSVM for Semi-supervised Learning
WellSVM_SSL

Convex relaxation of S3VM by label generation
wlda_loglik

Measures the expected log-likelihood of the LDA model defined by m, p, and iW on the data set a, where weights w are potentially taken into account
svdsqrtm

Taking the square root of a matrix using the singular value decomposition
wellsvm_direct

wellsvm implements the wellsvm algorithm as shown in [1].
true_labels

Access the true labels when they are stored as an attribute in a data frame
threshold

Refine the prediction to satisfy the balance constraint
wdbc

wdbc data for unit testing
wlda

Implements weighted likelihood estimation for LDA
testdata

Example semi-supervised problem
split_random

Randomly split dataset in multiple parts
print.LearningCurve

Print LearningCurve object
localDescent

Local descent
split_dataset_ssl

Create Train, Test and Unlabeled Set
logsumexp

Numerically more stable way to calculate log sum exp
projection_simplex

Project an n-dim vector y to the simplex Dn
svmproblem

Train SVM
wlda_error

Measures the expected error of the LDA model defined by m, p, and iW on the data set a, where weights w are potentially taken into account
c.CrossValidation

Merge results of cross-validation runs on single datasets into the same object
decisionvalues

Decision values returned by a classifier for a set of objects
generateCrescentMoon

Generate Crescent Moon dataset
generateTwoCircles

Generate data from 2 circles
df_to_matrices

Convert data.frame with missing labels to matrices
generateSpirals

Generate Intersecting Spirals
generateFourClusters

Generate Four Clusters dataset
generateSlicedCookie

Generate Sliced Cookie dataset
generateParallelPlanes

Generate Parallel planes
minimaxlda

Implements weighted likelihood estimation for LDA
losspart

Loss of a classifier or regression function evaluated on partial labels
scaleMatrix

Matrix centering and scaling
harmonic_function

Direct R Translation of Xiaojin Zhu's Matlab code to determine harmonic solution
solve_svm

SVM solve.QP implementation
svmlin

svmlin implementation by Sindhwani & Keerthi (2006)
rssl-formatting

Show RSSL classifier
line_coefficients

Loss of a classifier or regression function
svmlin_example

Test data from the svmlin implementation
responsibilities

Responsibilities assigned to the unlabeled objects
ICLinearDiscriminantClassifier

Implicitly Constrained Semi-supervised Linear Discriminant Classifier
EMNearestMeanClassifier

Semi-Supervised Nearest Mean Classifier using Expectation Maximization
EMLinearDiscriminantClassifier

Semi-Supervised Linear Discriminant Analysis using Expectation Maximization
ICLeastSquaresClassifier

Implicitly Constrained Least Squares Classifier
BaseClassifier

Classifier used for enabling shared documenting of parameters
CrossValidationSSL

Cross-validation in semi-supervised setting
EntropyRegularizedLogisticRegression

Entropy Regularized Logistic Regression
GRFClassifier

Label propagation using Gaussian Random Fields and Harmonic functions
KernelICLeastSquaresClassifier

Kernelized Implicitly Constrained Least Squares Classification
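
Several entries in the index above are small numerical helpers whose one-line descriptions are terse. The base-R sketch below illustrates the standard definitions behind four of them (logsumexp, stderror, cov_ml, svdinv). The sketch_* functions are hypothetical stand-ins written for this illustration. They are not the RSSL functions, and the package's own versions may differ in arguments and return shape.

```r
# Illustrative base-R versions; the sketch_* names are not RSSL functions.

# log(sum(exp(x))) computed stably by factoring out the maximum
sketch_logsumexp <- function(x) {
  m <- max(x)
  m + log(sum(exp(x - m)))
}

# Standard error of the mean: sample standard deviation over sqrt(n)
sketch_stderror <- function(x) {
  sd(x) / sqrt(length(x))
}

# Biased (maximum likelihood) covariance: divide by n instead of n - 1
sketch_cov_ml <- function(X) {
  n <- nrow(X)
  cov(X) * (n - 1) / n
}

# Matrix inverse through the SVD: X = U D V', so the inverse is V D^-1 U'
sketch_svdinv <- function(X) {
  s <- svd(X)
  s$v %*% diag(1 / s$d, nrow = length(s$d)) %*% t(s$u)
}

# The naive formula overflows for large inputs; the stable one stays finite
x <- c(1000, 1001, 1002)
naive <- log(sum(exp(x)))
stable <- sketch_logsumexp(x)

# Check the SVD-based inverse on a small, well-conditioned matrix
A <- matrix(c(2, 1, 1, 3), nrow = 2)
residual <- max(abs(sketch_svdinv(A) %*% A - diag(2)))
```

The SVD-based inverse in this sketch assumes a square, non-singular matrix; it makes no attempt to handle zero singular values.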