
arf: Adversarial random forests

Introduction

Adversarial random forests (ARFs) recursively partition data into fully factorized leaves, where features are jointly independent. The procedure is iterative, with alternating rounds of generation and discrimination. Data become increasingly realistic at each round, until original and synthetic samples can no longer be reliably distinguished. This is useful for several unsupervised learning tasks, such as density estimation and data synthesis. Methods for both are implemented in this package. ARFs naturally handle unstructured data with mixed continuous and categorical covariates. They inherit many of the benefits of RFs, including speed, flexibility, and solid performance with default parameters.
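To make the "generation" half of a round concrete, here is a minimal base R sketch of how a first batch of synthetic data could be built. It assumes the simplest possible generator, sampling every column independently from its own marginal; the names x_real, x_synth, and dat are illustrative and not part of the package. A discriminator (a random forest in the actual method) would then be trained to separate the two labels.

```r
set.seed(42)

# Real data: four numeric columns plus the Species factor
x_real <- iris

# Assumed first-round generator: resample each column independently,
# which preserves the marginals but breaks dependence between features
x_synth <- as.data.frame(lapply(x_real, function(col) {
  sample(col, size = length(col), replace = TRUE)
}))

# Label real rows 1 and synthetic rows 0; this is the discriminator's training set
dat <- rbind(
  data.frame(x_real, y = 1L),
  data.frame(x_synth, y = 0L)
)

# Petal length and width are strongly correlated in the real data,
# and close to uncorrelated after independent resampling
cor_real  <- cor(x_real$Petal.Length, x_real$Petal.Width)
cor_synth <- cor(x_synth$Petal.Length, x_synth$Petal.Width)
print(c(real = cor_real, synthetic = cor_synth))
```

The gap between the two correlations is what the discriminator can exploit in early rounds. Later rounds sample within leaves instead of globally, so that gap shrinks.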

Installation

The arf package is available on CRAN:

install.packages("arf")

To install the development version from GitHub using devtools, run:

devtools::install_github("bips-hb/arf")

Examples

Using Fisher's iris dataset, we train an ARF and estimate distribution parameters:

# Load the package
library(arf)

# Train the ARF
arf <- adversarial_rf(iris)

# Estimate distribution parameters
psi <- forde(arf, iris)
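The defaults are usually a reasonable starting point, but the forest can be tuned. The sketch below assumes the num_trees, min_node_size, verbose, and parallel arguments of adversarial_rf() as documented for version 0.2.x; the specific values are arbitrary choices for illustration. Setting parallel = FALSE avoids the need to register a parallel backend.

```r
library(arf)
set.seed(1)

# Train a slightly larger forest with bigger leaves, silently and sequentially
arf <- adversarial_rf(iris, num_trees = 20, min_node_size = 5,
                      verbose = FALSE, parallel = FALSE)

# Estimate leaf-level distribution parameters
psi <- forde(arf, iris, parallel = FALSE)

# The trained forest is a ranger object; psi is a list of parameters
class(arf)
names(psi)
```

Larger values of min_node_size give coarser leaves, which trades detail in the density estimate for more stable parameter estimates in each leaf.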

Density estimation

To estimate the mean log-likelihood of the data under the fitted model:

mean(lik(psi, iris, arf = arf))
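Averaging hides the per-observation values, which are often useful on their own, for example to flag atypical rows. The following sketch assumes the lik(params, query, ...) argument order documented for version 0.2.x, with the forest passed through the optional arf argument; the choice of five rows is arbitrary.

```r
library(arf)
set.seed(1)

arf <- adversarial_rf(iris, verbose = FALSE, parallel = FALSE)
psi <- forde(arf, iris, parallel = FALSE)

# One log-likelihood per row of the query data
ll <- lik(psi, iris, arf = arf, log = TRUE, parallel = FALSE)
summary(ll)

# Rows with the lowest log-likelihood are the least typical under the model
iris[order(ll)[1:5], ]
```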

Generative modeling

To generate 100 synthetic samples:

forge(psi, 100)
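A quick sanity check on synthetic data is to compare simple summaries against the original. The sketch below is one way to do that, assuming forge() returns a data frame with the same columns as the training data; the comparison statistics are our own choice, not something the package prescribes.

```r
library(arf)
set.seed(1)

arf <- adversarial_rf(iris, verbose = FALSE, parallel = FALSE)
psi <- forde(arf, iris, parallel = FALSE)

# Draw 100 synthetic rows
synth <- forge(psi, n_synth = 100, parallel = FALSE)

# Column means of the numeric features, real vs. synthetic
rbind(real = colMeans(iris[, 1:4]), synthetic = colMeans(synth[, 1:4]))

# Class balance in the synthetic sample
table(synth$Species)
```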

Conditional expectations

To estimate the mean of some variable(s), optionally conditioned on some event(s):

evi <- data.frame(Species = "setosa")
expct(psi, query = "Sepal.Length", evidence = evi)
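The same evidence data frame can drive conditional sampling as well as conditional expectations. This sketch assumes the evidence argument of forge() and the vector-valued query argument of expct() as documented for version 0.2.x; synth_setosa and cond_mean are illustrative names.

```r
library(arf)
set.seed(1)

arf <- adversarial_rf(iris, verbose = FALSE, parallel = FALSE)
psi <- forde(arf, iris, parallel = FALSE)

# Condition on the event Species == "setosa"
evi <- data.frame(Species = "setosa")

# Conditional means of two variables at once
cond_mean <- expct(psi, query = c("Sepal.Length", "Sepal.Width"),
                   evidence = evi, parallel = FALSE)
cond_mean

# 50 synthetic rows drawn from the conditional distribution
synth_setosa <- forge(psi, n_synth = 50, evidence = evi, parallel = FALSE)
table(synth_setosa$Species)
```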

For more detailed examples, see the package vignette.

Python library

A Python implementation of ARF, arfpy, is available on PyPI. A development version is also maintained.

References

  • Watson, D. S., Blesch, K., Kapar, J. & Wright, M. N. (2023). Adversarial random forests for density estimation and generative modeling. In Proceedings of the 26th International Conference on Artificial Intelligence and Statistics.
  • Blesch, K., Koenen, N., Kapar, J., Golchian, P., Burk, L., Loecher, M. & Wright, M. N. (2025). Conditional feature importance with generative modeling using adversarial random forests. In Proceedings of the 39th AAAI Conference on Artificial Intelligence.

Package details

Version: 0.2.4
License: GPL (>= 3)
Maintainer: Marvin Wright
Last published: February 24th, 2025
Monthly downloads: 366

Functions in arf (0.2.4)

  • col_rename_all: Rename all problematic columns with col_rename()
  • darf: Shortcut likelihood function
  • impute: Missing value imputation with ARF
  • adversarial_rf: Adversarial Random Forests
  • earf: Shortcut expectation function
  • expct: Expected Value
  • lik: Likelihood Estimation
  • post_x: Post-process data
  • prep_x: Preprocess input data
  • rarf: Shortcut sampling function
  • forge: Forests for Generative Modeling
  • cforde: Compute conditional circuit parameters
  • forde: Forests for Density Estimation
  • arf-package: arf: Adversarial Random Forests
  • prep_cond: Preprocess conditions
  • resample: Safer version of sample()
  • col_rename: Adaptive column renaming
  • which.max.random: which.max() with random tie-breaking