RFlocalfdr

Provides a method for setting the significance level of the MDI (mean decrease in impurity) importances from a random forest model, based on an empirical Bayes model. For full details, see "Thresholding Gini Variable Importance with a single trained Random Forest: An Empirical Bayes Approach" (Robert Dunne, Roc Reguant, Priya Ramarao-Milne, Piotr Szul, Letitia Sng, Mischa Lundberg, Natalie A. Twine, Denis C. Bauer): https://www.biorxiv.org/content/10.1101/2022.04.06.487300v2

Until I figure out how to manage the CRAN repository:

  • the data sets are not available in the CRAN version
  • many of the examples are enclosed in \dontrun{} blocks, so they are not run automatically

Install from CRAN:

install.packages("RFlocalfdr")

Or from GitHub:

devtools::install_github("parsifal9/RFlocalfdr", build_vignettes = TRUE)

Usage

library(RFlocalfdr)
vignette("simulated",package="RFlocalfdr")
vignette("Smoking",package="RFlocalfdr")
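
The MDI importances that the package thresholds come from a fitted random forest. The following is a minimal sketch of how such importances can be obtained, assuming the ranger package is installed; it uses the built-in iris data purely for illustration and does not call any RFlocalfdr function. The variable names `rf` and `imp` are placeholders, not names taken from the vignettes.

```r
# Sketch only: obtain MDI (impurity) importances from a ranger forest.
# The iris data set is an illustrative stand-in, not the package's own data.
imp <- NULL
if (requireNamespace("ranger", quietly = TRUE)) {
  set.seed(1)
  rf <- ranger::ranger(Species ~ ., data = iris,
                       num.trees = 500, importance = "impurity")
  # Named numeric vector with one impurity importance per predictor.
  imp <- rf$variable.importance
  print(sort(imp, decreasing = TRUE))
}
```

The vignettes listed above show how importances of this kind are passed on to the package's own functions.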

License

GNU General Public License

Monthly Downloads

147

Version

0.9

License

GPL (>= 3)

Maintainer

Robert Dunne

Last Published

January 30th, 2025

Functions in RFlocalfdr (0.9)

my_PIMP

my_PIMP is based on the PIMP function from the vita package. ‘PIMP’ implements the test approach of Altmann et al. (2010) for the permutation variable importance measure ‘VarImp’ returned by the randomForest package (Liaw and Wiener (2002)) for classification and regression.

significant.genes

significant.genes

my_ranger_PIMP

my_ranger_PIMP is based on the PIMP function from the vita package. ‘PIMP’ implements the test approach of Altmann et al. (2010) for the permutation variable importance measure ‘VarImp’ returned by the randomForest package (Liaw and Wiener (2002)) for classification and regression.

propTrueNullByLocalFDR

propTrueNullByLocalFDR

run.it.importances

run.it.importances

plotQ

plotQ

my.dsn

my.dsn

fit.to.data.set.wrapper

fit.to.data.set.wrapper

determine.C

determine.C

local.fdr

local fdr

fit.to.data.set

fit.to.data.set

determine_cutoff

evaluate a measure that can be used to determine a significance level for the Mean Decrease in Impurity measure returned by a Random Forest model

f.fit

fit a spline to the histogram of imp

count_variables

count the number of times each variable is used in a ranger random forest

imp20000

20000 importance values

my.test1fun

my.test1fun
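
The f.fit entry above mentions fitting a spline to the histogram of the importances. As a rough illustration of that general idea, the sketch below smooths histogram counts with a Poisson regression on a natural spline basis, using only base R and the bundled splines package. It is not the package's implementation of f.fit: the simulated vector `imp_sim`, the number of bins, and the choice `df = 7` are all assumptions made for this example.

```r
library(splines)  # ships with every standard R installation

set.seed(42)
# Simulated stand-in for importances: a large "null" bulk plus a small shifted group.
imp_sim <- c(rnorm(1900, mean = 0, sd = 1), rnorm(100, mean = 4, sd = 1))

# Bin the values without plotting; keep bin midpoints and counts.
h <- hist(imp_sim, breaks = 60, plot = FALSE)
dat <- data.frame(x = h$mids, counts = h$counts)

# Poisson regression of the bin counts on a natural spline in the midpoints.
fit <- glm(counts ~ ns(x, df = 7), family = poisson, data = dat)
smooth_counts <- fitted(fit)

# Rescale the smoothed counts so they integrate to one over the bins.
bin_width <- diff(h$breaks)[1]
dens <- smooth_counts / (sum(smooth_counts) * bin_width)
```

Plotting `h` and overlaying `lines(dat$x, smooth_counts)` gives a quick visual check of how well the chosen degrees of freedom follow the histogram.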