
varSelRF (version 0.7-9)

selProbPlot: Selection probability plot for variable importance from random forests

Description

Plot, for each of the top-ranked \(k\) variables from the original sample, the probability that it is included among the top-ranked \(k\) variables in the bootstrap samples.

Usage

selProbPlot(object, k = c(20, 100),
            color = TRUE,
            legend = FALSE,
            xlegend = 68,
            ylegend = 0.93,
            cexlegend = 1.4,
            main = NULL,
            xlab = "Rank of gene",
            ylab = "Selection probability",
            pch = 19, ...)

Value

Used for its side effect of producing a plot. A single plot shows the "selection probability plot" for the top \(k\) variables (those with the largest variable importance), for each of the two values in k. By default, the top 20 and the top 100 are shown, colored blue and red respectively.

Arguments

object

An object of class varSelRFBoot such as returned by the varSelRFBoot function.

k

A two-component vector giving the two values of \(k\) (the numbers of top-ranked variables) for which you want the plots.

color

If TRUE a color plot; if FALSE, black and white.

legend

If TRUE, show a legend.

xlegend

The x-coordinate for the legend.

ylegend

The y-coordinate for the legend.

cexlegend

The cex argument for the legend.

main

main for the plot.

xlab

xlab for the plot.

ylab

ylab for the plot.

pch

pch for the plot.

...

Additional arguments to plot.

Author

Ramon Diaz-Uriarte rdiaz02@gmail.com

Details

Pepe et al. (2003) suggested the use of selection probability plots to evaluate the stability of, and our confidence in, the selection of "relevant genes." That paper also presents several more sophisticated ideas that are not implemented here.
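The quantity being plotted can be illustrated without the package. The following is a minimal base R sketch, assuming simulated importances; it is not the internal code of selProbPlot, and the names sel_prob, imp_orig, and imp_boot are hypothetical. For each of the top k variables in the original sample, it computes the fraction of bootstrap samples in which that variable is also ranked among the top k.

```r
## Illustrative sketch only: simulated importances, not varSelRF output.
set.seed(1)
n_vars <- 30   # hypothetical number of variables
n_boot <- 50   # hypothetical number of bootstrap samples

## Importances in the original sample, and a noisy copy per bootstrap sample
## (one column per bootstrap sample).
imp_orig <- sort(rexp(n_vars), decreasing = TRUE)
imp_boot <- sapply(seq_len(n_boot),
                   function(b) imp_orig + rnorm(n_vars, sd = 0.5))

sel_prob <- function(imp_orig, imp_boot, k) {
  ## Indices of the top k variables in the original sample, best first.
  top_orig <- order(imp_orig, decreasing = TRUE)[seq_len(k)]
  ## For each bootstrap sample, flag the variables ranked in its top k.
  in_top_k <- apply(imp_boot, 2,
                    function(imp) rank(-imp, ties.method = "first") <= k)
  ## Proportion of bootstrap samples in which each variable is in the top k.
  rowMeans(in_top_k)[top_orig]
}

p5  <- sel_prob(imp_orig, imp_boot, k = 5)
p10 <- sel_prob(imp_orig, imp_boot, k = 10)

## Both values of k in a single plot, mirroring the layout described above.
plot(seq_along(p10), p10, ylim = c(0, 1), pch = 19, col = "red",
     xlab = "Rank of gene", ylab = "Selection probability")
points(seq_along(p5), p5, pch = 19, col = "blue")
```

Values close to 1 indicate variables whose top-k membership is stable across bootstrap samples; low values suggest that the selection depends heavily on the particular sample.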

References

Breiman, L. (2001) Random forests. Machine Learning, 45, 5--32.

Diaz-Uriarte, R., Alvarez de Andres, S. (2006) Gene selection and classification of microarray data using random forest. BMC Bioinformatics, 7:3. doi:10.1186/1471-2105-7-3

Pepe, M. S., Longton, G., Anderson, G. L. & Schummer, M. (2003) Selecting differentially expressed genes from microarray experiments. Biometrics, 59, 133--142.

Svetnik, V., Liaw, A., Tong, C. & Wang, T. (2004) Application of Breiman's random forest to modeling structure-activity relationships of pharmaceutical molecules. Pp. 334--343 in F. Roli, J. Kittler, and T. Windeatt (eds.). Multiple Classifier Systems, Fifth International Workshop, MCS 2004, Proceedings, 9--11 June 2004, Cagliari, Italy. Lecture Notes in Computer Science, vol. 3077. Berlin: Springer.

See Also

randomForest, varSelRF, varSelRFBoot, randomVarImpsRFplot, randomVarImpsRF

Examples

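## This is a small example, but it can take some time to run.
library(varSelRF)

## Simulated data: 25 samples and 15 variables; the first two
## variables are shifted upwards in the first 10 samples (class "A").
x <- matrix(rnorm(25 * 15), ncol = 15)
x[1:10, 1:2] <- x[1:10, 1:2] + 2
cl <- factor(c(rep("A", 10), rep("B", 15)))

## Variable selection on the original sample.
rf.vs1 <- varSelRF(x, cl, ntree = 200, ntreeIterat = 100,
                   vars.drop.frac = 0.2)
## Bootstrap the selection procedure, without using a cluster.
rf.vsb <- varSelRFBoot(x, cl,
                       bootnumber = 10,
                       usingCluster = FALSE,
                       srf = rf.vs1)
## Selection probability plot for the top 5 and the top 10 variables.
selProbPlot(rf.vsb, k = c(5, 10), legend = TRUE,
            xlegend = 8, ylegend = 0.8)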
