plot.logreg: Plots for Logic Regression

Description

Makes plots for objects fitted by logreg.

Usage

# S3 method for logreg
plot(x, pscript=FALSE, title=TRUE, ...)

Arguments

object of class logreg, typically the result of the function logreg.

pscript

if TRUE all plots will be stored in postscript files with distinct names.

title

if TRUE this generates a title for some plots, typically listing the number of trees and the model size.

...

graphical parameters can be given as arguments to plot.

Value

The type of the plots generated depends on the value of x$select.

if select = 1 the fitted trees for the results of find the best scoring model of any size are plotted;

if select = 2 or select = 6 the fitted trees are plotted, as well as a graph of the scores for various model sizes versus model size for the results of find the best scoring models for various sizes, or fit a sequence of logic regression models using a stepwise greedy algorithm;

if select = 3 training and test set scores as a function of model size for the results of carry out cross-validation for model selection are plotted;

if select = 4 a histogram of the permutation scores with various important values highlighted for the results of carry out a permutation test to check for signal in the data are plotted;

if select = 5 a series of histograms of the permutation scores with various important values highlighted for the results of carry out a permutation test for model selection are plotted;

if select = 7 a histogram of the size frequency of the fitted models, a histogram of the frequency that predictors are marginally in the model, and (if that information was collected) an image plot for the observed frequency that predictors were jointly in the model, and an image plot of the observed/expected ratio of that joint frequency. See Kooperberg and Ruczinski (2004) for the definition of the expected frequency.

References

Ruczinski I, Kooperberg C, LeBlanc ML (2003). Logic Regression, Journal of Computational and Graphical Statistics, 12, 475-511.

Ruczinski I, Kooperberg C, LeBlanc ML (2002). Logic Regression - methods and software. Proceedings of the MSRI workshop on Nonlinear Estimation and Classification (Eds: D. Denison, M. Hansen, C. Holmes, B. Mallick, B. Yu), Springer: New York, 333-344.

Kooperberg C, Ruczinki I (2005). Identifying interacting SNPs using Monte Carlo Logic Regression, Genetic Epidemiology, 28, 157-170.

Selected chapters from the dissertation of Ingo Ruczinski, available from http://kooperberg.fhcrc.org/logic/documents/ingophd-logic.pdf

Examples

Run this code

# NOT RUN {
data(logreg.savefit1,logreg.savefit2,logreg.savefit3,logreg.savefit4,
     logreg.savefit5,logreg.savefit6,logreg.savefit7)
#
# fit a single model
# myanneal <- logreg.anneal.control(start = -1, end = -4, iter = 25000, update = 1000)
# logreg.savefit1 <- logreg(resp = logreg.testdat[,1], bin=logreg.testdat[, 2:21],
#                type = 2, select = 1, ntrees = 2, anneal.control = myanneal)
# the best score should be in the 0.96-0.98 range
plot(logreg.savefit1)
#
# fit multiple models
# myanneal2 <- logreg.anneal.control(start = -1, end = -4, iter = 25000, update = 0)
# logreg.savefit2 <- logreg(select = 2, ntrees = c(1,2), nleaves =c(1,7), 
#                oldfit = logreg.savefit1, anneal.control = myanneal2)
plot(logreg.savefit2)
# After an initial steep decline, the scores only get slightly better
# for models with more than four leaves and two trees. 
#
# cross validation
# logreg.savefit3 <- logreg(select = 3, oldfit = logreg.savefit2)
plot(logreg.savefit3)
# 4 leaves, 2 trees should give the best test set score
#
# null model test
# logreg.savefit4 <- logreg(select = 4, anneal.control = myanneal2, oldfit = logreg.savefit1)
plot(logreg.savefit4)
# A histogram of the 25 scores obtained from the permutation test. Also shown
# are the scores for the best scoring model with one logic tree, and the null
# model (no tree). As the permutation scores are not even close to the score
# of the best model with one tree (fit on the original data), there is strong
# evidence against the null hypothesis that there was no signal in the data. 
#
# Permutation tests
# logreg.savefit5 <- logreg(select = 5, oldfit = logreg.savefit2)
plot(logreg.savefit5)
# The permutation scores improve until we condition on a model with two
# trees and four leaves, and then do not change very much anymore. This 
# indicates that the best model has indeed four leaves.
#
# a greedy sequence
# logreg.savefit6 <- logreg(select = 6, ntrees = 2, nleaves =c(1,12), oldfit = logreg.savefit1)
plot(logreg.savefit6)
# Monte Carlo Logic Regression
# logreg.savefit7 <- logreg(select = 7, oldfit = logreg.savefit1, mc.control=
#                logreg.mc.control(nburn=1000, niter=100000, hyperpars=log(2)))
plot(logreg.savefit7)
# }

Run the code above in your browser using DataLab