Description:

Bootstrap cross-validation techniques are implemented to estimate the generalization performance of the model(s), i.e. the performance that can be expected in new subjects.
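The idea behind bootstrap cross-validation can be sketched in a few lines of base R (this sketch is not part of the package; the simulated data, variable names, and number of bootstrap samples are made up): models are repeatedly trained in bootstrap samples and their performance, here the Brier score, is evaluated in the subjects left out of each sample.

set.seed(1)
d <- data.frame(Y=rbinom(200,1,.4),X=rnorm(200))
B <- 20
oob.brier <- sapply(seq_len(B),function(b){
  in.bag <- sample(nrow(d),replace=TRUE)        # bootstrap training sample (drawn with replacement)
  out.bag <- setdiff(seq_len(nrow(d)),in.bag)   # subjects not drawn: the validation set
  fit <- glm(Y~X,data=d[in.bag,],family="binomial")
  p <- predict(fit,newdata=d[out.bag,],type="response")
  mean((d$Y[out.bag]-p)^2)                      # Brier score among the left-out subjects
})
mean(oob.brier)                                 # bootstrap cross-validated Brier score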
Usage:

Roc(object,...)
## S3 method for class 'list':
Roc(object,
formula,
data,
splitMethod="noSplitMethod",
noinf.method=c("simulate"),
simulate="reeval",
B,
M,
breaks,
crRatio=1,
RocAverageMethod="vertical",
RocAverageGrid=switch(RocAverageMethod,
"vertical"=seq(0,1,.01),
"horizontal"=seq(1,0,-.01)),
model.args=NULL,
model.parms=NULL,
keepModels=FALSE,
keepSampleIndex=FALSE,
keepCrossValRes=FALSE,
keepNoInfSimu,
slaveseed,
na.accept=0,
verbose=TRUE,
...)
Brier(object,...)
## S3 method for class 'list':
Brier(object,
formula,
data,
splitMethod="noSplitMethod",
noinf.method=c("simulate"),
simulate="reeval",
crRatio=1,
B,
M,
model.args=NULL,
model.parms=NULL,
keepModels=FALSE,
keepSampleIndex=FALSE,
keepCrossValRes=FALSE,
na.accept=0,
verbose=TRUE,
...)
## S3 method for class 'glm':
Brier(object,formula,data,...)
## S3 method for class 'lrm':
Brier(object,formula,data,...)
## S3 method for class 'rpart':
Brier(object,formula,data,...)
## S3 method for class 'randomForest':
Brier(object,formula,data,...)
## S3 method for class 'glm':
Roc(object,formula,data,...)
## S3 method for class 'lrm':
Roc(object,formula,data,...)
## S3 method for class 'rpart':
Roc(object,formula,data,...)
## S3 method for class 'randomForest':
Roc(object,formula,data,...)
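The class-specific methods accept a single fitted model directly; a brief sketch, not part of the original page, assuming these methods behave like the list method above:

d <- data.frame(y=factor(rbinom(100,1,.4)),x=rnorm(100))
fit <- glm(y~x,data=d,family="binomial")
Roc(fit,data=d)      # dispatches to the glm method
Brier(fit,data=d)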
Arguments:

object: A list of R objects representing fitted prediction models (see examples). For internal validation, each element must provide a call that can be evaluated to such an R-object.
formula: A formula whose left hand side identifies the binary outcome variable in data.
data: A data frame in which the performance of the models is assessed.
splitMethod: Method for estimating the generalization performance. none (the default, "noSplitMethod"): assess the models in the same data where they are fitted; this yields the apparent or re-substitution performance, which often overestimates the generalization performance. Other choices are "Bootcv" (bootstrap cross-validation), "Boot632" and "Boot632plus" (the .632 and .632+ estimates, which combine the apparent with the bootstrap cross-validated performance, see references), and "NoInf" (performance in permuted data).
noinf.method: Method for obtaining the no-information performance; default is "simulate".
simulate: If "reeval", the models are re-evaluated in the current permuted data for computing the no-information Roc curve.
B: The number of data splits; the meaning depends on splitMethod. When splitMethod is in c("Bootcv","Boot632","Boot632plus") the default is 100. For splitMethod="cvK" B is the number of cross-validation repetitions.
breaks: The cutpoints at which the Roc curve is computed. Defaults to seq(0,1,.01) for the Roc.list method and to sort(unique(breaks)) for the default method.
RocAverageMethod: Either "vertical" or "horizontal"; determines how the cross-validated Roc curves are averaged. See the references for details.
RocAverageGrid: The grid of values at which the cross-validated Roc curves are averaged (see RocAverageMethod).
model.args: A list of additional arguments that are passed to the predictStatusProb methods. The list must have an entry for each entry in object.
model.parms: A list with one entry for each entry in object. Each entry names parts of the value of the fitted models that should be extracted and added to the output (see value).
keepModels: If FALSE, keep only the names of the elements of object. If "Call", keep the call of the elements of object. Otherwise, add the objects as they are to the output.
keepSampleIndex: If FALSE, remove the cross-validation index (which tells who was in the learning set and who in the validation set); otherwise it is included in the method part of the output list.
keepCrossValRes: If TRUE, add all B cross-validation results to the output (see value).
keepNoInfSimu: If TRUE, add the B results in permuted data (for the no-information performance) to the output (see value). Defaults to FALSE.
slaveseed: A vector of seeds, of length B, to be given to the slaves in parallel computing.
verbose: If TRUE, the procedure reports details of its progress, e.g. it prints the current step in cross-validation procedures.
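To illustrate the splitMethod argument, here is a brief sketch that is not part of the original page (the simulated data are made up, and the splitMethod spelling follows the examples below): without data splitting the apparent (re-substitution) performance is returned, whereas "bootcv" trains the models in bootstrap samples and evaluates them in the left-out subjects.

set.seed(2)
d <- data.frame(y=rbinom(200,1,.5),x=rnorm(200))
fit <- glm(y~x,data=d,family="binomial")
Brier(list(fit),data=d,verbose=FALSE)                            # apparent (re-substitution) Brier score
Brier(list(fit),data=d,splitMethod="bootcv",B=50,verbose=FALSE)  # bootstrap cross-validated Brier score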
Value:

An object of class Roc or class Brier for which print.Roc, summary.Roc, and plot.Roc (only Roc) methods are available. The object includes the following components, depending on the splitMethod:

Roc: The Roc curve(s) of the model(s), estimated according to the splitMethod.
PredRoc: A matrix where each column represents the estimated prediction error of a fit. Only if splitMethod is one of "NoInf", "Bootcv", "Boot632" or "Boot632plus"; otherwise the estimate is the apparent performance and is stored in Roc as explained above.
BootcvRoc: The bootstrap cross-validated Roc curve(s). Only if splitMethod is one of "Boot632" or "Boot632plus". When splitMethod="Bootcv" the BootcvRoc is stored in the component PredRoc.
NoInfRoc: The no-information Roc curve(s) obtained in permuted data. Only if splitMethod is one of "Bootcv", "Boot632", or "Boot632plus". For splitMethod="NoInf" the NoInfRoc is stored in the component PredRoc.
weight: The weight with which the AppRoc (apparent Roc curve) and the BootcvRoc are combined. Only if splitMethod is one of "Boot632" or "Boot632plus".
overfit: Estimated overfit of the model(s). See references. Only if splitMethod is one of "Boot632" or "Boot632plus".

Details:

The function can be extended by a new predictStatusProb method: for example, to assess a prediction model which evaluates to a myclass object, one defines a function called predictStatusProb.myclass with arguments object, newdata, cutpoints, train.data, ..., like this:

myFit=myModel(Y~X,data=dat)
class(myFit)="myclass"
predictStatusProb.myclass <-
  function(object,newdata,cutpoints,train.data,...){
    out <- predict(object,data=newdata,method="probabilities")
    out
  }
Such a function takes the object which was fitted with train.data and
derives a matrix with predicted event status probabilities for each subject
in newdata (rows) and each cutpoint (column) of the response variable
that defines an event status.
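As a concrete illustration, here is a hedged sketch (not part of the original page) of such a method for a hypothetical class "meanclass", whose fitted object simply predicts the training prevalence for every subject; the constructor meanModel, the class name, and the return format are illustrative assumptions following the description above (one row per subject in newdata, one column per cutpoint).

meanModel <- function(formula,data){
  ## hypothetical 'model': store the outcome prevalence of the training data
  ## (assumes a 0/1 coded outcome)
  y <- model.response(model.frame(formula,data))
  structure(list(prevalence=mean(as.numeric(as.character(y)))),class="meanclass")
}
predictStatusProb.meanclass <- function(object,newdata,cutpoints,train.data,...){
  ## one row per subject in newdata, one column per cutpoint
  matrix(object$prevalence,nrow=NROW(newdata),ncol=length(cutpoints))
}
## a fit created with meanModel(Y~X1,data=dat) could then be passed to Roc or Brier in a list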
Currently implemented are predictStatusProb methods for the following R functions: glm, lrm, rpart, and randomForest.
References:

Efron, B. and Tibshirani, R. (1997). Improvements on cross-validation: The .632+ bootstrap method. Journal of the American Statistical Association 92, 548--560.

Wehberg, S. and Schumacher, M. (2004). A comparison of nonparametric error rate estimation methods in classification problems. Biometrical Journal 46, 35--47.
Examples:

## Generate some data with binary response Y
## depending on X1 and X2 and X1*X2
N <- 400
X1 <- rnorm(N)
X2 <- rbinom(N,1,.4)
expit <- function(x) exp(x)/(1+exp(x))
lp <- expit(1 + X1 + X2 - X1*X2)
Y <- factor(rbinom(N,1,lp))
dat <- data.frame(Y=Y,X1=X1,X2=X2)
## fit a logistic model
lm1 <- glm(Y~X1,data=dat,family="binomial")
lm2 <- glm(Y~X1+X2,data=dat,family="binomial")
r1=Roc(list(lm1,lm2),verbose=0,crRatio=1)
summary(r1)
Brier(list(lm1,lm2),verbose=0,crRatio=1)
# crossing curves
set.seed(18)
N=500
Y=rbinom(N,1,.5)
X1=rnorm(N)
X1[Y==1]=rnorm(sum(Y==1),mean=rbinom(sum(Y==1),1,.5))
X2=rnorm(N)
X2[Y==0]=rnorm(sum(Y==0),mean=rbinom(sum(Y==0),1,.5))
dat <- data.frame(Y=Y,X1=X1,X2=X2)
lm1 <- glm(Y~X1,data=dat,family="binomial")
lm2 <- glm(Y~X2,data=dat,family="binomial")
plot(Roc(list(lm1,lm2),data=dat,verbose=0,crRatio=1))
library(randomForest)
dat$Y=factor(dat$Y)
rf <- randomForest(Y~X2,data=dat)
rocCV=Roc(list(RandomForest=rf,LogisticRegression=lm2),
data=dat,
verbose=TRUE,
splitMethod="bootcv",
B=10,
crRatio=1)
plot(rocCV,diag=TRUE)
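## Not part of the original examples: the same model comparison with the
## bootstrap cross-validated Brier score (argument spellings follow the
## calls above).
Brier(list(RandomForest=rf,LogisticRegression=lm2),
      data=dat,
      splitMethod="bootcv",
      B=10,
      crRatio=1,
      verbose=FALSE)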