Bootstrap cross-validation techniques are implemented to estimate the generalization performance of the model(s), i.e. the performance that can be expected in new subjects.
Roc(object,...)
## S3 method for class 'list':
Roc(object,
formula,
data,
splitMethod="noSplitMethod",
noinf.method=c("simulate"),
simulate="reeval",
B,
M,
breaks,
crRatio=1,
RocAverageMethod="vertical",
RocAverageGrid=switch(RocAverageMethod,
"vertical"=seq(0,1,.01),
"horizontal"=seq(1,0,-.01)),
model.args=NULL,
model.parms=NULL,
keepModels=FALSE,
keepSampleIndex=FALSE,
keepCrossValRes=FALSE,
keepNoInfSimu,
slaveseed,
na.accept=0,
verbose=TRUE,
...)
Brier(object,...)
## S3 method for class 'list':
Brier(object,
formula,
data,
splitMethod="noSplitMethod",
noinf.method=c("simulate"),
simulate="reeval",
crRatio=1,
B,
M,
model.args=NULL,
model.parms=NULL,
keepModels=FALSE,
keepSampleIndex=FALSE,
keepCrossValRes=FALSE,
na.accept=0,
verbose=TRUE,
...)
## S3 method for class 'glm':
Brier(object,formula,data,...)
## S3 method for class 'lrm':
Brier(object,formula,data,...)
## S3 method for class 'rpart':
Brier(object,formula,data,...)
## S3 method for class 'randomForest':
Brier(object,formula,data,...)
## S3 method for class 'glm':
Roc(object,formula,data,...)
## S3 method for class 'lrm':
Roc(object,formula,data,...)
## S3 method for class 'rpart':
Roc(object,formula,data,...)
## S3 method for class 'randomForest':
Roc(object,formula,data,...)
object: A list of fitted R objects (prediction models), or a call that can be evaluated to such an R-object. For cross-validation (any splitMethod other than "none") each element must contain a call that can be re-evaluated in subsets of data.
formula: A formula whose left hand side identifies the binary outcome variable in data. If missing, the formula is extracted from the (first) element of object.
data: A data frame in which the models are validated and, for cross-validation, refitted. If missing, the data are extracted from the call of the (first) element of object.
splitMethod: Method for estimating the generalization performance.
"none": Assess the models in the same data where they are fitted. Yields the apparent or re-substitution performance. Often overestimates the generalization performance.
"Bootcv": Bootstrap cross-validation: the models are refitted in B bootstrap samples and their performance is assessed in the subjects not in the respective bootstrap sample.
"Boot632", "Boot632plus": Linear combinations of the apparent performance and the bootstrap cross-validation performance (the .632 and .632+ estimators, see references).
"NoInf": Assess the models in permuted data to obtain the no-information performance.
noinf.method: Method for computing the no-information performance; currently only "simulate" is available.
simulate: If "reeval" then the models are re-evaluated in the current permuted data for computing the no-information Roc curve.
B: Number of repetitions for the splitMethod. When splitMethod is in c("Bootcv","Boot632","Boot632plus") it is the number of bootstrap samples and the default is 100. For splitMethod="cvK" it is the number of cross-validation repetitions.
M: The size of the learning samples when bootstrap samples are drawn without replacement; must be smaller than NROW(data).
breaks: Break points at which the Roc curve is computed. Defaults to seq(0,1,.01) for the Roc.list method and to sort(unique(breaks)) for the default method.
RocAverageMethod: How cross-validated Roc curves are averaged across data splits: "vertical" or "horizontal". See the references below for details.
RocAverageGrid: Grid of points at which the cross-validated Roc curves are averaged; the default grid depends on RocAverageMethod (see usage).
model.args: List of additional arguments passed to the predictStatusProb methods. The list must have an entry for each entry in object.
model.parms: List with an entry for each entry in object. Each entry names parts of the value of the fitted models that should be extracted and added to the output (see value).
keepModels: If FALSE keep only the names of the elements of object. If "Call" then keep the call of the elements of object. Else, add the object as it is to the output.
keepSampleIndex: If FALSE remove the cross-validation index (which tells who was in the learning and who in the validation set) from the output list, which otherwise is included in the method part of the output list.
keepCrossValRes: If TRUE add all B cross-validation results to the output (see value). Defaults to TRUE.
keepNoInfSimu: If TRUE add the B results in permuted data (for the no-information performance) to the output (see value). Defaults to FALSE.
slaveseed: Vector of seeds, as long as B, to be given to the slaves in parallel computing.
na.accept: Number of failed model fits that are tolerated during cross-validation; should be small relative to B.
verbose: If TRUE the procedure reports details of its progress, e.g. it prints the current step in cross-validation procedures.
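To illustrate how splitMethod, B and keepCrossValRes interact, here is a minimal sketch that requests bootstrap cross-validation with 50 bootstrap samples and keeps the per-split results; it assumes the simulated data dat and the fitted models lm1 and lm2 from the examples below, and that the "Bootcv" spelling quoted above is accepted.

rcv <- Roc(list(lm1,lm2),
           data=dat,
           splitMethod="Bootcv",   ## bootstrap cross-validation
           B=50,                   ## number of bootstrap samples
           keepCrossValRes=TRUE,   ## keep the B per-split results
           verbose=FALSE)
summary(rcv)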
The functions return an object of class Roc or class Brier for which print.Roc, summary.Roc, and plot.Roc (only Roc) methods are available. Depending on the splitMethod, the object includes the following components:
PredRoc: The estimated performance according to the splitMethod: the Roc curve(s) for Roc and, for Brier, a matrix where each column represents the estimated prediction error of a fit.
AppRoc: The apparent Roc curve(s), i.e. the performance in the data used to fit the models. Only if splitMethod is one of "NoInf", "Bootcv", "Boot632" or "Boot632plus"; otherwise the apparent performance is what is stored in PredRoc, as explained above.
BootcvRoc: The bootstrap cross-validation Roc curve(s). Only if splitMethod is one of "Boot632" or "Boot632plus". When splitMethod="Bootcv" the BootcvRoc is stored in the component PredRoc.
NoInfRoc: The no-information Roc curve(s) obtained in permuted data. Only if splitMethod is one of "Bootcv", "Boot632", or "Boot632plus". For splitMethod="NoInf" the NoInfRoc is stored in the component PredRoc.
weight: The weight used to linearly combine the AppRoc and the BootcvRoc. Only if splitMethod is one of "Boot632" or "Boot632plus".
overfit: Estimated overfit of the model(s). See references. Only if splitMethod is one of "Boot632" or "Boot632plus".
The performance of a prediction model that is not directly supported can be assessed by writing a suitable predictStatusProb method: for example, to assess a prediction model which evaluates to a myclass object one defines a function called predictStatusProb.myclass with arguments object, newdata, cutpoints, train.data, ..., like this:

myFit <- myModel(Y~X,data=dat)
class(myFit) <- "myclass"
predictStatusProb.myclass <- function(object,newdata,cutpoints,train.data,...){
  ## must return the predicted status probabilities for the subjects in newdata
  out <- predict(object,data=newdata,method="probabilities")
  out
}
Such a function takes the object, which was fitted with train.data, and derives a matrix of predicted event status probabilities with one row for each subject in newdata and one column for each cutpoint of the response variable that defines the event status.
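As a more concrete sketch of this interface, the code below defines a hypothetical class "constmodel" whose predictions are simply the event frequency observed in the training data, together with a matching predictStatusProb method; the class name, the fitting helper, and the one-column-per-cutpoint return matrix are illustrative assumptions and not part of the package.

constmodel <- function(formula,data){
  ## hypothetical fitting function: stores the observed event frequency
  y <- model.response(model.frame(formula,data))
  fit <- list(call=match.call(),prob=mean(y==levels(factor(y))[2]))
  class(fit) <- "constmodel"
  fit
}
predictStatusProb.constmodel <- function(object,newdata,cutpoints,train.data,...){
  ## one row per subject in newdata, one column per cutpoint (illustrative)
  matrix(object$prob,nrow=NROW(newdata),ncol=length(cutpoints))
}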
Currently implemented are predictStatusProb methods for the following R-functions: glm, lrm, rpart, and randomForest.
Efron B, Tibshirani R (1997). Improvements on cross-validation: The .632+ bootstrap method. Journal of the American Statistical Association 92(438), 548--560.
Wehberg S, Schumacher M (2004). A comparison of nonparametric error rate estimation methods in classification problems. Biometrical Journal 46, 35--47.
## Generate some data with binary response Y
## depending on X1 and X2 and X1*X2
N <- 400
X1 <- rnorm(N)
X2 <- rbinom(N,1,.4)
expit <- function(x) exp(x)/(1+exp(x))
p <- expit(1 + X1 + X2 - X1*X2)   ## event probability
Y <- factor(rbinom(N,1,p))
dat <- data.frame(Y=Y,X1=X1,X2=X2)
## fit a logistic model
lm1 <- glm(Y~X1,data=dat,family="binomial")
lm2 <- glm(Y~X1+X2,data=dat,family="binomial")
r1=Roc(list(lm1,lm2),verbose=0,crRatio=1)
summary(r1)
Brier(list(lm1,lm2),verbose=0,crRatio=1)
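## Hedged sketch: bootstrap estimates of the generalization performance,
## to contrast with the apparent performance above. Assumes that the
## splitMethod value "Boot632plus" quoted in the argument description
## is accepted.
r2 <- Roc(list(lm1,lm2),data=dat,verbose=0,crRatio=1,
          splitMethod="Boot632plus",B=100)
summary(r2)
Brier(list(lm1,lm2),data=dat,verbose=0,crRatio=1,
      splitMethod="Boot632plus",B=100)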
# crossing curves
set.seed(18)
N=500
Y=rbinom(N,1,.5)
X1=rnorm(N)
X1[Y==1]=rnorm(sum(Y==1),mean=rbinom(sum(Y==1),1,.5))
X2=rnorm(N)
X2[Y==0]=rnorm(sum(Y==0),mean=rbinom(sum(Y==0),1,.5))
dat <- data.frame(Y=Y,X1=X1,X2=X2)
lm1 <- glm(Y~X1,data=dat,family="binomial")
lm2 <- glm(Y~X2,data=dat,family="binomial")
plot(Roc(list(lm1,lm2),data=dat,verbose=0,crRatio=1))
library(randomForest)
dat$Y=factor(dat$Y)
rf <- randomForest(Y~X2,data=dat)
rocCV=Roc(list(RandomForest=rf,LogisticRegression=lm2),
data=dat,
verbose=TRUE,
splitMethod="bootcv",
B=10,
crRatio=1)
plot(rocCV,diag=TRUE)
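## Hedged sketch: add a classification tree to the cross-validated
## comparison. rpart is listed among the supported classes; the call
## assumes rpart's default behaviour for a factor response.
library(rpart)
tree1 <- rpart(Y~X1+X2,data=dat)
rocCV2 <- Roc(list(ClassificationTree=tree1,LogisticRegression=lm2),
              data=dat,
              splitMethod="bootcv",
              B=10,
              crRatio=1)
plot(rocCV2,diag=TRUE)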