
nproc (version 0.1)

npc: Calculate the Neyman-Pearson Classifier from a sample of class 0 and class 1.

Description

npc calculates the Neyman-Pearson classifier for a given type I error constraint.
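
The core idea can be illustrated in a few lines of base R. This is a simplified sketch, not the package's actual procedure: npc selects an order statistic of the class 0 scores so that the type I error exceeds alpha with probability at most delta, rather than using the naive empirical quantile shown here.

```r
## Simplified illustration (not the package's actual procedure): an NP
## classifier thresholds a score so that the type I error (misclassifying
## class 0 as class 1) is controlled at alpha.
set.seed(1)
alpha  <- 0.05
score0 <- rnorm(500)                     # scores of held-out class 0 points
cutoff <- quantile(score0, 1 - alpha)    # naive (1 - alpha) empirical quantile
mean(score0 > cutoff)                    # in-sample type I error, about alpha
```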

Usage

npc(x = NULL, y, method = c("logistic", "penlog", "svm", "randomforest",
  "lda", "nb", "ada", "custom"), score = NULL, pred.score = NULL,
  alpha = 0.05, delta = 0.05, split = TRUE, loc.prob = NULL,
  n.cores = 1)

Arguments

x
n * p observation matrix. n observations, p covariates.
y
n 0/1 observations.
method
classification method.
  • logistic: logistic regression, via the glm function with family = 'binomial'
  • penlog: penalized logistic regression, via glmnet in the glmnet package
  • svm: support vector machines, via svm in the e1071 package
  • randomforest: random forest, via randomForest in the randomForest package
  • lda: linear discriminant analysis, via lda in the MASS package
  • nb: naive Bayes, via naiveBayes in the e1071 package
  • ada: AdaBoost, via ada in the ada package
  • custom: a user-supplied score vector, passed through the score argument
score
score vector corresponding to y. Required when method = 'custom'.
pred.score
predicted score vector for the test sample. Optional when method = 'custom'.
alpha
the desired upper bound on the type I error. Default = 0.05.
delta
the allowed violation rate, i.e., the probability that the achieved type I error exceeds alpha. Default = 0.05.
split
whether to split the class 0 sample into two parts, one to fit the classifier and one to determine the cutoff. Default = TRUE. When method = 'custom', split is always FALSE.
loc.prob
the precalculated threshold location in probability (a percentile of the class 0 scores). Default = NULL.
n.cores
number of cores used for parallel computing. Default = 1.
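
As the loc.prob description above suggests, a threshold location computed by one call can be passed back in to avoid recomputing it. A sketch, assuming the nproc package is loaded and x, y are defined as in the Examples section:

```r
## Sketch: reuse the threshold location from a first fit (assumes
## library(nproc) and data x, y as in the Examples section).
fit1 <- npc(x, y, method = 'lda')
fit2 <- npc(x, y, method = 'lda', loc.prob = fit1$loc.prob)
```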

Value

  • An object with S3 class npc, with components:
  • fit: the fit from the specified classifier.
  • score: the score vector for each observation.
  • cutoff: the cutoff determined via bootstrap to achieve the specified type I error control.
  • sign: whether class 1 has a larger average score than class 0.
  • method: the classification method used.
  • loc.prob: the percentile used to determine the cutoff for the specified type I error control.
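
The returned components can be inspected directly. A quick sketch, assuming fit was produced as in the Examples section below:

```r
## Assumes library(nproc) is loaded and fit = npc(x, y, method = 'svm')
## as in the Examples section.
fit$cutoff    # score threshold chosen for type I error control
fit$sign      # TRUE if class 1 scores higher on average than class 0
fit$method    # classification method used
fit$loc.prob  # percentile location of the cutoff
```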

See Also

nproc and predict.npc

Examples

library(nproc)
set.seed(0)
n = 1000
x = matrix(rnorm(n*2), n, 2)            # n observations, 2 covariates
c = 1 + 3*x[,1]                         # signal in the first covariate
y = rbinom(n, 1, 1/(1+exp(-c)))         # labels from a logistic model
xtest = matrix(rnorm(n*2), n, 2)
ctest = 1 + 3*xtest[,1]
ytest = rbinom(n, 1, 1/(1+exp(-ctest)))

##Use svm classifier and the default type I error control with alpha=0.05
fit = npc(x, y, method = 'svm')
pred = predict(fit,xtest)
fit.score = predict(fit,x)
accuracy = mean(pred$pred.label==ytest)
cat('Overall Accuracy: ',  accuracy,'\n')
ind0 = which(ytest==0)
typeI = mean(pred$pred.label[ind0]!=ytest[ind0]) #type I error on test set
cat('Type I error: ', typeI, '\n')

##Now, change the method to logistic regression and change alpha to 0.1
fit = npc(x, y, method = 'logistic', alpha = 0.1)
pred = predict(fit,xtest)
accuracy = mean(pred$pred.label==ytest)
cat('Overall Accuracy: ',  accuracy,'\n')
ind0 = which(ytest==0)
typeI = mean(pred$pred.label[ind0]!=ytest[ind0]) #type I error on test set
cat('Type I error: ', typeI, '\n')

##Now, change the method to adaboost
#fit = npc(x, y, method = 'ada', alpha = 0.1)
#pred = predict(fit,xtest)
#accuracy = mean(pred$pred.label==ytest)
#cat('Overall Accuracy: ',  accuracy,'\n')
#ind0 = which(ytest==0)
#typeI = mean(pred$pred.label[ind0]!=ytest[ind0]) #type I error on test set
#cat('Type I error: ', typeI, '\n')

##A 'custom' npc classifier built from y and a precomputed score vector.
#fit2 = npc(y = y, score = fit.score$pred.score,
#           pred.score = pred$pred.score, method = 'custom')
