
nproc (version 0.1)

npc: Calculate the Neyman-Pearson Classifier from a sample of class 0 and class 1.

Description

npc calculates the Neyman-Pearson classifier for a given type I error constraint.
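
The core idea can be illustrated in a few lines of base R. This is a simplified sketch, not the package's actual procedure: npc selects an order statistic of the class 0 scores so that the type I error exceeds alpha with probability at most delta, rather than using the naive empirical quantile shown here.

```r
## Simplified illustration (not the package's actual procedure): an NP
## classifier thresholds a score so that the type I error (misclassifying
## class 0 as class 1) is controlled at alpha.
set.seed(1)
alpha  <- 0.05
score0 <- rnorm(500)                     # scores of held-out class 0 points
cutoff <- quantile(score0, 1 - alpha)    # naive (1 - alpha) empirical quantile
mean(score0 > cutoff)                    # in-sample type I error, about alpha
```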

Usage

npc(x = NULL, y, method = c("logistic", "penlog", "svm", "randomforest",
  "lda", "nb", "ada", "custom"), score = NULL, pred.score = NULL,
  alpha = 0.05, delta = 0.05, split = TRUE, loc.prob = NULL,
  n.cores = 1)

Arguments

x
n * p observation matrix. n observations, p covariates.
y
n 0/1 observations.
method
classification method.
  • logistic: logistic regression, via the glm function with family = 'binomial'
  • penlog: penalized logistic regression, via glmnet in the glmnet package
  • svm: support vector machines, via svm in the e1071 package
  • randomforest: random forest, via randomForest in the randomForest package
  • lda: linear discriminant analysis, via lda in the MASS package
  • nb: naive Bayes, via naiveBayes in the e1071 package
  • ada: AdaBoost, via ada in the ada package
  • custom: a user-supplied score vector, passed through the score argument
score
score vector corresponding to y. Required when method = 'custom'.
pred.score
predicted score vector for the test sample. Optional when method = 'custom'.
alpha
the desired upper bound on the type I error. Default = 0.05.
delta
the allowed violation rate, i.e., the probability that the achieved type I error exceeds alpha. Default = 0.05.
split
whether to split the class 0 sample into two parts, one to fit the classifier and one to determine the cutoff. Default = TRUE. When method = 'custom', split is always FALSE.
loc.prob
the precalculated threshold location in probability (a percentile of the class 0 scores). Default = NULL.
n.cores
number of cores used for parallel computing. Default = 1.
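
As the loc.prob description above suggests, a threshold location computed by one call can be passed back in to avoid recomputing it. A sketch, assuming the nproc package is loaded and x, y are defined as in the Examples section:

```r
## Sketch: reuse the threshold location from a first fit (assumes
## library(nproc) and data x, y as in the Examples section).
fit1 <- npc(x, y, method = 'lda')
fit2 <- npc(x, y, method = 'lda', loc.prob = fit1$loc.prob)
```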

Value

  • An object with S3 class npc, with components:
  • fit: the fit from the specified classifier.
  • score: the score vector for each observation.
  • cutoff: the cutoff determined via bootstrap to achieve the specified type I error control.
  • sign: whether class 1 has a larger average score than class 0.
  • method: the classification method used.
  • loc.prob: the percentile used to determine the cutoff for the specified type I error control.
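
The returned components can be inspected directly. A quick sketch, assuming fit was produced as in the Examples section below:

```r
## Assumes library(nproc) is loaded and fit = npc(x, y, method = 'svm')
## as in the Examples section.
fit$cutoff    # score threshold chosen for type I error control
fit$sign      # TRUE if class 1 scores higher on average than class 0
fit$method    # classification method used
fit$loc.prob  # percentile location of the cutoff
```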

See Also

nproc and predict.npc

Examples

library(nproc)
set.seed(0)
n = 1000
x = matrix(rnorm(n*2), n, 2)            # n observations, 2 covariates
c = 1 + 3*x[,1]                         # signal in the first covariate
y = rbinom(n, 1, 1/(1+exp(-c)))         # labels from a logistic model
xtest = matrix(rnorm(n*2), n, 2)
ctest = 1 + 3*xtest[,1]
ytest = rbinom(n, 1, 1/(1+exp(-ctest)))

##Use svm classifier and the default type I error control with alpha=0.05
fit = npc(x, y, method = 'svm')
pred = predict(fit,xtest)
fit.score = predict(fit,x)
accuracy = mean(pred$pred.label==ytest)
cat('Overall Accuracy: ',  accuracy,'\n')
ind0 = which(ytest==0)
typeI = mean(pred$pred.label[ind0]!=ytest[ind0]) #type I error on test set
cat('Type I error: ', typeI, '\n')

##Now, change the method to logistic regression and change alpha to 0.1
fit = npc(x, y, method = 'logistic', alpha = 0.1)
pred = predict(fit,xtest)
accuracy = mean(pred$pred.label==ytest)
cat('Overall Accuracy: ',  accuracy,'\n')
ind0 = which(ytest==0)
typeI = mean(pred$pred.label[ind0]!=ytest[ind0]) #type I error on test set
cat('Type I error: ', typeI, '\n')

##Now, change the method to adaboost
#fit = npc(x, y, method = 'ada', alpha = 0.1)
#pred = predict(fit,xtest)
#accuracy = mean(pred$pred.label==ytest)
#cat('Overall Accuracy: ',  accuracy,'\n')
#ind0 = which(ytest==0)
#typeI = mean(pred$pred.label[ind0]!=ytest[ind0]) #type I error on test set
#cat('Type I error: ', typeI, '\n')

##A 'custom' npc classifier built from y and a precomputed score vector.
#fit2 = npc(y = y, score = fit.score$pred.score,
#           pred.score = pred$pred.score, method = 'custom')
