
mt (version 2.0-1.20)

maccest: Estimation of Multiple Classification Accuracy

Description

Estimates classification accuracy for multiple classifiers using a resampling procedure, and compares the classifiers.

Usage

maccest(dat, ...)
# S3 method for default
maccest(dat, cl, method="svm", pars = valipars(), 
        tr.idx = NULL, comp="anova",...) 
# S3 method for formula
maccest(formula, data = NULL, ..., subset, na.action = na.omit)

Value

An object of class maccest, including the components:

method

Classification method used.

acc

Accuracy rate.

acc.iter

Accuracy rate of each iteration.

acc.std

Standard deviation of the accuracy rate.

mar

Prediction margin.

mar.iter

Prediction margin of each iteration.

auc

The area under the receiver operating characteristic curve (AUC).

auc.iter

AUC of each iteration.

comp

Multiple comparison method used.

h.test

Hypothesis test results of multiple comparison.

gl.pval

Global or overall p-value.

mc.pval

Pairwise comparison p-values.

sampling

Sampling scheme used.

niter

Number of iterations.

nreps

Number of replications in each iteration.

conf.mat

Overall confusion matrix.

acc.boot

A list of bootstrap errors, such as .632 and .632+, if the validation method is bootstrap.
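To make the acc.boot component concrete, the sketch below shows how the .632 bootstrap error estimate combines two error rates. This is an illustration of the general .632 formula, not mt's internal code; the error values are made up for the example.

```r
## Illustrative sketch of the .632 bootstrap error (not mt's internals).
## The apparent (resubstitution) error is optimistic; the out-of-bag
## error is pessimistic.  The .632 estimate weights them by the expected
## fraction of distinct observations in a bootstrap sample (~0.632).
err.apparent <- 0.05   # hypothetical error on the bootstrap sample itself
err.oob      <- 0.20   # hypothetical error on out-of-bag observations
err.632      <- 0.368 * err.apparent + 0.632 * err.oob
err.632
```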

Arguments

formula

A formula of the form groups ~ x1 + x2 + ... That is, the response is the grouping factor and the right hand side specifies the (non-factor) discriminators.

data

Data frame from which variables specified in formula are preferentially to be taken.

dat

A matrix or data frame containing the explanatory variables if no formula is given as the principal argument.

cl

A factor specifying the class for each observation if no formula principal argument is given.

method

A vector of multiple classification methods to be used. Classifiers such as randomForest, svm, knn and lda can be used. For details, see the note below.

pars

A list of resampling scheme such as Leave-one-out cross-validation, Cross-validation, Randomised validation (holdout) and Bootstrap, and control parameters for the calculation of accuracy. See valipars for details.

tr.idx

User defined index of training samples. Can be generated by trainind.

comp

Comparison method for the multiple classifiers. If comp is anova, the overall comparison is performed by ANOVA and the pairwise comparisons by Tukey's HSD. If comp is fried, the overall comparison is performed by the Friedman test and the pairwise comparisons by the Wilcoxon test.

...

Additional parameters to method.

subset

Optional vector, specifying a subset of observations to be used.

na.action

Function which indicates what should happen when the data contains NA's, defaults to na.omit.
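To illustrate what the comp = "fried" option computes, the sketch below runs a Friedman test followed by pairwise Wilcoxon tests on a per-iteration accuracy matrix (iterations in rows, classifiers in columns), using base R's stats functions. This is an assumed approximation of the comparison, not mt's exact internals, and the accuracy values are simulated.

```r
## Sketch of a Friedman + pairwise Wilcoxon comparison of classifiers
## (assumed to approximate comp = "fried"; simulated accuracies).
set.seed(1)
acc.iter <- cbind(svm = runif(10, 0.80, 0.95),
                  knn = runif(10, 0.70, 0.85),
                  lda = runif(10, 0.75, 0.90))
## Global test: columns are the classifiers, rows are the iterations.
gl <- friedman.test(acc.iter)
## Pairwise tests, paired because accuracies share the same iterations.
mc <- pairwise.wilcox.test(as.vector(acc.iter),
                           rep(colnames(acc.iter), each = nrow(acc.iter)),
                           paired = TRUE, p.adjust.method = "holm")
gl$p.value   # analogue of gl.pval
mc$p.value   # analogue of mc.pval
```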

Author

Wanchang Lin

Details

The accuracy rates for classification are obtained using techniques such as Random Forest, Support Vector Machine, k-Nearest Neighbour Classification and Linear Discriminant Analysis, based on sampling methods including Leave-one-out cross-validation, Cross-validation, Randomised validation (holdout) and Bootstrap.
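As a minimal sketch of one of these sampling schemes, the snippet below estimates accuracy for a single classifier (lda from MASS) under randomised holdout validation; maccest repeats this kind of loop across iterations for each method. This is an illustration of the resampling idea, not mt's implementation.

```r
## Minimal randomised-holdout accuracy estimate for one classifier
## (illustrative only; maccest handles this internally for each method).
library(MASS)    # for lda
data(iris)
set.seed(1)
acc <- replicate(5, {
  tr  <- sample(nrow(iris), round(0.8 * nrow(iris)))  # 80% training split
  fit <- lda(Species ~ ., data = iris[tr, ])
  prd <- predict(fit, iris[-tr, ])$class
  mean(prd == iris$Species[-tr])                      # holdout accuracy
})
mean(acc)   # analogue of acc
sd(acc)     # analogue of acc.std
```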

See Also

accest, aam.mcl, valipars, plot.maccest, trainind, boxplot.maccest, classifier

Examples

# Iris data
data(iris)
x      <- subset(iris, select = -Species)
y      <- iris$Species

method <- c("randomForest","svm","pcalda","knn")
pars   <- valipars(sampling="boot", niter = 3, nreps=5, strat=TRUE)
res    <- maccest(Species~., data = iris, method=method, pars = pars, 
                  comp="anova")
## or 
res    <- maccest(x, y, method=method, pars=pars, comp="anova") 

res
summary(res)
plot(res)
boxplot(res)
oldpar <- par(mar = c(5,10,4,2) + 0.1)
plot(res$h.test$tukey,las=1)   ## plot the tukey results
par(oldpar)
