
fda.usc (version 1.2.3)

classif.DD: DD-Classifier Based on DD-plot

Description

Fits a nonparametric classification procedure based on the DD-plot (depth-versus-depth plot) for G dimensions ($G = g \times p$, with g group levels and p data depths).

Usage

classif.DD(group,fdataobj,depth="FM",classif="glm",w,
           par.classif=list(),par.depth=list(),
           control=list(verbose=FALSE,draw=TRUE,col=NULL,alpha=.25))

Arguments

group
Factor of length n with g levels.
fdataobj
data.frame, fdata or list containing the multivariate covariates, the functional covariates, or both, respectively.
depth
Character vector specifying the type of depth functions to use, see Details.
classif
Character vector specifying the type of classifier method to use, see Details.
w
Optional case weights: one weight for each value of the depth argument, see Details.
par.depth
List of parameters for depth function.
par.classif
List of parameters for classif procedure.
control
List of parameters for controlling the process. If verbose=TRUE, extra information on progress is reported. If draw=TRUE, the DD-plot of the two samples based on data depth is drawn. col gives the colors for the points in the DD-plot and alpha their alpha transparency.
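As a sketch of how the list arguments fit together (an illustrative call, not taken from the help page; it reuses the tecator data that ships with fda.usc and assumes depth.mode accepts a metric entry in par.depth):

```r
# Illustrative sketch: combining par.depth and control in one call.
library(fda.usc)
data(tecator)
ab   <- tecator$absorp.fdata
gfat <- factor(as.numeric(tecator$y$Fat >= 15))

out <- classif.DD(gfat, ab,
                  depth   = "mode", classif = "np",
                  # "mode" depth needs a distance; metric.lp is the
                  # functional-data metric suggested in Details
                  par.depth = list(metric = metric.lp),
                  control   = list(draw = FALSE, verbose = TRUE))
```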

Value

  • group.est: Estimated group vector produced by the selected classification method.
  • misclassification: Probability of misclassification.
  • prob.classification: Probability of correct classification by group level.
  • dep: Data frame with the depth of the curves for functional data (or points for multivariate data) in fdataobj w.r.t. each group level.
  • depth: Character vector specifying the type of depth functions used.
  • par.depth: List of parameters for the depth function.
  • classif: Type of classifier used.
  • par.classif: List of parameters for the classification procedure.
  • w: Optional case weights.
  • fit: Fitted object produced by the classif method using the depth as covariate.
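The components above can be read directly off the returned object. A short sketch, reusing the tecator data from the Examples section below:

```r
# Inspect the main components of a fitted classif.DD object.
library(fda.usc)
data(tecator)
gfat <- factor(as.numeric(tecator$y$Fat >= 15))

out <- classif.DD(gfat, tecator$absorp.fdata,
                  depth = "FM", classif = "glm",
                  control = list(draw = FALSE))

out$misclassification         # overall misclassification probability
out$prob.classification       # correct-classification rate per group level
table(out$group.est, gfat)    # confusion table: estimated vs. true groups
```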

Details

Classifies a training dataset using DD-classifier estimation in the following steps.
  1. The function computes the selected depth measure of the points in fdataobj w.r.t. a subsample of each of the g group levels and each of the p data dimensions ($G = g \times p$). The user can specify the parameters for the depth function in par.depth. (i) Type of depth function for functional data, see Depth:
    • "FM": Fraiman and Muniz depth.
    • "mode": h--modal depth.
    • "RT": random Tukey depth.
    • "RP": random projection depth.
    • "RPD": double random projection depth.
    (ii) Type of depth function for multivariate functional data, see Depth.pfdata:
    • "FMp": Fraiman and Muniz depth with common support. It assumes that all p fdata objects have the same support (same rangeval); see depth.FMp.
    • "modep": h--modal depth using a p-dimensional metric, see depth.modep.
    • "RPp": random projection depth using a p-variate depth with the projections, see depth.RPp.
    If the procedure requires computing a distance, as in the "knn" or "np" classifiers or the "mode" depth, the user must supply a proper distance function: metric.lp for functional data and metric.dist for multivariate data. (iii) Type of depth function for multivariate data, see Depth.Multivariate:
    • "SD": Simplicial depth (for bivariate data).
    • "HS": Half-space depth.
    • "MhD": Mahalanobis depth.
    • "RD": random projection depth.
    • "LD": Likelihood depth.
  2. The function calculates the misclassification rate based on the data depth computed in step (1), using one of the following classifiers.
    • "MaxD": Maximum depth.
    • "DD1": Search the best separating polynomial of degree 1.
    • "DD2": Search the best separating polynomial of degree 2.
    • "DD3": Search the best separating polynomial of degree 3.
    • "glm": Logistic regression is computed using Generalized Linear Models, see classif.glm.
    • "gam": Logistic regression is computed using Generalized Additive Models, see classif.gsam.
    • "lda": Linear Discriminant Analysis is computed using lda.
    • "qda": Quadratic Discriminant Analysis is computed using qda.
    • "knn": k-Nearest Neighbour classification is computed using classif.knn.
    • "np": Non-parametric kernel classifier is computed using classif.np.
    The user can specify the parameters for the classifier function in par.classif, such as the smoothing parameter par.classif[["h"]] if classif="np", or the number of neighbours par.classif[["knn"]] if classif="knn". The polynomial classifiers ("DD1", "DD2" and "DD3") use the original procedure proposed by Li et al. (2012), by default rotating the DD-plot (exchanging abscissa and ordinate) via the rotate=TRUE argument in par.classif. Notice that the maximum depth classifier can be considered a particular case of DD1 with the slope fixed at 1 (par.classif=list(pol=1)). The number of candidate polynomials depends on the sample size n and increases polynomially with the order $k$; with $g$ groups it grows further still, so the procedure applies a multiple-start optimization scheme to save time:
    • generate all combinations of the n depth values taken k at a time, giving $g \times \mathrm{combn}(n,k)$ candidate solutions and, when this number is larger than nmax=10000, a random sample of 10000 combinations.
    • smooth the empirical loss with the logistic function $1/(1+e^{-tx})$. The classification rule is constructed by optimizing the best noptim combinations in this random sample (by default noptim=1 and tt=50/range(depth values)). Note that Li et al. found the optimization results become stable for $t \in [50, 200]$ when the depth is standardized with upper bound 1.
    The original procedure (Li et al. (2012)) does not need to try many initial polynomials (nmax=10000) and optimizes only the best one (noptim=1), but we recommend repeating the last step for several solutions, for example nmax=250 and noptim=25. The user can change the parameters pol, rotate, nmax, noptim and tt through the par.classif argument. The classif.DD procedure extends to multi-class problems by majority voting in the case of the polynomial classifiers and by the One-vs-the-Rest method in the logistic case ("glm" and "gam").
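The polynomial-classifier options described above can be sketched as follows (an illustrative example, again using the tecator data; runtime grows with nmax and noptim):

```r
# DD2 with a multiple-start search: sample up to 250 candidate
# polynomials and refine the best 25, without rotating the DD-plot.
library(fda.usc)
data(tecator)
gfat <- factor(as.numeric(tecator$y$Fat >= 15))
ab2  <- fdata.deriv(tecator$absorp.fdata, nderiv = 2)

out.dd2 <- classif.DD(gfat, ab2, depth = "FM", classif = "DD2",
                      par.classif = list(rotate = FALSE,
                                         nmax = 250, noptim = 25),
                      control = list(draw = FALSE))
out.dd2$misclassification
```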

References

Li, J., Cuesta-Albertos, J.A. and Liu, R. DD-Classifier: Nonparametric Classification Procedure Based on DD-plot. Journal of the American Statistical Association (2012), Vol. 107, 737--753.

Cuesta-Albertos, J.A., Febrero-Bande, M. and Oviedo de la Fuente, M. The DDG-classifier in the functional setting. Submitted.

See Also

See also predict.classif.DD.

Examples

# DD-classif for functional data
data(tecator)
ab=tecator$absorp.fdata
ab1=fdata.deriv(ab,nderiv=1)
ab2=fdata.deriv(ab,nderiv=2)
gfat=factor(as.numeric(tecator$y$Fat>=15))

# DD-classif for p=1 functional  data set
out01=classif.DD(gfat,ab,depth="mode",classif="np")
out02=classif.DD(gfat,ab2,depth="mode",classif="np")
# DD-plot in gray scale
ctrl<-list(draw=TRUE,col=gray(c(0,.5)),alpha=.2)
out02bis=classif.DD(gfat,ab2,depth="mode",classif="np",control=ctrl)

# 2 depth functions (same curves) 
out03=classif.DD(gfat,list(ab2,ab2),depth=c("RP","mode"),classif="np")
# DD-classif for p=2 functional data set
ldata<-list("ab"=ab2,"ab2"=ab2)
# Weighted version 
out04=classif.DD(gfat,ldata,depth="mode",classif="np",w=c(0.5,0.5))
# Model version
out05=classif.DD(gfat,ldata,depth="mode",classif="np")
# Integrated version (for multivariate functional data)
out06=classif.DD(gfat,ldata,depth="modep",classif="np")

# DD-classif for multivariate data
data(iris)
group<-iris[,5]
x<-iris[,1:4]
out10=classif.DD(group,x,depth="RP",classif="lda")
summary.classif(out10)
out11=classif.DD(group,list(x,x),depth=c("MhD","RP"),classif="lda")
summary.classif(out11)

# DD-classif for functional data: g levels 
data(phoneme)
mlearn<-phoneme[["learn"]]
glearn<-as.numeric(phoneme[["classlearn"]])-1
out20=classif.DD(glearn,mlearn,depth="FM",classif="glm")
out21=classif.DD(glearn,list(mlearn,mlearn),depth=c("FM","RP"),classif="glm")
summary.classif(out20)
summary.classif(out21)
