This function draws graphics based on a Multiple Correspondence Analysis (MCA) projection. Interval variables are categorized into bins. The dependent classification variable is set as a supplementary variable. Predictions from machine learning algorithms are shown as filled contours.
mcacontour(dataf=dataf, listconti, listclass, vardep, proba="", bins=8,
  Dime1="Dim.1", Dime2="Dim.2", classvar=1, intergrid=0, selec=0,
  title="", title2="", listacol="", depcol="", alpha1=0.8, alpha2=0.8, alpha3=0.7, modelo="glm",
  nodos=3, maxit=200, decay=0.01, sampsize=400, mtry=2, nodesize=5,
  ntree=400, ntreegbm=500, shrink=0.01, bag.fraction=1, n.minobsinnode=10, C=100, gamma=10)

A list with the following objects:
plot of points on MCA two dimensions
plot of points and variables
plot of points and contour curves
plot of points, contour curves and variables
plot of points colored by fitted probability
plot of points colored by absolute difference
dataset used for graph1
dataset used for graph2
dataset used for graph3
dataset used for graph4
interval variables used
class variables used
color schemes and other parameters
dataf: Data frame.
listconti: Interval variables to use, in the format c("var1","var2",...).
listclass: Class variables to use, in the format c("var1","var2",...).
vardep: Dependent binary classification variable.
proba: Vector of probability predictions obtained externally (optional).
bins: Number of bins used to categorize interval variables.
Dime1, Dime2: MCA dimensions to consider. Dim.1 and Dim.2 by default.
classvar: 1 if the dependent variable categories are plotted as supplementary.
intergrid: Scale of the grid for the contour; 0 if automatic.
selec: 1 if stepwise logistic variable selection is required, 0 if not.
title: Plot main title.
title2: Plot subtitle.
listacol: Vector of colors for labels.
depcol: Vector of two colors for points.
alpha1: Alpha transparency for the majority class.
alpha2: Alpha transparency for the minority class.
alpha3: Alpha transparency for the fitted probability plots.
modelo: Name of the model: "glm", "gbm", "rf", "nnet", "svm".
nodos: nnet: nodes.
maxit: nnet: iterations.
decay: nnet: decay.
sampsize: rf: sampsize.
mtry: rf: mtry.
nodesize: rf: nodesize.
ntree: rf: ntree.
ntreegbm: gbm: ntree.
shrink: gbm: shrink.
bag.fraction: gbm: bag.fraction.
n.minobsinnode: gbm: n.minobsinnode.
C: svm Radial: C.
gamma: svm Radial: gamma.
This function applies MCA (Multiple Correspondence Analysis) to project points and categories of class variables onto the same plot. In addition, the interval variables listed in listconti are categorized into the number of bins given by the bins parameter (8 by default). For further explanation of machine learning classification and contour curves, see the famdcontour function documentation.
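As a rough illustration of what categorizing an interval variable into bins means, the sketch below uses base R's cut() with equal-width intervals. This is an assumption made for illustration only: this page does not say whether mcacontour uses equal-width bins, quantile bins, or another rule, and the vector x is made-up data.

```r
# Illustrative only: equal-width binning with base R.
# The binning rule used inside mcacontour is not stated on this page.
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)   # made-up interval variable
bins <- 8                               # same default as the bins argument

# cut() with a single number as 'breaks' splits the range of x
# into that many intervals of equal width and returns a factor.
x_binned <- cut(x, breaks = bins)

# Each original value is now a category that MCA can treat as a class level.
print(table(x_binned))
```

After this step, every interval variable is a factor, so it can enter MCA on the same footing as the variables in listclass.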
Check for missing values; they are not allowed.
By default selec=0. Setting selec=1 may sometimes result in no variables being selected; an error message is shown in that case.
Models with only two input variables could lead to plot generation problems.
Make sure that all variables named in listconti are numeric.
If a numeric variable is constant (takes a single value), the process stops, because min-max standardization is applied to numeric variables and would generate NaN values.
The dependent variable cannot be named x, y, z, x1 or x2.
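The requirements above can be checked before calling the function. The following is a minimal sketch, not part of the package: check_mca_inputs and minmax are hypothetical helpers, and toy is made-up data. It also shows why a constant variable breaks min-max standardization (0/0 gives NaN).

```r
# Hypothetical pre-flight check; not provided by the package.
check_mca_inputs <- function(dataf, listconti, listclass, vardep) {
  listclass <- listclass[nzchar(listclass)]        # drop the empty "" placeholder
  used <- c(listconti, listclass, vardep)
  problems <- character(0)
  if (any(is.na(dataf[, used, drop = FALSE]))) {
    problems <- c(problems, "missing values present")
  }
  for (v in listconti) {
    x <- dataf[[v]]
    if (!is.numeric(x)) {
      problems <- c(problems, paste(v, "is not numeric"))
    } else if (length(unique(x)) == 1) {
      problems <- c(problems, paste(v, "is constant"))
    }
  }
  if (vardep %in% c("x", "y", "z", "x1", "x2")) {
    problems <- c(problems, paste("dependent variable cannot be named", vardep))
  }
  problems
}

# Min-max standardization: a constant column gives 0/0, i.e. NaN.
minmax <- function(v) (v - min(v)) / (max(v) - min(v))

toy <- data.frame(a = c(1, 2, 3, 4), b = c(5, 5, 5, 5), target = c(0, 1, 0, 1))
print(minmax(toy$b))
problems <- check_mca_inputs(toy, c("a", "b"), c(""), "target")
print(problems)
```

An empty character vector returned by the helper means none of the listed problems was found.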
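# Load the example data set
data(breastwisconsin1)
dataf <- breastwisconsin1
# Interval (numeric) inputs; no class inputs in this example
listconti <- c("clump_thickness", "uniformity_of_cell_shape", "mitosis")
listclass <- c("")
# Dependent binary variable
vardep <- "classes"
result <- mcacontour(dataf = dataf, listconti, listclass, vardep)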