rminer (version 1.5.0)

mgraph: Mining graph function

Description

Plots a graph from a mining list, from a list of several mining lists, or from the pair y (target values) and x (predictions).

Usage

mgraph(y, x = NULL, graph, leg = NULL, xval = -1, PDF = "", PTS = -1, 
       size = c(5, 5), sort = TRUE, ranges = NULL, data = NULL,
       digits = NULL, TC = -1, intbar = TRUE, lty = 1, col = "black",
       main = "", metric = "MAE", baseline = FALSE, Grid = 0, 
       axis = NULL, cex = 1)

Value

A graph, shown on screen or saved to a PDF file.

Arguments

y

if there are predictions (!is.null(x)), y should be a numeric vector or factor with the target desired responses (or output values).
Else, y should be a list returned by the mining function or a vector list with several mining lists.

x

the predictions: a numeric vector if task="reg", a matrix if task="prob" or a factor if task="class" (used only if y is not a list).

graph

type of graph. Options are:

  • ROC -- ROC curve (classification);

  • LIFT -- LIFT accumulative curve (classification);

  • IMP -- relative input importance barplot;

  • REC -- REC curve (regression);

  • VEC -- variable effect curve;

  • RSC -- regression scatter plot;

  • REP -- regression error plot;

  • REG -- regression plot;

  • DLC -- distance line comparison (for comparing errors in one line);

leg

legend of graph:

  • if NULL -- not used;

  • if -1 and graph="ROC" or "LIFT" -- the target class name is used;

  • if -1 and graph="REG" -- leg=c("Target","Predictions");

  • if -1 and graph="RSC" -- leg=c("Predictions");

  • if vector with "character" type (text) -- the text of the legend;

  • if a list -- $leg is a vector with the text of the legend and $pos is the position of the legend (e.g. "top" or c(4,5)).
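
To make the list form of leg concrete, the sketch below builds such a list and draws an equivalent legend with base R graphics only. The prediction values are hypothetical, and the mapping of $pos onto the base R legend function is an assumption for illustration, not a description of how mgraph draws its legend.

```r
# Legend given as a list: $leg holds the text, $pos the position
leg <- list(pos = "topleft", leg = c("target", "predictions"))

y    <- c(1, 5, 10, 11, 7, 3, 2, 1)
pred <- c(1.2, 4.5, 9.0, 11.8, 7.1, 3.6, 2.0, 0.5)  # hypothetical predictions

plot(y, type = "l", col = "black", ylim = range(c(y, pred)),
     xlab = "example", ylab = "value")
lines(pred, col = "blue")

# A keyword such as "topleft" is passed as-is; a numeric pair is read as x, y
if (is.character(leg$pos)) {
  legend(leg$pos, legend = leg$leg, col = c("black", "blue"), lty = 1)
} else {
  legend(leg$pos[1], leg$pos[2], legend = leg$leg,
         col = c("black", "blue"), lty = 1)
}
```

A numeric position such as pos=c(4,5) would take the second branch and place the legend at those plot coordinates.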

xval

auxiliary value, used by some graphs:

  • VEC -- if -1, several 1-D sensitivity analysis VEC curves are plotted, one for each attribute; if >0, xval is the attribute index (e.g. 1).

  • ROC or LIFT or REC -- if -1 then xval=1. For these graphs, xval is the maximum x-axis value.

  • IMP -- xval is the x-axis value for the legend of the attributes.

  • REG -- xval is the set of plotted examples (e.g. 1:5), if -1 then all examples are used.

  • DLC -- xval is the val argument of the mmetric function.

PDF

if "" then the graph is plotted on the screen, else the graph is saved into a pdf file with the name set in this argument.

PTS

number of points in each line plot. If -1 then PTS=11 (for ROC, REC or LIFT) or PTS=6 (VEC).
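
As a rough illustration of how a number of points and a maximum x-axis value shape a line plot, the sketch below computes an REC-style curve by hand in base R: for each of 11 evenly spaced tolerances between 0 and 1, it takes the fraction of examples whose absolute error is within that tolerance. The prediction values are hypothetical, and this follows the usual definition of an REC curve rather than rminer's own code.

```r
# Targets and hypothetical predictions (illustrative values only)
y    <- c(1, 5, 10, 11, 7, 3, 2, 1)
pred <- c(1.2, 4.5, 9.0, 11.8, 7.1, 3.6, 2.0, 0.5)

xval <- 1    # maximum tolerance shown on the x-axis
PTS  <- 11   # number of points in the curve

# Evenly spaced tolerance values from 0 to xval
tol <- seq(0, xval, length.out = PTS)

# For each tolerance, the fraction of examples whose absolute error is within it
acc <- sapply(tol, function(t) mean(abs(y - pred) <= t))

plot(tol, acc, type = "b", xlab = "absolute error tolerance",
     ylab = "fraction of examples", main = "REC curve (sketch)")
```

A larger PTS gives a smoother curve at the cost of more evaluations; a larger xval extends the tolerance range covered.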

size

size of the graph, c(width,height), in inches.
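
For comparison, the sketch below does by hand what the PDF and size arguments describe, using only the base R pdf device: open a file with a width and height in inches, plot, then close the device. The file name is hypothetical and written to a temporary directory; this is an analogy in base R, not mgraph's internal code.

```r
# Hypothetical output file in the session's temporary directory
out <- file.path(tempdir(), "example-plot.pdf")

size <- c(5, 5)                               # c(width, height), in inches
pdf(out, width = size[1], height = size[2])   # open the PDF device
plot(1:8, c(1, 5, 10, 11, 7, 3, 2, 1), type = "b", main = "Example")
dev.off()                                     # close the device, writing the file

file.exists(out)
```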

sort

if TRUE then sorts the data (works only for some graphs, e.g. VEC, IMP, REP).

ranges

matrix with the attribute minimum and maximum ranges (only used by VEC).

data

the training data, for plotting histograms and getting the minimum and maximum attribute ranges if not defined in ranges (only used by VEC).

digits

the number of digits for the axis; can also be defined as c(x-axis digits, y-axis digits) (only used by VEC).

TC

target class (for multi-class classification), from 1 to Nc, where Nc is the number of classes. If multi-class and TC==-1 then TC is set to the index of the last class.
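
The default rule for TC can be sketched in base R as follows. The factor below is illustrative data, and the one-vs-rest indicator at the end is an assumption about how a target class is typically used for curves such as ROC, not a statement about rminer's internals.

```r
# Illustrative class labels; factor levels are sorted alphabetically
species <- factor(c("setosa", "versicolor", "virginica", "setosa"))

Nc <- nlevels(species)   # number of classes
TC <- -1                 # default value of the argument

# With TC == -1, the target class becomes the last class
if (TC == -1) TC <- Nc

target_name <- levels(species)[TC]

# One-vs-rest indicator for the target class (assumed usage)
is_target <- as.integer(species == target_name)
print(target_name)
print(is_target)
```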

intbar

if TRUE, 95% confidence interval bars (according to the Student's t-distribution) are plotted as whiskers.
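
To show what such whiskers represent, the sketch below computes a 95% confidence interval for a mean from the Student's t-distribution, using base R only. The vector errs (one hypothetical value per run) and the helper name tci95 are assumptions for illustration; this is not rminer's internal code.

```r
# Hypothetical values, one per run (illustrative data only)
errs <- c(0.21, 0.25, 0.19, 0.23, 0.22)

# Confidence interval for the mean, using the Student's t-distribution
tci95 <- function(v, level = 0.95) {
  n <- length(v)
  m <- mean(v)
  # half-width: t quantile with n-1 degrees of freedom times the standard error
  h <- qt(1 - (1 - level) / 2, df = n - 1) * sd(v) / sqrt(n)
  c(lower = m - h, mean = m, upper = m + h)
}

ci <- tci95(errs)
print(ci)
```

The lower and upper values agree with the interval reported by t.test(errs)$conf.int.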

lty

same as the lty argument of the par function.

col

color, as defined in the par function.

main

the title of the graph, as defined in the plot function.

metric

the error metric, as defined in mmetric (used by DLC).

baseline

if TRUE, the baseline is plotted (used by ROC and LIFT).

Grid

if >1, the plot is drawn with that number of light gray square grid lines (e.g. Grid=10).

axis

Currently only used by IMP: numeric vector with the axis numbers (1 -- bottom, 3 -- top). If NULL then axis=c(1,3).

cex

label font size

References

  • For more details about rminer and for citation purposes:
    P. Cortez.
    Data Mining with Neural Networks and Support Vector Machines Using the R/rminer Tool.
    In P. Perner (Ed.), Advances in Data Mining - Applications and Theoretical Aspects, 10th Industrial Conference on Data Mining (ICDM 2010), Lecture Notes in Artificial Intelligence 6171, pp. 572-583, Berlin, Germany, July, 2010. Springer. ISBN: 978-3-642-14399-1.
    @Springer: https://link.springer.com/chapter/10.1007/978-3-642-14400-4_44

  • This tutorial shows additional code examples:
    P. Cortez.
    A tutorial on using the rminer R package for data mining tasks.
    Teaching Report, Department of Information Systems, ALGORITMI Research Centre, Engineering School, University of Minho, Guimaraes, Portugal, July 2015.
    http://hdl.handle.net/1822/36210

See Also

fit, predict.fit, mining, mmetric, savemining and Importance.

Examples

### regression
library(rminer) # provides mgraph, mmetric and mining
y=c(1,5,10,11,7,3,2,1);x=rnorm(length(y),0,1.0)+y
mgraph(y,x,graph="RSC",Grid=10,col=c("blue"))
mgraph(y,x,graph="REG",Grid=10,lty=1,col=c("black","blue"),
       leg=list(pos="topleft",leg=c("target","predictions")))
mgraph(y,x,graph="REP",Grid=10)
mgraph(y,x,graph="REP",Grid=10,sort=FALSE)
x2=rnorm(length(y),0,1.2)+y;x3=rnorm(length(y),0,1.4)+y;
L=vector("list",3); pred=vector("list",1); test=vector("list",1);
pred[[1]]=y; test[[1]]=x; L[[1]]=list(pred=pred,test=test,runs=1)
test[[1]]=x2; L[[2]]=list(pred=pred,test=test,runs=1)
test[[1]]=x3; L[[3]]=list(pred=pred,test=test,runs=1)
# distance line comparison graph:
mgraph(L,graph="DLC",metric="MAE",leg=c("x1","x2","x3"),main="MAE errors")

# REC multi-curve single graph with NAREC (normalized area of REC) values
# for a maximum tolerance of val=5 (other val values can be used)
e1=mmetric(y,x,metric="NAREC",val=5)
e2=mmetric(y,x2,metric="NAREC",val=5)
e3=mmetric(y,x3,metric="NAREC",val=5)
l1=paste("x1, NAREC=",round(e1,digits=2))
l2=paste("x2, NAREC=",round(e2,digits=2))
l3=paste("x3, NAREC=",round(e3,digits=2))
mgraph(L,graph="REC",leg=list(pos="bottom",leg=c(l1,l2,l3)),main="REC curves")

### regression example with mining
if (FALSE) {
data(sin1reg)
M1=mining(y~.,sin1reg[,c(1,2,4)],model="mr",Runs=5)
M2=mining(y~.,sin1reg[,c(1,2,4)],model="mlpe",nr=3,maxit=50,size=4,Runs=5,feature="simp")
L=vector("list",2); L[[1]]=M2; L[[2]]=M1
mgraph(L,graph="REC",xval=0.1,leg=c("mlpe","mr"),main="REC curve")
mgraph(L,graph="DLC",metric="TOLERANCE",xval=0.01,
       leg=c("mlpe","mr"),main="DLC: TOLERANCE plot")
mgraph(M2,graph="IMP",xval=0.01,leg=c("x1","x2"),
       main="sin1reg Input importance",axis=1)
mgraph(M2,graph="VEC",xval=1,main="sin1reg 1-D VEC curve for x1")
mgraph(M2,graph="VEC",xval=1,
       main="sin1reg 1-D VEC curve and histogram for x1",data=sin1reg)
}

### classification example
if (FALSE) {
data(iris)
M1=mining(Species~.,iris,model="rpart",Runs=5) # decision tree (DT)
M2=mining(Species~.,iris,model="ksvm",Runs=5) # support vector machine (SVM)
L=vector("list",2); L[[1]]=M2; L[[2]]=M1
mgraph(M1,graph="ROC",TC=3,leg=-1,baseline=TRUE,Grid=10,main="ROC")
mgraph(M1,graph="ROC",TC=3,leg=-1,baseline=TRUE,Grid=10,main="ROC",intbar=FALSE)
mgraph(L,graph="ROC",TC=3,leg=c("SVM","DT"),baseline=TRUE,Grid=10,
       main="ROC for virginica")
mgraph(L,graph="LIFT",TC=3,leg=list(pos=c(0.4,0.2),leg=c("SVM","DT")),
       baseline=TRUE,Grid=10,main="LIFT for virginica")
}
