DataVisualizations (version 1.1.9)

ClassMDplot: Class MDplot for Data w.r.t. all classes

Description

Creates a Mirrored-Density plot (MD plot) for each class of a numerical data vector.

Usage

ClassMDplot(Data, Cls,
  ColorSequence = DataVisualizations::DefaultColorSequence,
  ClassNames = NULL, PlotLegend = TRUE, main = 'MDplot for each Class',
  xlab = 'Classes', ylab = 'PDE of Data per Class',
  MinimalAmoutOfData = 40, MinimalAmoutOfUniqueData = 12, SampleSize = 1e+05)

Arguments

Data

[1:n] Vector of the data to be plotted

Cls

[1:n] Vector of class identifiers for k classes (clusters); each number is the label of one class

ColorSequence

Optional: [1:k] vector of colors, one per class. Default: DataVisualizations::DefaultColorSequence

ClassNames

Optional: [1:k] vector of class names. Default: C1 to Ck, where k is the number of classes

PlotLegend

Optional: Add a legend to the plot. Default: TRUE

main

Optional: Title of the plot. Default: "MDplot for each Class"

xlab

Optional: Title of the x axis. Default: "Classes"

ylab

Optional: Title of the y axis. Default: "PDE of Data per Class"

MinimalAmoutOfData

Optional: numeric threshold on the number of data points. Below this threshold, no density estimation is performed and a jitter plot with a median line is drawn instead. Default: 40. See MDplot for details.

MinimalAmoutOfUniqueData

Optional: numeric threshold on the number of unique data values. Below this threshold, neither density estimation nor statistical testing is performed and a jitter plot is drawn instead. Default: 12. This value should only be changed by experts who understand how the density is estimated (see [Ultsch, 2005]).

SampleSize

Optional: numeric threshold. Above this threshold, class-wise uniform sampling of the finite cases is performed to shorten computation time. Default: 1e+05. Set SampleSize = n to skip the sampling.
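
The two minimum-data thresholds can be read as checks on the amount of data available for a density estimation. The following base R sketch applies them per class and is only an illustration, not the package's implementation: the helper describe_classes is hypothetical, and both the per-class reading and the interpretation of "below" as strictly fewer than the threshold are assumptions.

```r
# Hypothetical helper, NOT part of DataVisualizations: reports per class
# whether the documented thresholds would allow a density estimation.
describe_classes <- function(Data, Cls,
                             MinimalAmoutOfData = 40,
                             MinimalAmoutOfUniqueData = 12) {
  stopifnot(length(Data) == length(Cls))
  classes <- sort(unique(Cls))
  # Number of data points and of unique values in each class
  n_total <- vapply(classes, function(k) sum(Cls == k), integer(1))
  n_unique <- vapply(classes, function(k) length(unique(Data[Cls == k])), integer(1))
  data.frame(
    Class = classes,
    N = n_total,
    NUnique = n_unique,
    # Assumption: "below the threshold" means strictly fewer than the threshold
    DensityEstimated = n_total >= MinimalAmoutOfData &
      n_unique >= MinimalAmoutOfUniqueData
  )
}

set.seed(1)
# Class 1: 100 continuous values, class 2: 20 values, class 3: 90 values with 3 unique levels
Data <- c(rnorm(100), rnorm(20, mean = 3), rep(c(1, 2, 3), times = 30))
Cls <- c(rep(1, 100), rep(2, 20), rep(3, 90))
res <- describe_classes(Data, Cls)
print(res)
```

In this synthetic input, class 1 passes both checks, class 2 has too few data points, and class 3 has enough points but too few unique values.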

Value

A list, returned invisibly, with the elements

ClassData

The data frame used for plotting, with the reordered Cls

ggobject

The ggplot2 plot object
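
Because the list is returned invisibly, it has to be assigned to a variable to be inspected. The following is a minimal sketch with synthetic data (the data and the variable name out are illustrative assumptions); the call is guarded so that the snippet also runs when DataVisualizations is not installed.

```r
set.seed(42)
# Synthetic data: two classes with different means, 200 points each
Data <- c(rnorm(200, mean = 0), rnorm(200, mean = 4))
Cls <- rep(c(1, 2), each = 200)

if (requireNamespace("DataVisualizations", quietly = TRUE)) {
  # The result is returned invisibly, so assign it to keep it
  out <- DataVisualizations::ClassMDplot(Data, Cls)
  # Inspect the top-level structure of the returned list;
  # per this section it should hold ClassData and ggobject
  str(out, max.level = 1)
}
```

Since ggobject is documented as a ggplot2 object, it should be possible to modify it further with ggplot2 layers before printing it.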

Details

Further examples for the ClassMDplot can be found in https://md-plot.readthedocs.io/en/latest/application/example_application.html.

The Cls vector is reordered from the lowest to the highest number. ClassNames and ColorSequence are matched to this ordering of Cls, i.e. the lowest number gets the first color and the first class name.
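
The matching rule can be illustrated in base R. This sketch only mimics the documented ordering and does not call the package; the class names and colors are placeholders, not the values of DataVisualizations::DefaultColorSequence.

```r
# Class labels in arbitrary order of appearance
Cls <- c(3, 1, 2, 3, 1)
# Placeholder names and colors (assumptions for illustration only)
ClassNames <- c("Control", "TreatmentA", "TreatmentB")
ColorSequence <- c("red", "green", "blue")

# Sort the distinct labels from lowest to highest
ordered_labels <- sort(unique(Cls))

# The i-th sorted label is paired with the i-th name and the i-th color
mapping <- data.frame(
  Cls = ordered_labels,
  ClassName = ClassNames[seq_along(ordered_labels)],
  Color = ColorSequence[seq_along(ordered_labels)],
  stringsAsFactors = FALSE
)
print(mapping)
```

Label 1 is paired with the first name and color even though label 3 appears first in Cls, so names and colors should be supplied in ascending order of the class labels.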

References

Thrun, M. C., Gehlert, T., & Ultsch, A.: Analyzing the Fine Structure of Distributions, arXiv:1908.06081, 2019.

Thrun, M. C., Pape, F., & Ultsch, A.: Benchmarking Cluster Analysis Methods using PDE-Optimized Violin Plots, Proc. European Conference on Data Analysis (ECDA), Paderborn, Germany, 2018.

Thrun, M. C., Breuer, L., & Ultsch, A.: Knowledge discovery from low-frequency stream nitrate concentrations: hydrology and biology contributions, Proc. European Conference on Data Analysis (ECDA), Paderborn, Germany, 2018.

See Also

https://md-plot.readthedocs.io/en/latest/application/example_application.html

MDplot https://pypi.org/project/md-plot/

Examples

# Not run: Classification must be computed first (see the commented steps)
data(ITS)
# The class labels come from the AdaptGauss package; please install it from CRAN.
# Uncomment the following lines to compute Classification:
# model = AdaptGauss::AdaptGauss(ITS)
# Classification = AdaptGauss::ClassifyByDecisionBoundaries(ITS,
#   DecisionBoundaries = AdaptGauss::BayesDecisionBoundaries(model$Means, model$SDs, model$Weights))

DataVisualizations::ClassMDplot(ITS, Classification)
