DataVisualizations

Table of Contents

1. Introduction
2. Installation
3. Additional Resources
4. References

1. Introduction

“Exploratory data analysis is detective work” [Tukey, 1977, p.2]. This package provides graphical tools for finding ‘quantitative indications’ that lead to a better understanding of the data at hand. “As all detective stories remind us, many of the circumstances surrounding a crime are accidental or misleading. Equally, many of the indications to be discerned in bodies of data are accidental or misleading” [Tukey, 1977, p.3]. The solution is to compare many different graphical tools, with the goal of either finding agreement among them or generating a hypothesis that is then confirmed with statistical methods. This package serves as a starting point.

The DataVisualizations package offers various visualization methods and graphical tools for data analysis, including:

  • Synoptic visualizations of data: overview methods such as the Pixelmatrix.
  • Distribution analysis and visualization: visual distribution analysis for one- or higher-dimensional data, including the MD plot and Pareto density estimation (PDE).
  • Spatial visualizations: for example, choropleth maps.
  • Visual analysis of clusters, correlations, distances and projections: for example, Silhouette plots for clusters and Shepard diagrams for projections.
  • Other visualizations: for example, ABC barplots, error plots and more.

Examples of synoptic visualizations:

Get a synoptic view of the data with a Pixelmatrix:

library(DataVisualizations)
data("Lsun3D")
Pixelmatrix(Lsun3D$Data)

The Pixelmatrix can also be used as a shortcut for visualizing correlations between many variables:

n=nrow(Lsun3D$Data)
Data=cbind(Lsun3D$Data,runif(n),rnorm(n),rt(n,2),rlnorm(n),rchisq(n,2))
Header=c('x','y','z','uniform','gauss','t','log-normal','chi')
cc=cor(Data,method='spearman')
diag(cc)=0
Pixelmatrix(cc,YNames = Header,XNames = Header,main = 'Spearman Coeffs')
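
Why the diagonal is set to zero can be seen in a small base R sketch. It assumes simulated data in place of Lsun3D and uses image() from base graphics as a stand-in for Pixelmatrix; all variable names are illustrative.

```r
# Sketch with simulated data; image() stands in for Pixelmatrix here.
set.seed(1)
n <- 200
x <- rnorm(n)
Data <- cbind(x = x, y = x + rnorm(n, sd = 0.5), u = runif(n), t = rt(n, 2))

# Spearman rank correlation is less sensitive to the heavy tails of 't'.
cc <- cor(Data, method = "spearman")

# Largest absolute correlation between two different variables.
off_diag_max <- max(abs(cc[upper.tri(cc)]))

# The diagonal is always 1 and would dominate the color scale, so zero it.
diag(cc) <- 0

# Draw the matrix with row 1 at the top, as in a printed matrix.
image(seq_len(ncol(cc)), seq_len(nrow(cc)), t(cc)[, nrow(cc):1],
      axes = FALSE, xlab = "", ylab = "", main = "Spearman coefficients")
axis(1, at = seq_len(ncol(cc)), labels = colnames(cc))
axis(2, at = seq_len(nrow(cc)), labels = rev(rownames(cc)))
```

After zeroing, the largest absolute entry of the matrix is the strongest correlation between two distinct variables, so the color range is spent on the values of interest.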

Examples of distribution analysis:

InspectVariable combines the most important plots for one-dimensional distribution analysis (histogram, continuous density estimation, QQ-plot, and boxplot) in a single view:

data(ITS)
InspectVariable(ITS)
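
The four views can be approximated with base graphics, as in the following sketch. It assumes a simulated bimodal sample instead of ITS and a Gaussian kernel density estimate instead of the package's own density estimation, so the panels will differ in detail from what InspectVariable draws.

```r
# Sketch with simulated data; base graphics approximate the four panels.
set.seed(42)
x <- c(rnorm(300, mean = 0), rnorm(200, mean = 5))  # bimodal sample

old_par <- par(mfrow = c(2, 2))
hist(x, breaks = "FD", main = "Histogram")          # Freedman-Diaconis bins
dens <- density(x)                                  # Gaussian kernel estimate
plot(dens, main = "Density estimate")
qqnorm(x, main = "QQ-plot against normal")
qqline(x)
boxplot(x, horizontal = TRUE, main = "Boxplot")
par(old_par)

# Crude mode count: local maxima of the density curve that reach
# at least 10% of the highest peak.
is_peak <- diff(sign(diff(dens$y))) == -2
inner_y <- dens$y[-c(1, length(dens$y))]
n_modes <- sum(is_peak & inner_y > 0.1 * max(dens$y))
```

The boxplot alone would hide the two modes of this sample, which is one reason to look at several views side by side.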

The MD plot visualizes the densities of several variables at once. It combines the syntax of ggplot2 with Pareto density estimation and adds functionality that is useful from a data scientist’s point of view:

library(ggplot2) # provides ylim() and ggtitle()
data(MTY)
Data=cbind(ITS,MTY)
MDplot(Data)+ylim(0,6000)+ggtitle('Two Features with MTY Capped')

Create density scatter plots in 2D:

DensityScatter(ITS, MTY, xlab = 'ITS in EUR', ylab ='MTY in EUR', xlim = c(0,1200), ylim = c(0,15000), main='Pareto Density Estimation indicates Bimodality')

Examples of visual cluster analysis:

The heatmap of the distances, ordered by cluster, gives a synoptic view of the intra- and intercluster distances. Examples and interpretations of heatmaps and Silhouette plots are presented in [Thrun, 2018A, 2018B].

data("Lsun3D")
Heatmap(Lsun3D$Data,Lsun3D$Cls,method = 'euclidean')
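
The idea of ordering a distance matrix by cluster can be sketched in base R. The example assumes two simulated Gaussian clusters in place of Lsun3D and uses image() as a stand-in for the package's Heatmap; the names below are illustrative.

```r
# Sketch with simulated data; image() stands in for the package's Heatmap.
set.seed(7)
Data <- rbind(matrix(rnorm(60, mean = 0), ncol = 2),
              matrix(rnorm(60, mean = 6), ncol = 2))
Cls <- rep(c(1, 2), each = 30)

# Shuffle the rows so the input order carries no cluster information.
perm <- sample(nrow(Data))
Data <- Data[perm, ]
Cls <- Cls[perm]

# Pairwise Euclidean distances, then reorder rows and columns by cluster.
D <- as.matrix(dist(Data, method = "euclidean"))
ord <- order(Cls)
D_ordered <- D[ord, ord]

# Blocks along the diagonal correspond to small intracluster distances.
image(D_ordered[, nrow(D_ordered):1], axes = FALSE,
      main = "Distances ordered by cluster")

# Summaries of the two kinds of distances.
same <- outer(Cls[ord], Cls[ord], "==")
mean_intra <- mean(D_ordered[same & upper.tri(D_ordered)])
mean_inter <- mean(D_ordered[!same])
```

Without the reordering step the same matrix looks like noise, because rows of the two clusters are interleaved.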

Silhouette plot of the clustering:

Silhouetteplot(Lsun3D$Data,Lsun3D$Cls,PlotIt = TRUE)
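
What a Silhouette plot displays can be illustrated by computing the silhouette widths directly from their definition. This is a minimal base R sketch on simulated data; silhouette_width is a hypothetical helper written for this example and is not part of the package.

```r
# Sketch with simulated data; silhouette widths computed from the definition
# s(i) = (b - a) / max(a, b).
set.seed(3)
Data <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
              matrix(rnorm(40, mean = 5), ncol = 2))
Cls <- rep(c(1, 2), each = 20)
D <- as.matrix(dist(Data))

# Hypothetical helper: silhouette width of point i.
silhouette_width <- function(i, D, Cls) {
  own <- Cls == Cls[i]
  own[i] <- FALSE                        # exclude the point itself
  a <- mean(D[i, own])                   # mean distance within own cluster
  others <- setdiff(unique(Cls), Cls[i])
  # mean distance to the nearest other cluster
  b <- min(sapply(others, function(k) mean(D[i, Cls == k])))
  (b - a) / max(a, b)
}

s <- sapply(seq_len(nrow(D)), silhouette_width, D = D, Cls = Cls)

# One bar per point, grouped by cluster and sorted within each cluster.
ord <- order(Cls, -s)
barplot(s[ord], col = Cls[ord] + 1, border = NA, horiz = TRUE,
        xlab = "Silhouette width", main = "Silhouette plot")
```

Widths close to 1 indicate points that sit well inside their cluster, while negative widths indicate points that are closer on average to another cluster.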

InspectDistances shows the most important plots of the distribution of the distances in the data. The distance distribution in the input space can be bimodal, indicating a distinction between inter- and intracluster distances. This can serve as an indication of distance-based cluster structures (see [Thrun, 2018A, 2018B]).

InspectDistances(Lsun3D$Data,method="euclidean")
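
The bimodality argument can be checked on a toy example. The sketch below assumes two simulated, well-separated clusters and uses hist() instead of the package's plots; it splits the pairwise distances by whether both points share a cluster.

```r
# Sketch with simulated data; hist() stands in for the package's plots.
set.seed(11)
Data <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
              matrix(rnorm(100, mean = 8), ncol = 2))
Cls <- rep(c(1, 2), each = 50)

d <- as.vector(dist(Data, method = "euclidean"))  # all pairwise distances
pairs <- combn(nrow(Data), 2)                     # same pair order as dist()
same_cluster <- Cls[pairs[1, ]] == Cls[pairs[2, ]]

hist(d, breaks = 40, xlab = "Euclidean distance",
     main = "Distribution of pairwise distances")

# The lower mode stems from intracluster pairs, the upper from intercluster pairs.
intra <- d[same_cluster]
inter <- d[!same_cluster]
```

For data without cluster structure, or with clusters that are not separable by distance, the two groups overlap and the histogram is typically unimodal.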

2. Installation

Installation using CRAN

Install automatically with all dependencies via:

install.packages("DataVisualizations",dependencies = TRUE)

Installation using GitHub

Please note that dependencies have to be installed manually.

remotes::install_github("Mthrun/DataVisualizations")
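
Before installing from GitHub, it can help to check which required packages are still missing. The following is a minimal sketch: missing_packages is a hypothetical helper, and the package names passed to it are only examples taken from this page, not the full dependency list, which is given in the package's DESCRIPTION file.

```r
# Hypothetical helper: return the names of packages that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Example names only; consult the DESCRIPTION file for the real dependencies.
to_install <- missing_packages(c("remotes", "ggplot2"))
if (length(to_install) > 0) {
  message("Install first: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install from CRAN
}
```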

Installation using RStudio

Please note that dependencies have to be installed manually.

Tools -> Install Packages -> Repository (CRAN) -> DataVisualizations

3. Additional Resources

  • View package on CRAN

Tutorial Examples

The tutorial with several examples can be found in the vignette on CRAN:

Vignette

Manual

The full manual for users or developers is available here: Package documentation

4. References

[Thrun, 2018A] Thrun, M. C.: Projection Based Clustering through Self-Organization and Swarm Intelligence, doctoral dissertation 2017, Springer, Heidelberg, ISBN: 978-3-658-20539-3, https://doi.org/10.1007/978-3-658-20540-9, 2018.

[Thrun, 2018B] Thrun, M. C.: Cluster Analysis of Per Capita Gross Domestic Products, Entrepreneurial Business and Economics Review (EBER), Vol. 7(1), pp. 217-231, https://doi.org/10.15678/EBER.2019.070113, 2019.

[Thrun/Ultsch, 2018] Thrun, M. C., & Ultsch, A.: Effects of the payout system of income taxes to municipalities in Germany, in Papiez, M. & Smiech, S. (eds.), Proc. 12th Professor Aleksander Zelias International Conference on Modelling and Forecasting of Socio-Economic Phenomena, pp. 533-542, Cracow: Foundation of the Cracow University of Economics, Cracow, Poland, 2018.

[Thrun et al., 2020] Thrun, M. C., Gehlert, T. & Ultsch, A.: Analyzing the Fine Structure of Distributions, PLoS ONE, Vol. 15(10), pp. 1-66, https://doi.org/10.1371/journal.pone.0238835, 2020.

[Tukey, 1977] Tukey, J. W.: Exploratory data analysis, Addison-Wesley Publishing Company, United States, ISBN: 0-201-07616-0, 1977.

Monthly Downloads: 1,156
Version: 1.4.0
License: GPL-3
Maintainer: Michael Thrun
Last Published: October 31st, 2025

Functions in DataVisualizations (1.4.0)

  • DensityScatter: Scatter plot with densities
  • DefaultColorSequence: Default color sequence for plots
  • DensityContour: Contour plot of densities
  • DrawWorldWithCls: Plot a classified world map
  • DualaxisClassplot: Dualaxis Classplot
  • DataVisualizations-package: Package overview
  • Crosstable: Crosstable plot
  • Classplot: Classplot
  • CombineRows: Combine matrices of various lengths
  • CombineCols: Combine vectors of various lengths
  • HeatmapColors: Default color sequence for plots
  • FundamentalData_Q1_2018: Fundamental Data of the 1st Quarter in 2018
  • GoogleMapsCoordinates: Google Maps with marked coordinates
  • InspectCorrelation: Inspect the Correlation
  • Fanplot: The fan plot
  • DualaxisLinechart: DualaxisLinechart
  • InspectBoxplots: Inspect Boxplots
  • MDstrips: High-dimensional Density Strips based on Pareto Density Estimation
  • ITS: Income Tax Share
  • InspectDistances: Inspection of Distance-Distribution
  • Heatmap: Heatmap for Clustering
  • MDplot4multiplevectors: Mirrored Density plot (MD-plot) for Multiple Vectors
  • Lsun3D: Lsun3D inspired by FCPS [Thrun/Ultsch, 2020] introduced in [Thrun, 2018]
  • MTY: Municipal Income Tax Yield
  • MDplot: Mirrored Density plot (MD-plot)
  • JitterUniqueValues: Jitters Unique Values
  • InspectStandardization: QQplot of Data versus Normalized Data
  • InspectScatterplots: Pairwise scatterplots and optimal histograms
  • InspectVariable: Visualization of Distribution of one variable
  • MAplot: Minus versus Add plot
  • ParetoDensityEstimation: Pareto Density Estimation V3
  • PDEstrip: 1D Density Strip based on Pareto Density Estimation (PDE)
  • OptimalNoBins: Optimal Number Of Bins
  • OpposingViolinBiclassPlot: OpposingViolinBiclassPlot
  • PDEnormrobust: PDEnormrobust
  • ParetoRadius_fast: Fast ParetoRadius for distributions
  • PDEplot: PDE plot
  • ParetoRadius: ParetoRadius for distributions
  • Multiplot: Plot multiple ggplot objects in one panel
  • Meanrobust: Robust Empirical Mean Estimation
  • Plot3D: 3D plot of points
  • PlotMissingvalues: Plot of the Amount Of Missing Values
  • ROC: ROC plot
  • PmatrixColormap: P-Matrix colors
  • PlotGraph2D: PlotGraph2D
  • QQplot: QQplot with a Linear Fit
  • Piechart: The pie chart
  • PlotProductratio: Product-Ratio Plot
  • Pixelmatrix: Plot of a Pixel Matrix
  • RobustNorm_BackTrafo: Transforms the Robust Normalization back
  • Worldmap: Plots a world map by country codes
  • categoricalVariable: A categorical Feature.
  • RobustNormalization: RobustNormalization
  • SignedLog: Signed Log
  • ShepardDensityscatter: Shepard PDE scatter
  • Sheparddiagram: Draws a Shepard Diagram
  • Slopechart: Slope Chart
  • Stdrobust: Standard Deviation Robust
  • StatPDEdensity: Pareto Density Estimation
  • Silhouetteplot: Silhouette plot of classified data.
  • zplot: Plotting for 3 dimensional data
  • world_country_polygons: world_country_polygons
  • estimateDensity2D: estimateDensity2D
  • stat_pde_density: Calculate Pareto density estimation for ggplot2 plots
  • ClassPDEplot: PDE Plot for all classes
  • AccountingInformation_PrimeStandard_Q3_2019: Accounting Information in the Prime Standard in Q3 in 2019 (AI_PS_Q3_2019)
  • CCDFplot: Plot Complementary Cumulative Distribution Function (CCDF) in Log/Log; uses ecdf, CCDF(x) = 1-cdf(x)
  • ClassPDEplotMaxLikeli: Create PDE plot for all classes with maximum likelihood
  • ClassMDplot: Class MDplot for Data w.r.t. all classes
  • ClassBoxplot: Creates Boxplot plot for all classes
  • ClassBarPlot: ClassBarPlot
  • ClassErrorbar: ClassErrorbar
  • ABCbarplot: Barplot with Sorted Data Colored by ABCanalysis
  • BimodalityAmplitude: Bimodality Amplitude