DataVisualizations

Table of Contents

1. Introduction
2. Installation
3. Additional Resources
4. References

1. Introduction

“Exploratory data analysis is detective work” [Tukey, 1977, p.2]. This package provides graphical tools for finding ‘quantitative indications’ that lead to a better understanding of the data at hand. “As all detective stories remind us, many of the circumstances surrounding a crime are accidental or misleading. Equally, many of the indications to be discerned in bodies of data are accidental or misleading” [Tukey, 1977, p.3]. The solution is to compare many different graphical tools with the goal of finding agreement or generating a hypothesis, and then to confirm it with statistical methods. This package serves as a starting point.

The DataVisualizations package offers various visualization methods and graphical tools for data analysis, including:

  • Synoptic visualizations of data: Synoptic visualization methods such as Pixelmatrices.
  • Distribution analysis and visualization: Visual distribution analysis for one- or higher-dimensional data, including MD Plots and PDE (Pareto Density Estimation).
  • Spatial visualizations: Spatial visualizations such as choropleth maps.
  • Visual analysis of Clusters, Correlation, Distances and Projections: Visual analysis of clusters such as Silhouette plots, or visual projection analysis with Shepard diagrams.
  • Other visualizations: For example ABC-Barplots, Errorplots and more.

Examples of synoptic visualizations:

Get a synoptic view of the data with a pixel matrix:

library(DataVisualizations)
data("Lsun3D")
Pixelmatrix(Lsun3D$Data)

The Pixelmatrix can also be used as a shortcut for visualizing correlations between many variables:

n=nrow(Lsun3D$Data)
# every added column must have n rows, so draw n values from each distribution
Data=cbind(Lsun3D$Data,runif(n),rnorm(n),rt(n,2),rlnorm(n),rchisq(n,2))
Header=c('x','y','z','uniform','gauss','t','log-normal','chi')
cc=cor(Data,method='spearman')
# zero the diagonal so the trivial self-correlations do not dominate the colors
diag(cc)=0
Pixelmatrix(cc,YNames = Header,XNames = Header,main = 'Spearman Coeffs')
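
The idea behind this correlation view can be sketched in base R alone. The block below is only an illustration under stated assumptions: the toy data and the names `toy` and `cc` are hypothetical, and `image()` serves as a stand-in for Pixelmatrix rather than as its implementation.

```r
# Hypothetical toy data: three related and two unrelated variables
set.seed(42)
n <- 200
x <- rnorm(n)
toy <- cbind(x = x,
             y = x + rnorm(n, sd = 0.1),   # strongly correlated with x
             z = -x + rnorm(n, sd = 0.1),  # strongly anti-correlated with x
             uniform = runif(n),
             t = rt(n, 2))

# Spearman rank correlations; zero the diagonal so the trivial
# self-correlations do not dominate the color scale
cc <- cor(toy, method = "spearman")
diag(cc) <- 0

# Base-R stand-in for a pixel matrix: one colored cell per coefficient,
# with the first row of cc drawn at the top
image(seq_len(ncol(cc)), seq_len(nrow(cc)), t(cc)[, nrow(cc):1],
      axes = FALSE, xlab = "", ylab = "", main = "Spearman Coeffs")
axis(1, at = seq_len(ncol(cc)), labels = colnames(cc))
axis(2, at = seq_len(nrow(cc)), labels = rev(rownames(cc)), las = 1)
```

In such a plot the cells for the pairs (x, y) and (x, z) stand out at opposite ends of the color scale, while the cells involving the unrelated variables stay near the middle.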

Examples of distribution analysis:

InspectVariable provides a summary of the most important plots for one-dimensional distribution analysis, such as the histogram, continuous data density estimation, QQ-Plot, and Boxplot:

data(ITS)
InspectVariable(ITS)

The MD Plot can be used for visualizing the densities of several variables. It combines the syntax of ggplot2 with the Pareto density estimation and additional functionality useful from the Data Scientist’s point of view:

library(ggplot2) # provides ylim() and ggtitle()
data(MTY)
Data=cbind(ITS,MTY)
MDplot(Data)+ylim(0,6000)+ggtitle('Two Features with MTY Capped')

Create density scatter plots in 2D:

DensityScatter(ITS, MTY, xlab = 'ITS in EUR', ylab ='MTY in EUR', xlim = c(0,1200), ylim = c(0,15000), main='Pareto Density Estimation indicates Bimodality')

Examples of visual cluster analysis:

The heatmap of the distances, ordered by clusters, gives a synoptic view of the intra- and intercluster distances. Examples and interpretations of Heatmaps and Silhouette plots are presented in [Thrun, 2018A, 2018B].

data("Lsun3D")
Heatmap(Lsun3D$Data,Lsun3D$Cls,method = 'euclidean')

Plot the Silhouette plot of a clustering:

Silhouetteplot(Lsun3D$Data,Lsun3D$Cls,PlotIt = TRUE)

InspectDistances shows the most important plots of the distribution of distances of the data. The distance distribution in the input space can be bimodal, indicating a distinction between the inter- versus intracluster distances. This can serve as an indication of distance-based cluster structures (see [Thrun, 2018A, 2018B]).

InspectDistances(Lsun3D$Data,method="euclidean")
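
Why a bimodal distance distribution hints at cluster structure can be illustrated with base R alone. The following is a minimal sketch under stated assumptions: the two Gaussian clusters and the names `pts`, `cls`, `intra`, and `inter` are hypothetical, and the code does not reproduce what InspectDistances computes internally.

```r
set.seed(1)
# Hypothetical data: two well-separated Gaussian clusters in 2D
a <- matrix(rnorm(100, mean = 0, sd = 0.5), ncol = 2)
b <- matrix(rnorm(100, mean = 5, sd = 0.5), ncol = 2)
pts <- rbind(a, b)
cls <- rep(c(1, 2), each = nrow(a))

# All pairwise Euclidean distances; use the lower triangle so that
# every pair of points is counted exactly once
d <- as.matrix(dist(pts, method = "euclidean"))
lower <- lower.tri(d)
same_cluster <- outer(cls, cls, "==")

intra <- d[lower & same_cluster]   # distances within a cluster
inter <- d[lower & !same_cluster]  # distances between the clusters

# The histogram shows two modes: short intracluster distances
# and long intercluster distances
hist(d[lower], breaks = 50, main = "Distance distribution",
     xlab = "Euclidean distance")
```

Comparing `summary(intra)` with `summary(inter)` shows the two groups of distances that form the two modes of the histogram.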

2. Installation

Installation using CRAN

Install automatically with all dependencies via:

install.packages("DataVisualizations",dependencies = TRUE)

Installation using Github

Please note that dependencies have to be installed manually.

remotes::install_github("Mthrun/DataVisualizations")

Installation using R Studio

Please note that dependencies have to be installed manually.

Tools -> Install Packages -> Repository (CRAN) -> DataVisualizations

3. Additional Resources

  • View package on CRAN

Tutorial Examples

The tutorial with several examples can be found in the vignette on CRAN:

Vignette

Manual

The full manual for users or developers is available here: Package documentation

4. References

[Thrun, 2018A] Thrun, M. C.: Projection Based Clustering through Self-Organization and Swarm Intelligence, doctoral dissertation 2017, Springer, Heidelberg, ISBN: 978-3-658-20539-3, https://doi.org/10.1007/978-3-658-20540-9, 2018.

[Thrun, 2018B] Thrun, M. C.: Cluster Analysis of Per Capita Gross Domestic Products, Entrepreneurial Business and Economics Review (EBER), Vol. 7(1), pp. 217-231, https://doi.org/10.15678/EBER.2019.070113, 2019.

[Thrun/Ultsch, 2018] Thrun, M. C., & Ultsch, A.: Effects of the payout system of income taxes to municipalities in Germany, in Papiez, M. & Smiech, S. (eds.), Proc. 12th Professor Aleksander Zelias International Conference on Modelling and Forecasting of Socio-Economic Phenomena, pp. 533-542, Cracow: Foundation of the Cracow University of Economics, Cracow, Poland, 2018.

[Thrun et al., 2020] Thrun, M. C., Gehlert, T. & Ultsch, A.: Analyzing the Fine Structure of Distributions, PLoS ONE, Vol. 15(10), pp. 1-66, https://doi.org/10.1371/journal.pone.0238835, 2020.

[Tukey, 1977] Tukey, J. W.: Exploratory data analysis, United States Addison-Wesley Publishing Company, ISBN: 0-201-07616-0, 1977.

Package Information

  • Version: 1.3.5
  • License: GPL-3
  • Maintainer: Michael Thrun
  • Last Published: August 24th, 2025
  • Monthly Downloads: 946

Functions in DataVisualizations (1.3.5)

  • DataVisualizations-package: tools:::Rd_package_title("DataVisualizations")
  • DefaultColorSequence: Default color sequence for plots
  • CombineCols: Combine vectors of various lengths
  • DensityScatter: Scatter plot with densities
  • DensityContour: Contour plot of densities
  • DualaxisLinechart: DualaxisLinechart
  • InspectBoxplots: Inspect Boxplots
  • GoogleMapsCoordinates: Google Maps with marked coordinates
  • ITS: Income Tax Share
  • DrawWorldWithCls: Plot a classified world map
  • Fanplot: The fan plot
  • DualaxisClassplot: Dualaxis Classplot
  • FundamentalData_Q1_2018: Fundamental Data of the 1st Quarter in 2018
  • HeatmapColors: Default color sequence for plots
  • Heatmap: Heatmap for Clustering
  • Lsun3D: Lsun3D inspired by FCPS [Thrun/Ultsch, 2020] introduced in [Thrun, 2018]
  • InspectScatterplots: Pairwise scatterplots and optimal histograms
  • InspectCorrelation: Inspect the Correlation
  • InspectDistances: Inspection of Distance-Distribution
  • MDplot: Mirrored Density plot (MD-plot)
  • MAplot: Minus versus Add plot
  • JitterUniqueValues: Jitters Unique Values
  • InspectVariable: Visualization of Distribution of one variable
  • InspectStandardization: QQplot of Data versus Normalized Data
  • OptimalNoBins: Optimal Number Of Bins
  • Multiplot: Plot multiple ggplot objects in one panel
  • ParetoRadius_fast: Fast ParetoRadius for distributions
  • MDplot4multiplevectors: Mirrored Density plot (MD-plot) for Multiple Vectors
  • PmatrixColormap: P-Matrix colors
  • QQplot: QQplot with a Linear Fit
  • ROC: ROC plot
  • Piechart: The pie chart
  • MTY: Municipal Income Tax Yield
  • PlotProductratio: Product-Ratio Plot
  • Meanrobust: Robust Empirical Mean Estimation
  • ParetoRadius: ParetoRadius for distributions
  • ParetoDensityEstimation: Pareto Density Estimation V3
  • PlotMissingvalues: Plot of the Amount Of Missing Values
  • PlotGraph2D: PlotGraph2D
  • ShepardDensityscatter: Shepard PDE scatter
  • Sheparddiagram: Draws a Shepard Diagram
  • SignedLog: Signed Log
  • Worldmap: Plots a world map by country codes
  • Stdrobust: Standard Deviation Robust
  • zplot: Plotting for 3-dimensional data
  • Silhouetteplot: Silhouette plot of classified data
  • PDEnormrobust: PDEnormrobust
  • PDEplot: PDE plot
  • stat_pde_density: Calculate Pareto density estimation for ggplot2 plots
  • world_country_polygons: world_country_polygons
  • estimateDensity2D: estimateDensity2D
  • RobustNormalization: RobustNormalization
  • categoricalVariable: A categorical Feature
  • RobustNorm_BackTrafo: Transforms the Robust Normalization back
  • StatPDEdensity: Pareto Density Estimation
  • Plot3D: 3D plot of points
  • Pixelmatrix: Plot of a Pixel Matrix
  • Slopechart: Slope Chart
  • ChoroplethPostalCodesAndAGS_Germany: Postal Codes and AGS of Germany for a Choropleth Map
  • ClassMDplot: Class MDplot for Data w.r.t. all classes
  • CCDFplot: Plot the Complementary Cumulative Distribution Function (CCDF) in Log/Log; uses ecdf, CCDF(x) = 1-cdf(x)
  • ClassBoxplot: Creates a Boxplot for all classes
  • AccountingInformation_PrimeStandard_Q3_2019: Accounting Information in the Prime Standard in Q3 in 2019 (AI_PS_Q3_2019)
  • ClassBarPlot: ClassBarPlot
  • ClassPDEplot: PDE Plot for all classes
  • ClassPDEplotMaxLikeli: Create PDE plot for all classes with maximum likelihood
  • ABCbarplot: Barplot with Sorted Data Colored by ABCanalysis
  • Choroplethmap: Plots the Choropleth Map
  • BimodalityAmplitude: Bimodality Amplitude
  • ClassErrorbar: ClassErrorbar
  • Classplot: Classplot
  • CombineRows: Combine matrices of various lengths
  • Crosstable: Crosstable plot
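
The CCDFplot entry in the list above describes the quantity it plots as CCDF(x) = 1-cdf(x), computed with ecdf. A minimal base-R sketch of that relationship follows. The log-normal sample and the names `Fn`, `xs`, and `ccdf` are hypothetical, and the code is not taken from CCDFplot itself.

```r
set.seed(7)
x <- rlnorm(1000)          # hypothetical heavy-tailed, strictly positive sample
Fn <- ecdf(x)              # empirical CDF from the stats package
xs <- sort(unique(x))
ccdf <- 1 - Fn(xs)         # CCDF(x) = 1 - cdf(x)

# Drop the largest observation, where the CCDF is 0 and has no logarithm
keep <- ccdf > 0
plot(xs[keep], ccdf[keep], log = "xy", type = "s",
     xlab = "x", ylab = "CCDF(x)", main = "Empirical CCDF (log/log)")
```

On log/log axes a power-law tail would appear as a roughly straight line, which is the usual reason for choosing this scaling.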