# Tools for Descriptive Statistics and Exploratory Data Analysis

DescTools is an extensive collection of miscellaneous basic statistics functions and convenience wrappers that are not available in base R, aimed at the efficient description of data. The author’s intention was to create a toolbox that facilitates the (notoriously time-consuming) first descriptive tasks in data analysis: calculating descriptive statistics, drawing graphical summaries, and reporting the results. The package also contains functions to produce documents using MS Word (or PowerPoint) and functions to import data from Excel.

A considerable part of the included functions can be found scattered across other packages and other sources, written partly by Titans of R. The primary reason for collecting them here was to have them consolidated in ONE package instead of dozens (which themselves might depend on other packages that are not needed at all), and to provide a common and consistent interface as far as function and argument naming, NA handling, recycling rules, etc. are concerned. The Google style guide was used for naming rules (in the absence of convincing alternatives). The ‘CamelStyle’ was consequently applied to functions borrowed from contributed R packages as well.

Feedback, feature requests, bug reports and other suggestions are welcome! Please report problems to the GitHub issue tracker (preferred), on Stack Overflow mentioning DescTools, or directly to the maintainer.

## Installation

You can install the released version of DescTools from CRAN with:

```r
install.packages("DescTools")
```


And the development version from GitHub with:

```r
if (!require("remotes")) install.packages("remotes")
remotes::install_github("AndriSignorell/DescTools")
```


# Warning

This package is still under development. Although the code now seems quite stable, until the release of version 1.0 you should be aware that everything in the package might be subject to change. Backward compatibility is not yet guaranteed. Functions may be deleted or renamed, and new syntax may be inconsistent with earlier versions. With the release of version 1.0, the “deprecated-defunct process” will be introduced.

# MS-Office

To make use of the MS-Office features, you must have Office installed in one of its variants. All Wrd*, XL* and Pp* functions also require the package RDCOMClient, so their use is restricted to Windows systems. RDCOMClient can be installed with:

```r
install.packages("RDCOMClient", repos="http://www.omegahat.net/R")
```


The omegahat repository does not benefit from the same update service as CRAN, so you may be forced to install a package compiled with an earlier version of R, which is usually not a problem. For example, for R 3.6.x:

```r
url <- "http://www.omegahat.net/R/bin/windows/contrib/3.5.1/RDCOMClient_0.93-0.zip"
install.packages(url, repos = NULL, type = "binary")
```


RDCOMClient is not available for macOS or Linux, sorry.
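Because the Office functions only work on Windows with RDCOMClient installed, scripts meant to run on several platforms may want to check for both before calling them. The following is a minimal sketch using base R only; `office_available` is a hypothetical helper name chosen for illustration and is not part of DescTools.

```r
# Minimal sketch: check whether the Office-related functions can be used.
# `office_available` is a hypothetical helper, not a DescTools function.
office_available <- function() {
  # The Wrd*, XL* and Pp* functions are restricted to Windows ...
  is_windows <- .Platform$OS.type == "windows"
  # ... and need the RDCOMClient package to be installed.
  has_com <- requireNamespace("RDCOMClient", quietly = TRUE)
  is_windows && has_com
}

if (office_available()) {
  message("RDCOMClient found: Wrd*, XL* and Pp* functions can be used.")
} else {
  message("Office automation unavailable here; skipping Wrd*, XL* and Pp* calls.")
}
```

This check does not verify that Office itself is installed; it only covers the operating system and the RDCOMClient dependency.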

# Authors

Andri Signorell
Helsana Versicherungen AG, Health Sciences, Zurich
HWZ University of Applied Sciences in Business Administration Zurich.

R is a community project. This can be seen from the fact that this package includes R source code and/or documentation previously published by various authors and contributors. Special thanks go to Beat Bruengger, Mathias Frueh and Daniel Wollschlaeger for their valuable contributions and testing. The good things come from all these guys; any problems are likely due to my tweaking. Thank you all!

Maintainer: Andri Signorell

# Examples

```r
library(DescTools)

demo(describe, package = "DescTools")

demo(plots, package = "DescTools")
```

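Beyond the demos, a first descriptive pass over a data set might look like the sketch below. It assumes DescTools is installed and uses the bundled `d.pizza` data set with its `temperature` and `driver` columns; the choice of variables and arguments is only illustrative.

```r
library(DescTools)

# d.pizza is a sample data set shipped with DescTools
temp <- d.pizza$temperature

# Compact univariate description; plotit = FALSE suppresses the graphic
Desc(temp, plotit = FALSE)

# Frequency table for a categorical variable
Freq(d.pizza$driver)

# Mean with a 95% confidence interval, ignoring missing values
ci <- MeanCI(temp, conf.level = 0.95, na.rm = TRUE)
ci
```

Setting `plotit = TRUE` instead would also draw the graphical summary for the variable.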

## Functions in DescTools

| Name | Description |
| --- | --- |
| AllIdentical | Test Multiple Objects for Exact Equality |
| AUC | Area Under the Curve |
| AddMonths | Add a Month to a Date |
| AndersonDarlingTest | Anderson-Darling Test of Goodness-of-Fit |
| AllDuplicated | Index Vector of All Values Involved in Ties |
| AddMonthsYM | Add a Month to a Date in YearMonth Format |
| Abstract | Display Compact Abstract of a Data Frame |
| BartelsRankTest | Bartels Rank Test of Randomness |
| BarText | Add the Value Labels to a Barplot |
| Arrow | Insert an Arrow Into a Plot |
| Base Conversions | Converts Numbers From Binmode, Octmode or Hexmode to Decimal and Vice Versa |
| Atkinson | Atkinson Index - A Measure of Inequality. |
| Abind | Combine Multidimensional Arrays |
| Append | Append Elements to Objects |
| AscToChar | Convert ASCII Codes to Characters and Vice Versa |
| AxisBreak | Place a Break Mark on an Axis |
| Agree | Raw Simple And Extended Percentage Agreement |
| BarnardTest | Barnard's Unconditional Test |
| BoxCoxLambda | Automatic Selection of Box Cox Transformation Parameter |
| Benford | Benford's Distribution |
| CatTable | Function to write a table |
| CartToPol | Transform Cartesian to Polar/Spherical Coordinates and Vice Versa |
| CochranQTest | Cochran's Q test |
| BinomCI | Confidence Intervals for Binomial Proportions |
| Between, Outside | Operators To Check, If a Value Lies Within Or Outside a Given Range |
| BrierScore | Brier Score for Assessing Prediction Accuracy |
| CoefVar | Coefficient of Variation |
| BubbleLegend | Add a Legend to a Bubble Plot |
| CochranArmitageTest | Cochran-Armitage Test for Trend |
| BreslowDayTest | Breslow-Day Test for Homogeneity of the Odds Ratios |
| BoxedText | Add Text in a Box to a Plot |
| CohenD | Cohen's Effect Size |
| BinomRatioCI | Confidence Intervals for the Ratio of Binomial and Multinomial Proportions |
| BinTree | Binary Tree |
| Conf | Confusion Matrix And Associated Statistics |
| CentralValue | Obtain statistic of centrality |
| ConDisPairs | Concordant and Discordant Pairs |
| CountCompCases | Count Complete Cases |
| Asp | Get Aspect Ratio of the Current Plot |
| BinomDiffCI | Confidence Interval for a Difference of Binomials |
| CorPolychor | Polychoric Correlation |
| Association measures | Cramer's V, Pearson's Contingency Coefficient and Phi Coefficient Yule's Q and Y, Tschuprow's T |
| ConnLines | Add Connection Lines to a Barplot |
| ConoverTest | Conover's Test of Multiple Comparisons |
| Cor | Covariance and Correlation (Matrices) |
| Closest | Find the Closest Value |
| Clockwise | Calculates Begin and End Angle From a List of Given Angles in Clockwise Mode |
| CramerVonMisesTest | Cramer-von Mises Test for Normality |
| CorPart | Find the Correlations for a Set x of Variables With Set y Removed |
| BreuschGodfreyTest | Breusch-Godfrey Test |
| Assocs | Association Measures |
| Coalesce | Return the First Element Not Being NA |
| ColToGrey | Convert Colors to Grey/Grayscale |
| CohenKappa | Cohen's Kappa and Weighted Kappa |
| CutQ | Create a Factor Variable Using the Quantiles of a Continuous Variable |
| CombPairs | Get All Pairs Out of One or Two Sets of Elements |
| Dot | Scalar Product |
| ColumnWrap | Column Wrap |
| ColorLegend | Add a ColorLegend to a Plot |
| Desc | Describe Data |
| DenseRank | Dense Ranks and Percent Ranks |
| CrossN | n-dimensional Vector Cross Product |
| CompleteColumns | Find Complete Columns |
| DrawArc | Draw Elliptic Arc(s) |
| Datasets for Simulation | Datasets for Probabilistic Simulation |
| DegToRad | Convert Degrees to Radians and Vice Versa |
| DivCoef | Rao's Diversity Coefficient |
| Cstat | C Statistic (Area Under the ROC Curve) |
| Date Functions | Basic Date Functions |
| DrawBand | Draw Confidence Band |
| DivCoefMax | Maximal value of Rao's diversity coefficient also called quadratic entropy |
| DoCall | Fast Alternative To The Internal do.call |
| DoBy | Evaluates a Function Groupwise |
| DigitSum | Calculate Digit Sum |
| Contrasts | Pairwise Contrasts |
| Divisors | Calculate Divisors |
| CCC | Concordance Correlation Coefficient |
| BootCI | Simple Bootstrap Confidence Intervals |
| Fibonacci | Fibonacci Numbers |
| BoxCox | Box Cox Transformation |
| DunnTest | Dunn's Test of Multiple Comparisons |
| ColToHex | Convert a Color or a RGB-color Into Hex String |
| DrawBezier | Draw a Bezier Curve |
| ConvUnit | Unit Conversion and Metrix Prefixes |
| Canvas | Canvas for Geometric Plotting |
| DrawCircle | Draw a Circle |
| DrawEllipse | Draw an Ellipse |
| FctArgs | Retrieve a Function's Arguments |
| FindColor | Get Color on a Defined Color Range |
| ExtrVal | Distributions of Maxima and Minima |
| Frac | Fractional Part and Maximal Digits of a Numeric Value |
| Format | Format Numbers and Dates |
| GTest | G-Test for Count Data |
| Factorize | Prime Factorization of Integers |
| DunnettTest | Dunnett's Test for Comparing Several Treatments With a Control |
| CronbachAlpha | Cronbach's Coefficient Alpha |
| FindCorr | Determine Highly Correlated Variables |
| GenExtrVal | The Generalized Extreme Value Distribution |
| Cross | Vector Cross Product |
| GetNewXL | Create a New Excel Instance |
| ColToRgb | Color to RGB Conversion |
| Eps | Greenhouse-Geisser And Huynh-Feldt Epsilons |
| CountWorkDays | Count Work Days Between Two Dates |
| CollapseTable | Collapse Levels of a Table |
| Gumbel | The Gumbel Distribution |
| DrawRegPolygon | Draw Regular Polygon(s) |
| Gmean | Geometric Mean and Standard Deviation |
| Gompertz | The Gompertz distribution |
| ColToHsv | R Color to HSV Conversion |
| GetNewWrd | Create a New Word Instance |
| Dummy | Generate Dummy Codes for a Factor |
| GenPareto | The Generalized Pareto Distribution |
| CourseData | Get HWZ Datasets |
| GeomSn | Geometric Series |
| ErrBars | Add Error Bars to an Existing Plot |
| DescToolsOptions | DescTools Options |
| Hmean | Harmonic Mean and Its Confidence Interval |
| JarqueBeraTest | (Robust) Jarque Bera Test |
| HotellingsT2Test | Hotelling's T2 Test |
| Herfindahl | Concentration Measures |
| HuberM | Safe (generalized) Huber M-Estimator of Location |
| ICC | Intraclass Correlations (ICC1, ICC2, ICC3 From Shrout and Fleiss) |
| JonckheereTerpstraTest | Exact Version of Jonckheere-Terpstra Test |
| HosmerLemeshowTest | Hosmer-Lemeshow Goodness of Fit Tests |
| HmsToSec | Convert h:m:s To/From Seconds |
| DescTools-package | Tools for Descriptive Statistics and Exploratory Data Analysis |
| IsDichotomous | Test If a Variable Contains Only Two Unique Values |
| IsEuclid | Is a Distance Matrix Euclidean? |
| Freq2D | Bivariate (Two-Dimensional) Frequency Distribution |
| GCD, LCM | Greatest Common Divisor and Least Common Multiple |
| Lc | Lorenz Curve |
| LehmacherTest | Lehmacher's Test for Marginal Homogenity |
| MoveAvg | Moving Average |
| LOF | Local Outlier Factor |
| Median | (Weighted) Median Value |
| IsValidHwnd | Check Windows Pointer |
| LillieTest | Lilliefors (Kolmogorov-Smirnov) Test for Normality |
| LeveneTest | Levene's Test for Homogeneity of Variance |
| Label, Unit | Label, Unit Attribute of an Object |
| OddsRatio | Odds Ratio Estimation and Confidence Intervals |
| GoodmanKruskalTau | Goodman Kruskal's Tau |
| FisherZ | Fisher-Transformation for Correlation to z-Score |
| MultMerge | Merge Multiple Data Frames |
| Entropy | Shannon Entropy and Mutual Information |
| MedianCI | Confidence Interval for the Median |
| GoodmanKruskalGamma | Goodman Kruskal's Gamma |
| ExpFreq | Expected Frequencies |
| FixToTable | Convert a Text to a Table |
| DurbinWatsonTest | Durbin-Watson Test |
| ParseFormula | Parse a Formula and Create a Model Frame |
| EtaSq | Effect Size Calculations for ANOVAs |
| IsDate | Check If an Object Is of Type Date |
| Frechet | The Frechet Distribution |
| Order | Distributions of Order Statistics |
| InDots | Is a Specific Argument in the Dots-Arguments? |
| ParseSASDatalines | Parse a SAS Dataline Command |
| KendallTauA | Kendall's $\tau_{a}$ |
| KappaM | Kappa for m Raters |
| Keywords | List Keywords For R Manual Pages |
| Lambda | Goodman Kruskal Lambda |
| GeomTrans | Geometric Transformations |
| LinScale | Linear Scaling |
| GetCurrWrd | Get a Handle to a Running Word Instance |
| Freq | Frequency Table for a Single Variable |
| PlotBubble | Draw a Bubble Plot |
| HodgesLehmann | Hodges-Lehmann Estimator of Location |
| Some numeric checks | Check a Vector For Being Numeric, Zero Or a Whole Number |
| LineToUser | Convert Line Coordinates To User Coordinates |
| MeanDiffCI | Confidence Interval For Difference of Means |
| PasswordDlg | Password Dialog |
| Gini | Gini Coefficient |
| Extremes | Kth Smallest/Largest Values |
| NPV | Short Selection of Financial Mathematical Functions |
| MeanSE | Standard Error of Mean |
| Measures of Accuracy | Measures of Accuracy |
| ImputeKnn | Fill in NA values with the values of the nearest neighbours |
| PMT | Periodic Payment of an Annuity. |
| PercTable | Percentage Table |
| PlotCandlestick | Plot Candlestick Chart |
| IdentifyA | Identify Points in Plot Lying Within a Rectangle or Polygon |
| Mar and Mgp | Set Plot Margins and Distances |
| MHChisqTest | Mantel-Haenszel Chi-Square Test |
| PearsonTest | Pearson Chi-Square Test for Normality |
| MultinomCI | Confidence Intervals for Multinomial Proportions |
| IQRw | The (weighted) Interquartile Range |
| HoeffD | Matrix of Hoeffding's D Statistics |
| IsPrime | IsPrime Property |
| KrippAlpha | Krippendorff's Alpha Reliability Coefficient |
| LOCF | Last Observation Carried Forward |
| GiniSimpson | Gini-Simpson Coefficient and Hunter-Gaston Index |
| List Variety Of Objects | List Objects, Functions Or Data in a Package |
| PlotECDF | Empirical Cumulative Distribution Function |
| Permn | Number and Samples for Permutations or Combinations of a Set |
| PageTest | Exact Page Test for Ordered Alternatives |
| HexToRgb | Convert a Hexstring Color to a Matrix With Three Red/Green/Blue Rows |
| PlotCorr | Plot a Correlation Matrix |
| Mean | (Weighted) Arithmetic Mean |
| PairApply | Pairwise Calculations |
| MAD | Median Absolute Deviation |
| KendallTauB | Kendall's $\tau_{b}$ |
| MixColor | Compute the Convex Combination of Two Colors |
| PlotArea | Create an Area Plot |
| IsOdd | Checks If An Integer Is Even Or Odd |
| HexToCol | Identify Closest Match to a Color Given by a Hexadecimal String |
| ORToRelRisk | Transform Odds Ratio to Relative Risk |
| DescTools Palettes | Some Custom Palettes |
| Midx | Find the Midpoints of a Numeric Vector |
| PlotPolar | Plot Values on a Circular Grid |
| PlotFdist | Frequency Distribution Plot |
| PoissonCI | Poisson Confidence Interval |
| NemenyiTest | Nemenyi Test |
| PlotFun | Plot a Function |
| PlotBag | Bivariate Boxplot |
| PlotMultiDens | Plot Multiple Density Curves |
| PlotPairs | Extended Scatterplot Matrices |
| KendallW | Kendall's Coefficient of Concordance W |
| PlotACF | Combined Plot of a Time Series and Its ACF and PACF |
| PlotLinesA | Plot Lines |
| Primes | Find All Primes Less Than n |
| PlotPyramid | Draw a Back To Back Pyramid Plot |
| PlotFaces | Chernoff Faces |
| RelRisk | Relative Risk |
| PolarGrid | Plot a Grid in Polar Coordinates |
| PlotLog | Logarithmic Plot |
| Phrase | Phrasing Results of t-Test |
| PseudoR2 | Pseudo R2 Statistics |
| PlotDot | Cleveland's Dot Plots |
| LogSt | Started Logarithmic Transformation and Its Inverse |
| Recycle | Recyle a List of Elements |
| PlotQQ | QQ-Plot for Any Distribution |
| ScheffeTest | Scheffe Test for Pairwise and Otherwise Comparisons |
| Str | Compactly Display the Structure of any R Object |
| StrAbbr | String Abbreviation |
| SendOutlookMail | Send a Mail Using Outlook as Mail Client |
| PlotViolin | Plot Violins Instead of Boxplots |
| RevWeibull | The Reverse Weibull Distribution |
| StrPos | Find Position of First Occurrence Of a String |
| PostHocTest | Post-Hoc Tests |
| TTestA | Student's t-Test Based on Sample Statistics |
| PowerPoint Interface | Add Slides, Insert Texts and Plots to PowerPoint |
| PlotMonth | Cycle Plot for Seasonal Effects of an Univariate Time Series |
| ColToOpaque | Equivalent Opaque Color for Transparent Color |
| Logit | Generalized Logit and Inverse Logit Function |
| MeanCI | Confidence Interval for the Mean |
| TukeyBiweight | Calculate Tukey's Biweight Robust Mean |
| PlotWeb | Plot a Web of Connected Points |
| RSessionAlive | How Long Has the RSession Been Running? |
| StrRev | Reverse a String |
| Quot | Lagged Quotients |
| TextContrastColor | Choose Textcolor Depending on Background Color |
| pRevGumbel | "Reverse" Gumbel Distribution Functions |
| MeanAD | Mean Absolute Deviation From a Center Point |
| RunsTest | Runs Test for Randomness |
| PlotMosaic | Mosaic Plots |
| Depreciation | Several Methods of Depreciation of an Asset |
| TwoGroups | Describe a Variable by a Factor with Two Levels |
| Sample | Random Samples and Permutations |
| PlotTernary | Ternary or Triangular Plots |
| Rev | Reverse Elements of a Vector, a Matrix, a Table, an Array or a Data.frame |
| WoolfTest | Woolf Test For Homogeneity in 2x2xk Tables |
| Mode | Mode, Most Frequent Value(s) |
| RoundTo | Round to Multiple |
| Some | Return Some Randomly Chosen Elements of an Object |
| SetAlpha | Add an Alpha Channel To a Color |
| Rotate | Rotate a Geometric Structure |
| ShapiroFranciaTest | Shapiro-Francia Test for Normality |
| Shade | Produce a Shaded Curve |
| SomersDelta | Somers' Delta |
| XLGetRange | Import Data Directly From Excel |
| WrdPlot | Insert Active Plot to Word |
| XLSaveAs | Save Excel File |
| DescTools Aliases | Some Aliases Set for Convenience |
| StrChop | Split a String into a Number of Sections of Defined Length |
| WrdSaveAs | Open and Save Word Documents |
| Range | (Robust) Range |
| Recode | Recode a Factor |
| SetNames | Set the Names in an Object |
| StrCountW | Count Words in a String |
| WrdBookmark | Return a Handle to a Word Bookmark Given as Name |
| PDFManual | Get PDF Manual of a Package From CRAN |
| Sort | Sort a Vector, a Matrix, a Table or a Data.frame |
| Outlier | Outlier |
| RomanToInt | Convert Roman Numerals to Integers |
| SortMixed | Sort Strings with Embedded Numbers Based on Their Numeric Order |
| SD | (Weighted) Standard Deviation |
| TMod | Comparison Table For Linear Models |
| Measures of Shape | Skewness and Kurtosis |
| SplitPath | Split Path In Drive, Path, Filename |
| SmoothSpline | Formula Interface For smooth.spline |
| StrTrim | Remove Leading/Trailing Whitespace From A String |
| SpreadOut | Spread Out a Vector of Numbers To a Minimum Interval |
| PlotCashFlow | Cash Flow Plot |
| MosesTest | Moses Test of Extreme Reactions |
| TextToTable | Converts String To a Table |
| TOne | Create Table One Describing Baseline Characteristics |
| StrTrunc | Truncate Strings and Add Ellipses If a String is Truncated. |
| SaveAs | Saves an R Object Under a Different Name |
| SiegelTukeyTest | Siegel-Tukey Test For Equality In Variability |
| PlotCirc | Plot Circular Plot |
| TheilU | Theil's U Index of Inequality |
| %nin% | Find Matching (or Non-Matching) Elements |
| ToWrdB | Send Objects to Word and Bookmark Them |
| SampleTwins | Sample Twins |
| VecRot | Vector Rotation (Shift Elements) |
| RobScale | Robust Scaling With Median and Mad |
| ToWrd | Send Objects to Word |
| SignTest | Sign Test |
| UncertCoef | Uncertainty Coefficient |
| VarTest | ChiSquare Test for One Variance and F Test to Compare Two Variances |
| WrdCaption | Insert Caption to Word |
| VonNeumannTest | Von Neumann's Successive Difference Test |
| StrAlign | String Alignment |
| StrLeft, StrRight | Returns the Left Or the Right Part Of a String |
| StrDist | Compute Distances Between Strings |
| WrdMergeCells | Merges Cells Of a Defined Word Table Range |
| StrCap | Capitalize the First Letter of a String |
| Vigenere | Vigenere Cypher |
| UnirootAll | Finds many (all) roots of one equation within an interval |
| StrExtract | Extract Part of a String |
| WrdCellRange | Return the Cell Range Of a Word Table |
| WrdInsertBookmark | Insert a Bookmark, Goto Bookmark and Update the Text of a Bookmark |
| StuartTauC | Stuart $\tau_{c}$ |
| ToWrdPlot | Send a Plot to Word and Bookmark it |
| Strata | Stratified Sampling |
| StrSpell | Spell a String Using the NATO Phonetic or the Morse Alphabet |
| ZTest | Z Test for Known Population Standard Deviation |
| ZeroIfNA | Replace NAs by 0 |
| PlotMarDens | Scatterplot With Marginal Densities |
| SysInfo | System Information |
| StrVal | Extract All Numeric Values From a String |
| PlotVenn | Plot a Venn Diagram |
| PlotMiss | Plot Missing Data |
| TitleRect | Plot Boxed Annotation |
| d.diamonds | Data diamonds |
| PtInPoly | Point in Polygon |
| Trim | Trim a Vector |
| VIF | Variance Inflation Factors |
| PlotTreemap | Create a Treemap |
| Quantile | Weighted Quantiles |
| %overlaps% | Determines If And How Extensively Two Date Ranges Overlap |
| d.periodic | Periodic Table of Elements |
| %c% | Concatenates Two Strings Without Any Separator |
| XLDateToPOSIXct | Convert Excel Dates to POSIXct |
| d.countries | ISO 3166-1 Country Codes |
| ToLong, ToWide | Reshape a Vector From Long to Wide Shape Or Vice Versa |
| WrdTableBorders | Draw Borders to a Word Table |
| Rename | Change Names of a Named Object |
| axTicks.POSIXct | Compute Axis Tickmark Locations (For POSIXct Axis) |
| Var | Variance |
| VarCI | Confidence Intervals for the Variance |
| Winsorize | Winsorize (Replace Extreme Values by Less Extreme Ones) |
| day.name | Build-in Constants Extension |
| VanWaerdenTest | van der Waerden Test |
| WithOptions | Execute Function with Temporary Options |
| RgbToCol | Find the Nearest Named R-Color to a Given RGB-Color |
| reorder.factor | Reorder the Levels of a Factor |
| WrdStyle | Get or Set the Style in Word |
| identify.formula | Identify Points In a Plot Using a Formula |
| as.matrix.xtabs | Convert xtabs To matrix |
| WrdParagraphFormat | Get or Set the Paragraph Format in Word |
| Zodiac | Calculate the Zodiac of a Date |
| Stamp | Date/Time/Directory Stamp the Current Plot |
| WrdPageBreak | Insert a Page Break |
| d.pizza | Data pizza |
| SplitAt | Split a Vector Into Several Pieces at Given Positions |
| WrdTable | Insert a Table in a Word Document |
| RndPairs | Create Pairs of Correlated Random Numbers |
| StripAttr | Remove Attributes from an Object |
| wdConst | Word VBA Constants |
| d.whisky | Classification of Scotch Single Malts |
| StrPad | Pad a String With Justification |
| SpearmanRho | Spearman Rank Correlation |
| StuartMaxwellTest | Stuart-Maxwell Marginal Homogeneity Test |
| StrIsNumeric | Does a String Contain Only Numeric Data |
| Untable | Recover Original Data From Contingency Table |
| StdCoef | Standardized Model Coefficients |
| matpow | Matrix Power |
| WrdFormatCells | Format Cells Of a Word Table |
| XLView | Use MS-Excel as Viewer for a Data.Frame |
| lines.loess | Add a Loess or a Spline Smoother |
| lines.lm | Add a Linear Regression Line |
| Unwhich | Inverse Which |
| YuenTTest | Yuen t-Test For Trimmed Means |
| WrdFont | Get or Set the Font in Word |
| %like% | Like Operator |
| power.chisq.test | Power Calculations for ChiSquared Tests |
| split.formula | Formula Interface for Split |