## Tools for Descriptive Statistics

A collection of miscellaneous basic statistic functions and convenience wrappers for efficiently describing data. The author's intention was to create a toolbox that facilitates the (notoriously time-consuming) first descriptive tasks in data analysis: calculating descriptive statistics, drawing graphical summaries and reporting the results. The package furthermore contains functions to produce documents using MS Word (or PowerPoint) and functions to import data from Excel.

Many of the included functions can be found scattered in other packages and other sources, written partly by Titans of R. The reason for collecting them here was primarily to have them consolidated in ONE package instead of dozens (which themselves might depend on other packages that are not needed at all), and to provide a common and consistent interface as far as function and argument naming, NA handling, recycling rules etc. are concerned. Google style guides were used as naming rules (in the absence of convincing alternatives). The 'BigCamelCase' style was consequently applied to functions borrowed from contributed R packages as well.

# Tools for Descriptive Statistics and Exploratory Data Analysis

Feedback, feature requests, bug reports and other suggestions are welcome! Please report problems to the GitHub issue tracker (preferred), on Stack Overflow mentioning DescTools, or directly to the maintainer.

## Installation

You can install the released version of DescTools from CRAN with:

    install.packages("DescTools")


And the development version from GitHub with:

    if (!require("remotes")) install.packages("remotes")
    remotes::install_github("AndriSignorell/DescTools")


# Warning

This package is still under development. Although the code now seems quite stable, until version 1.0 is released you should be aware that everything in the package might be subject to change. Backward compatibility is not yet guaranteed: functions may be deleted or renamed, and new syntax may be inconsistent with earlier versions. With the release of version 1.0, the “deprecated-defunct process” will be put in place.
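Because backward compatibility is not guaranteed before version 1.0, a script may want to record which DescTools version it was written against. The following is a minimal sketch in base R, not something the package provides: `check_pkg_version` is a hypothetical helper, and the version string is a placeholder.

```r
# Hypothetical helper, not part of DescTools: TRUE only if `pkg` is installed
# in exactly the version given by `expected`.
check_pkg_version <- function(pkg, expected) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(FALSE)  # package is not installed at all
  }
  utils::packageVersion(pkg) == package_version(expected)
}

# "0.99.40" is a placeholder; use the version your script was developed with.
if (!check_pkg_version("DescTools", "0.99.40")) {
  message("DescTools is missing or differs from the expected version; ",
          "function names and arguments may have changed.")
}
```

An exact match is deliberately strict; a range check with `>=` and `<` on `packageVersion()` would be a reasonable alternative.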

# MS-Office

To make use of MS-Office features, you must have Office in one of its variants installed. All Wrd*, XL* and Pp* functions require the package RDCOMClient to be installed as well. Hence the use of these functions is restricted to Windows systems. RDCOMClient can be installed with:

    install.packages("RDCOMClient", repos="http://www.omegahat.net/R")


The omegahat repository does not benefit from the same update service as CRAN, so you may be forced to install a package compiled with an earlier version of R, which usually is not a problem. For example, for R 3.6.x or R 4.0, keep the one `url` line that matches your R version:

    # R 3.6.x: binary compiled for R 3.5.1
    url <- "http://www.omegahat.net/R/bin/windows/contrib/3.5.1/RDCOMClient_0.93-0.zip"
    # R 4.0: use this line instead of the one above
    # url <- "http://www.omegahat.net/R/bin/windows/contrib/4.0/RDCOMClient_0.94-0.zip"
    install.packages(url, repos = NULL, type = "binary")


RDCOMClient does not exist for Mac or Linux, sorry.
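Scripts that should also run on other platforms can check up front whether the Office functions are usable. The sketch below is only an illustration: `rdcom_binary_url` and `can_use_office` are hypothetical helpers, and the rule "R before 4.0 uses the 3.5.1 build, R 4.0 and later uses the 4.0 build" is an assumption extrapolated from the two URLs quoted above.

```r
# Hypothetical helper: choose between the two binary URLs quoted above.
# Assumption: R < 4.0 takes the 3.5.1 build, R >= 4.0 takes the 4.0 build.
rdcom_binary_url <- function(r_version = getRversion()) {
  v <- numeric_version(as.character(r_version))
  base <- "http://www.omegahat.net/R/bin/windows/contrib"
  if (v < "4.0.0") {
    file.path(base, "3.5.1", "RDCOMClient_0.93-0.zip")
  } else {
    file.path(base, "4.0", "RDCOMClient_0.94-0.zip")
  }
}

# Hypothetical helper: TRUE only on Windows with RDCOMClient installed.
can_use_office <- function() {
  .Platform$OS.type == "windows" &&
    requireNamespace("RDCOMClient", quietly = TRUE)
}

if (.Platform$OS.type == "windows" && !can_use_office()) {
  message("RDCOMClient is missing; a candidate binary is: ", rdcom_binary_url())
}
```

The sketch only reports a candidate URL and leaves the actual `install.packages()` call to the user, since a binary built for a different R version may or may not load.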

# Authors

Andri Signorell
Helsana Versicherungen AG, Health Sciences, Zurich

R is a community project. This can be seen from the fact that this package includes R source code and/or documentation previously published by various authors and contributors. Special thanks go to Beat Bruengger, Mathias Frueh and Daniel Wollschlaeger for their valuable contributions and testing. The good things come from all these guys; any problems are likely due to my tweaking. Thank you all!

Maintainer: Andri Signorell

# Examples

    library(DescTools)

    demo(describe, package = "DescTools")

    demo(plots, package = "DescTools")
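Beyond the demos, a first descriptive pass might look like the sketch below. It uses `Desc`, `Freq` and `MeanCI` on the bundled `d.pizza` data set; the column names `temperature` and `driver` are assumptions about that data set, and the whole block is skipped when DescTools is not installed.

```r
ci <- NULL  # stays NULL when DescTools is not available

if (requireNamespace("DescTools", quietly = TRUE)) {
  library(DescTools)

  # Column names are assumptions about the bundled d.pizza data set.
  temp <- d.pizza$temperature

  print(Desc(temp))            # descriptive summary of a numeric variable
  print(Freq(d.pizza$driver))  # frequency table for a single variable

  # Mean with confidence interval; na.rm = TRUE drops missing values first.
  ci <- MeanCI(temp, na.rm = TRUE)
  print(ci)
}
```

The explicit `print()` calls are needed because results are not auto-printed inside the `if` block.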


## Functions in DescTools

| Name | Description |
| --- | --- |
| AUC | Area Under the Curve |
| BinomRatioCI | Confidence Intervals for the Ratio of Binomial and Multinomial Proportions |
| CartToPol | Transform Cartesian to Polar/Spherical Coordinates and Vice Versa |
| ColToHsv | R Color to HSV Conversion |
| CatTable | Function to write a table |
| ColToRgb | Color to RGB Conversion |
| CombPairs | Get All Pairs Out of One or Two Sets of Elements |
| ColumnWrap | Column Wrap |
| CorPolychor | Polychoric Correlation |
| CorPart | Find the Correlations for a Set x of Variables With Set y Removed |
| Cross | Vector Cross Product |
| CrossN | n-dimensional Vector Cross Product |
| DivCoefMax | Maximal value of Rao's diversity coefficient also called quadratic entropy |
| DivCoef | Rao's Diversity Coefficient |
| GetCurrWrd | Get a Handle to a Running Word Instance |
| FindCorr | Determine Highly Correlated Variables |
| FisherZ | Fisher-Transformation for Correlation to z-Score |
| GetNewWrd | Create a New Word Instance |
| Arrow | Insert an Arrow Into a Plot |
| Abstract | Display Compact Abstract of a Data Frame |
| ColToHex | Convert a Color or a RGB-color Into Hex String |
| AddMonths | Add a Month to a Date |
| AscToChar | Convert ASCII Codes to Characters and Vice Versa |
| GoodmanKruskalTau | Goodman Kruskal's Tau |
| ColToGrey | Convert Colors to Grey/Grayscale |
| Asp | Get Aspect Ratio of the Current Plot |
| Assocs | Association Measures |
| BoxCoxLambda | Automatic Selection of Box Cox Transformation Parameter |
| BoxedText | Add Text in a Box to a Plot |
| ConDisPairs | Concordant and Discordant Pairs |
| ConvUnit | Unit Conversion and Metric Prefixes |
| BrierScore | Brier Score for Assessing Prediction Accuracy |
| BubbleLegend | Add a Legend to a Bubble Plot |
| CompleteColumns | Find Complete Columns |
| Cor | Covariance and Correlation (Matrices) |
| AllDuplicated | Index Vector of All Values Involved in Ties |
| Cstat | C Statistic (Area Under the ROC Curve) |
| Conf | Confusion Matrix And Associated Statistics |
| ConnLines | Add Connection Lines to a Barplot |
| AllIdentical | Test Multiple Objects for Exact Equality |
| CourseData | Get HWZ Datasets |
| Gumbel | The Gumbel Distribution |
| Benford | Benford's Distribution |
| AndersonDarlingTest | Anderson-Darling Test of Goodness-of-Fit |
| Association measures | Cramer's V, Pearson's Contingency Coefficient and Phi Coefficient Yule's Q and Y, Tschuprow's T |
| DrawArc | Draw Elliptic Arc(s) |
| Desc | Describe Data |
| DrawBand | Draw Confidence Band |
| DescTools-package | Tools for Descriptive Statistics and Exploratory Data Analysis |
| Append | Append Elements to Objects |
| BarText | Add the Value Labels to a Barplot |
| Dummy | Generate Dummy Codes for a Factor |
| Between, Outside | Operators To Check, If a Value Lies Within Or Outside a Given Range |
| DunnTest | Dunn's Test of Multiple Comparisons |
| Clockwise | Calculates Begin and End Angle From a List of Given Angles in Clockwise Mode |
| Closest | Find the Closest Value |
| BarnardTest | Barnard's Unconditional Test |
| IdentifyA | Identify Points in Plot Lying Within a Rectangle or Polygon |
| InDots | Is a Specific Argument in the Dots-Arguments? |
| DoCall | Fast Alternative To The Internal do.call |
| ExpFreq | Expected Frequencies |
| CutQ | Create a Factor Variable Using the Quantiles of a Continuous Variable |
| CochranQTest | Cochran's Q test |
| Dot | Scalar Product |
| ExtrVal | Distributions of Maxima and Minima |
| Entropy | Shannon Entropy and Mutual Information |
| CoefVar | Coefficient of Variation |
| DegToRad | Convert Degrees to Radians and Vice Versa |
| Frac | Fractional Part and Maximal Digits of a Numeric Value |
| BootCI | Simple Bootstrap Confidence Intervals |
| Eps | Greenhouse-Geisser And Huynh-Feldt Epsilons |
| Frechet | The Frechet Distribution |
| GenExtrVal | The Generalized Extreme Value Distribution |
| BoxCox | Box Cox Transformation |
| CCC | Concordance Correlation Coefficient |
| GiniSimpson | Gini-Simpson Coefficient and Hunter-Gaston Index |
| Gmean | Geometric Mean and Standard Deviation |
| GeomSn | Geometric Series |
| Coalesce | Return the First Element Not Being NA |
| Canvas | Canvas for Geometric Plotting |
| GenPareto | The Generalized Pareto Distribution |
| JonckheereTerpstraTest | Exact Version of Jonckheere-Terpstra Test |
| GeomTrans | Geometric Transformations |
| CochranArmitageTest | Cochran-Armitage Test for Trend |
| ColorLegend | Add a ColorLegend to a Plot |
| CramerVonMisesTest | Cramer-von Mises Test for Normality |
| CollapseTable | Collapse Levels of a Table |
| KappaM | Kappa for m Raters |
| AddMonthsYM | Add a Month to a Date in YearMonth Format |
| Agree | Raw Simple And Extended Percentage Agreement |
| CronbachAlpha | Cronbach's Coefficient Alpha |
| Herfindahl | Concentration Measures |
| HexToCol | Identify Closest Match to a Color Given by a Hexadecimal String |
| ICC | Intraclass Correlations (ICC1, ICC2, ICC3 From Shrout and Fleiss) |
| IQRw | The (weighted) Interquartile Range |
| DescToolsOptions | DescTools Options |
| DunnettTest | Dunnett's Test for Comparing Several Treatments With a Control |
| DenseRank | Dense Ranks and Percent Ranks |
| Datasets for Simulation | Datasets for Probabilistic Simulation |
| DigitSum | Calculate Digit Sum |
| DurbinWatsonTest | Durbin-Watson Test |
| Date Functions | Basic Date Functions |
| Atkinson | Atkinson Index - A Measure of Inequality. |
| ErrBars | Add Error Bars to an Existing Plot |
| AxisBreak | Place a Break Mark on an Axis |
| Freq | Frequency Table for a Single Variable |
| EtaSq | Effect Size Calculations for ANOVAs |
| GetNewXL | Create a New Excel Instance |
| Freq2D | Bivariate (Two-Dimensional) Frequency Distribution |
| BinTree | Binary Tree |
| Gini | Gini Coefficient |
| HmsToSec | Convert h:m:s To/From Seconds |
| BinomCI | Confidence Intervals for Binomial Proportions |
| BreuschGodfreyTest | Breusch-Godfrey Test |
| BreslowDayTest | Breslow-Day Test for Homogeneity of the Odds Ratios |
| HodgesLehmann | Hodges-Lehmann Estimator of Location |
| CohenD | Cohen's Effect Size |
| DrawBezier | Draw a Bezier Curve |
| ConoverTest | Conover's Test of Multiple Comparisons |
| CohenKappa | Cohen's Kappa and Weighted Kappa |
| Fibonacci | Fibonacci Numbers |
| DrawCircle | Draw a Circle |
| Contrasts | Pairwise Contrasts |
| FindColor | Get Color on a Defined Color Range |
| FixToTable | Convert a Text to a Table |
| Format | Format Numbers and Dates |
| CountCompCases | Count Complete Cases |
| Some numeric checks | Check a Vector For Being Numeric, Zero Or a Whole Number |
| JarqueBeraTest | (Robust) Jarque Bera Test |
| CountWorkDays | Count Work Days Between Two Dates |
| DoBy | Evaluates a Function Groupwise |
| Divisors | Calculate Divisors |
| IsEuclid | Is a Distance Matrix Euclidean? |
| IsOdd | Checks If An Integer Is Even Or Odd |
| LehmacherTest | Lehmacher's Test for Marginal Homogeneity |
| DrawEllipse | Draw an Ellipse |
| DrawRegPolygon | Draw Regular Polygon(s) |
| FctArgs | Retrieve a Function's Arguments |
| Factorize | Prime Factorization of Integers |
| Extremes | Kth Smallest/Largest Values |
| LeveneTest | Levene's Test for Homogeneity of Variance |
| Lc | Lorenz Curve |
| KendallW | Kendall's Coefficient of Concordance W |
| GCD, LCM | Greatest Common Divisor and Least Common Multiple |
| MixColor | Compute the Convex Combination of Two Colors |
| Keywords | List Keywords For R Manual Pages |
| GTest | G-Test for Count Data |
| MosesTest | Moses Test of Extreme Reactions |
| HexToRgb | Convert a Hexstring Color to a Matrix With Three Red/Green/Blue Rows |
| Mode | Mode, Most Frequent Value(s) |
| Hmean | Harmonic Mean and Its Confidence Interval |
| Gompertz | The Gompertz distribution |
| GoodmanKruskalGamma | Goodman Kruskal's Gamma |
| LOCF | Last Observation Carried Forward |
| MoveAvg | Moving Average |
| LOF | Local Outlier Factor |
| HoeffD | Matrix of Hoeffding's D Statistics |
| HotellingsT2Test | Hotelling's T2 Test |
| PlotBubble | Draw a Bubble Plot |
| ImputeKnn | Fill in NA values with the values of the nearest neighbours |
| HuberM | Safe (generalized) Huber M-Estimator of Location |
| PlotBag | Bivariate Boxplot |
| PtInPoly | Point in Polygon |
| PlotTreemap | Create a Treemap |
| PlotVenn | Plot a Venn Diagram |
| HosmerLemeshowTest | Hosmer-Lemeshow Goodness of Fit Tests |
| Logit | Generalized Logit and Inverse Logit Function |
| Quantile | Weighted Quantiles |
| MAD | Median Absolute Deviation |
| Measures of Accuracy | Measures of Accuracy |
| KrippAlpha | Krippendorff's Alpha Reliability Coefficient |
| List Variety Of Objects | List Objects, Functions Or Data in a Package |
| Rename | Change Names of a Named Object |
| IsDate | Check If an Object Is of Type Date |
| reorder.factor | Reorder the Levels of a Factor |
| NemenyiTest | Nemenyi Test |
| NPV | Short Selection of Financial Mathematical Functions |
| ParseSASDatalines | Parse a SAS Dataline Command |
| IsDichotomous | Test If a Variable Contains Only Two Unique Values |
| PasswordDlg | Password Dialog |
| PlotCandlestick | Plot Candlestick Chart |
| PlotCashFlow | Cash Flow Plot |
| MeanSE | Standard Error of Mean |
| PlotECDF | Empirical Cumulative Distribution Function |
| LineToUser | Convert Line Coordinates To User Coordinates |
| PlotFaces | Chernoff Faces |
| RgbToCmy | Conversion Between RGB and CMYK |
| KendallTauB | Kendall's $\tau_{b}$ |
| KendallTauA | Kendall's $\tau_{a}$ |
| StdCoef | Standardized Model Coefficients |
| RgbToCol | Find the Nearest Named R-Color to a Given RGB-Color |
| Stamp | Date/Time/Directory Stamp the Current Plot |
| PearsonTest | Pearson Chi-Square Test for Normality |
| LogSt | Started Logarithmic Transformation and Its Inverse |
| PlotACF | Combined Plot of a Time Series and Its ACF and PACF |
| PlotFdist | Frequency Distribution Plot |
| PercTable | Percentage Table |
| PlotArea | Create an Area Plot |
| PlotFun | Plot a Function |
| IsPrime | IsPrime Property |
| Median | (Weighted) Median Value |
| Primes | Find All Primes Less Than n |
| IsValidHwnd | Check Windows Pointer |
| StrLeft, StrRight | Returns the Left Or the Right Part Of a String |
| MedianCI | Confidence Interval for the Median |
| Order | Distributions of Order Statistics |
| Midx | Find the Midpoints of a Numeric Vector |
| Outlier | Outlier |
| PlotMonth | Cycle Plot for Seasonal Effects of an Univariate Time Series |
| StrSpell | Spell a String Using the NATO Phonetic or the Morse Alphabet |
| StrTrunc | Truncate Strings and Add Ellipses If a String is Truncated. |
| PseudoR2 | Pseudo R2 Statistics |
| Recycle | Recycle a List of Elements |
| RelRisk | Relative Risk |
| PostHocTest | Post-Hoc Tests |
| PlotMosaic | Mosaic Plots |
| PDFManual | Get PDF Manual of a Package From CRAN |
| StrVal | Extract All Numeric Values From a String |
| Rotate | Rotate a Geometric Structure |
| RomanToInt | Convert Roman Numerals to Integers |
| SetNames | Set the Names in an Object |
| PowerPoint Interface | Add Slides, Insert Texts and Plots to PowerPoint |
| Rev | Reverse Elements of a Vector, a Matrix, a Table, an Array or a Data.frame |
| PMT | Periodic Payment of an Annuity. |
| Permn | Number and Samples for Permutations or Combinations of a Set |
| PlotLinesA | Plot Lines |
| Shade | Produce a Shaded Curve |
| Phrase | Phrasing Results of t-Test |
| SmoothSpline | Formula Interface For smooth.spline |
| TTestA | Student's t-Test Based on Sample Statistics |
| pRevGumbel | "Reverse" Gumbel Distribution Functions |
| TOne | Create Table One Describing Baseline Characteristics |
| RndPairs | Create Pairs of Correlated Random Numbers |
| PlotLog | Logarithmic Plot |
| Depreciation | Several Methods of Depreciation of an Asset |
| Some | Return Some Randomly Chosen Elements of an Object |
| SD | (Weighted) Standard Deviation |
| RobScale | Robust Scaling With Median and Mad |
| SplitAt | Split a Vector Into Several Pieces at Given Positions |
| SplitPath | Split Path In Drive, Path, Filename |
| VIF | Variance Inflation Factors |
| Unwhich | Inverse Which |
| StrChop | Split a String into a Number of Sections of Defined Length |
| StripAttr | Remove Attributes from an Object |
| Strata | Stratified Sampling |
| StrCountW | Count Words in a String |
| VonNeumannTest | Von Neumann's Successive Difference Test |
| SomersDelta | Somers' Delta |
| Lambda | Goodman Kruskal Lambda |
| Sort | Sort a Vector, a Matrix, a Table or a Data.frame |
| PlotPairs | Extended Scatterplot Matrices |
| PlotMultiDens | Plot Multiple Density Curves |
| Label, Unit | Label, Unit Attribute of an Object |
| PoissonCI | Poisson Confidence Interval |
| Str | Compactly Display the Structure of any R Object |
| Winsorize | Winsorize (Replace Extreme Values by Less Extreme Ones) |
| TextToTable | Converts String To a Table |
| TwoGroups | Describe a Variable by a Factor with Two Levels |
| TextContrastColor | Choose Textcolor Depending on Background Color |
| StrSplit | Split the Elements of a Character Vector |
| StrAbbr | String Abbreviation |
| TheilU | Theil's U Index of Inequality |
| UncertCoef | Uncertainty Coefficient |
| StrTrim | Remove Leading/Trailing Whitespace From A String |
| RoundTo | Round to Multiple |
| PolarGrid | Plot a Grid in Polar Coordinates |
| RunsTest | Runs Test for Randomness |
| WrdPageBreak | Insert a Page Break |
| ScheffeTest | Scheffe Test for Pairwise and Otherwise Comparisons |
| SaveAs | Saves an R Object Under a Different Name |
| WrdParagraphFormat | Get or Set the Paragraph Format in Word |
| WithOptions | Execute Function with Temporary Options |
| MHChisqTest | Mantel-Haenszel Chi-Square Test |
| WoolfTest | Woolf Test For Homogeneity in 2x2xk Tables |
| ShapiroFranciaTest | Shapiro-Francia Test for Normality |
| XLGetRange | Import Data Directly From Excel |
| TitleRect | Plot Boxed Annotation |
| WrdCellRange | Return the Cell Range Of a Word Table |
| Mar and Mgp | Set Plot Margins and Distances |
| WrdFont | Get or Set the Font in Word |
| SiegelTukeyTest | Siegel-Tukey Test For Equality In Variability |
| Zodiac | Calculate the Zodiac of a Date |
| MultinomCI | Confidence Intervals for Multinomial Proportions |
| XLSaveAs | Save Excel File |
| VecRot | Vector Rotation (Shift Elements) |
| StrDist | Compute Distances Between Strings |
| MeanCI | Confidence Interval for the Mean |
| Vigenere | Vigenere Cypher |
| wdConst | Word VBA Constants |
| WrdPlot | Insert Active Plot to Word |
| WrdSaveAs | Open and Save Word Documents |
| SysInfo | System Information |
| ToLong, ToWide | Reshape a Vector From Long to Wide Shape Or Vice Versa |
| TMod | Comparison Table For Linear Models |
| LinScale | Linear Scaling |
| ToWrd | Send Objects to Word |
| LillieTest | Lilliefors (Kolmogorov-Smirnov) Test for Normality |
| %like% | Like Operator |
| Trim | Trim a Vector |
| Mean | (Weighted) Arithmetic Mean |
| as.matrix.xtabs | Convert xtabs To matrix |
| lines.lm | Add a Linear Regression Line |
| PlotQQ | QQ-Plot for Any Distribution |
| StrExtract | Extract Part of a String |
| MeanAD | Mean Absolute Deviation From a Center Point |
| TukeyBiweight | Calculate Tukey's Biweight Robust Mean |
| MeanDiffCI | Confidence Interval For Difference of Means |
| ZTest | Z Test for Known Population Standard Deviation |
| day.name | Built-in Constants Extension |
| ORToRelRisk | Transform Odds Ratio to Relative Risk |
| SplitToCol | Split Data Frame String Column Into Multiple Columns |
| ZeroIfNA | Replace NAs by 0 |
| identify.formula | Identify Points In a Plot Using a Formula |
| OddsRatio | Odds Ratio Estimation and Confidence Intervals |
| power.chisq.test | Power Calculations for ChiSquared Tests |
| WrdTableBorders | Draw Borders to a Word Table |
| DescTools Aliases | Some Aliases Set for Convenience |
| DescTools Palettes | Some Custom Palettes |
| PlotCorr | Plot a Correlation Matrix |
| ParseFormula | Parse a Formula and Create a Model Frame |
| PlotDot | Cleveland's Dot Plots |
| d.whisky | Classification of Scotch Single Malts |
| XLDateToPOSIXct | Convert Excel Dates to POSIXct |
| PlotViolin | Plot Violins Instead of Boxplots |
| d.pizza | Data pizza |
| PlotPolar | Plot Values on a Circular Grid |
| PlotPyramid | Draw a Back To Back Pyramid Plot |
| split.formula | Formula Interface for Split |
| %nin% | Find Matching (or Non-Matching) Elements |
| PlotWeb | Plot a Web of Connected Points |
| MultMerge | Merge Multiple Data Frames |
| PlotMarDens | Scatterplot With Marginal Densities |
| Range | (Robust) Range |
| WrdFormatCells | Format Cells Of a Word Table |
| Recode | Recode a Factor |
| SendOutlookMail | Send a Mail Using Outlook as Mail Client |
| SetAlpha | Add an Alpha Channel To a Color |
| PageTest | Exact Page Test for Ordered Alternatives |
| SortMixed | Sort Strings with Embedded Numbers Based on Their Numeric Order |
| Sample | Random Samples and Permutations |
| PlotConDens | Plot Conditional Densities |
| PairApply | Pairwise Calculations |
| PlotTernary | Ternary or Triangular Plots |
| PlotCirc | Plot Circular Plot |
| SpearmanRho | Spearman Rank Correlation |
| RevWeibull | The Reverse Weibull Distribution |
| SignTest | Sign Test |
| PlotMiss | Plot Missing Data |
| StuartTauC | Stuart $\tau_{c}$ |
| StrAlign | String Alignment |
| Quot | Lagged Quotients |
| ToWrdB | Send Objects to Word and Bookmark Them |
| StrIsNumeric | Does a String Contain Only Numeric Data |
| ToWrdPlot | Send a Plot to Word and Bookmark it |
| UnirootAll | Finds many (all) roots of one equation within an interval |
| StrCap | Capitalize the First Letter of a String |
| StrPad | Pad a String With Justification |
| Untable | Recover Original Data From Contingency Table |
| RSessionAlive | How Long Has the RSession Been Running? |
| SpreadOut | Spread Out a Vector of Numbers To a Minimum Interval |
| WrdBookmark | Some Functions to Handle MS-Word Bookmarks |
| ColToOpaque | Equivalent Opaque Color for Transparent Color |
| VarCI | Confidence Intervals for the Variance |
| WrdCaption | Insert Caption to Word |
| SampleTwins | Sample Twins |
| Measures of Shape | Skewness and Kurtosis |
| WrdTable | Insert a Table in a Word Document |
| WrdStyle | Get or Set the Style in Word |
| XLView | Use MS-Excel as Viewer for a Data.Frame |
| StrPos | Find Position of First Occurrence Of a String |
| YuenTTest | Yuen t-Test For Trimmed Means |
| axTicks.POSIXct | Compute Axis Tickmark Locations (For POSIXct Axis) |
| d.countries | ISO 3166-1 Country Codes |
| StrRev | Reverse a String |
| StuartMaxwellTest | Stuart-Maxwell Marginal Homogeneity Test |
| VanWaerdenTest | van der Waerden Test |
| Var | Variance |
| %overlaps% | Determines If And How Extensively Two Date Ranges Overlap |
| d.diamonds | Data diamonds |
| d.periodic | Periodic Table of Elements |
| VarTest | ChiSquare Test for One Variance and F Test to Compare Two Variances |
| WrdMergeCells | Merges Cells Of a Defined Word Table Range |
| %c% | Concatenates Two Strings Without Any Separator |
| lines.loess | Add a Loess or a Spline Smoother |
| matpow | Matrix Power |
| Abind | Combine Multidimensional Arrays |
| BartelsRankTest | Bartels Rank Test of Randomness |
| Base Conversions | Converts Numbers From Binmode, Octmode or Hexmode to Decimal and Vice Versa |
| BinomDiffCI | Confidence Interval for a Difference of Binomials |