collapse provides the following functions to efficiently summarize and examine data:
qsu, shorthand for quick-summary, is an extremely fast summary command inspired by the (xt)summarize command in the Stata statistical software. It computes a set of 7 statistics (nobs, mean, sd, min, max, skewness and kurtosis) using a numerically stable one-pass method. Statistics can be computed weighted, by groups, and also within and between entities (for multilevel / panel data).
descr computes a concise and detailed description of a data frame, including frequency tables for categorical variables and various statistics and quantiles for numeric variables. It is inspired by Hmisc::describe, but about 10x faster.
pwcor, pwcov and pwnobs compute (weighted) pairwise correlations, covariances and observation counts on matrices and data frames. Pairwise correlations and covariances can be computed together with observation counts and p-values, and returned as a 3D array (default) or a list of matrices. A major feature of pwcor and pwcov is the print method, which displays all of these statistics in a single correlation table.
varying efficiently checks for the presence of any variation in data, optionally within groups (such as panel identifiers).
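The functions above can be tried on data sets that ship with R and collapse. The following is a minimal sketch rather than a full tour: the choice of mtcars, iris and wlddev, the selected columns and the object names are illustrative assumptions, not part of the original documentation.

```r
library(collapse)

# qsu: summary statistics for a vector, optionally by groups.
# By default N, mean, SD, min and max are shown; higher = TRUE
# adds skewness and kurtosis.
s_mpg <- qsu(mtcars$mpg)
qsu(mtcars$mpg, g = mtcars$cyl)
qsu(mtcars$mpg, higher = TRUE)

# qsu on panel data: overall, between and within statistics,
# using the country code in wlddev as the panel identifier.
qsu(wlddev, pid = ~ iso3c, cols = c("PCGDP", "LIFEEX"))

# descr: detailed description of every column in a data frame.
descr(iris)

# pwcor: pairwise correlations, here with observation counts (N)
# and p-values (P) shown in a single table by the print method.
X <- mtcars[, c("mpg", "hp", "wt")]
r_plain <- pwcor(X)
pwcor(X, N = TRUE, P = TRUE)

# varying: does each column vary overall, and within cylinder groups?
v_all <- varying(mtcars)
v_cyl <- varying(mtcars, by = mtcars$cyl)
```

In the grouped call to varying, the cyl column itself is constant within each group, so it is the one column reported as not varying.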
| Function / S3 Generic | Methods | Description |
|---|---|---|
| qsu | default, matrix, data.frame, pseries, pdata.frame | Fast (grouped, weighted, panel-decomposed) summary statistics |
| descr | No methods, for data frames or lists of vectors | Detailed statistical description of data frame |
| pwcor | No methods, for matrices or data frames | Pairwise correlations |
| pwcov | No methods, for matrices or data frames | Pairwise covariances |
| pwnobs | No methods, for matrices or data frames | Pairwise observation counts |
| varying | default, matrix, data.frame, pseries, pdata.frame, grouped_df | Fast variation check |
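Because qsu and varying are S3 generics, the same call works on vectors, matrices and data frames. The short sketch below illustrates this for the default, matrix and data.frame methods only; the inputs and object names are illustrative assumptions, and the pseries, pdata.frame and grouped_df methods are left out because they require additional packages.

```r
library(collapse)

# Default method: a single atomic vector gives one logical value.
v_vec   <- varying(mtcars$mpg)   # the values differ
v_const <- varying(rep(1, 5))    # a constant vector

# Matrix method: one result per column, named by the column names.
m     <- as.matrix(mtcars)
v_mat <- varying(m)

# Data frame method: one result per variable.
v_df <- varying(mtcars)

# qsu dispatches in the same way on a matrix or a data frame.
qsu(m)
qsu(mtcars)
```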