A newer version (2.1.1) of this package is available.

collapse (version 1.8.6)

Advanced and Fast Data Transformation

Description

A C/C++ based package for advanced data transformation and statistical computing in R that is extremely fast, class-agnostic, and programmer friendly through a flexible and parsimonious syntax. It is well integrated with base R, 'dplyr' / (grouped) 'tibble', 'data.table', 'sf', 'plm' (panel-series and data frames), and non-destructively handles other matrix or data frame based classes (like 'ts', 'xts' / 'zoo', 'tsibble', ...).

Key Features:

(1) Advanced statistical programming: A full set of fast statistical functions supporting grouped and weighted computations on vectors, matrices and data frames. Fast and programmable grouping, ordering, unique values/rows, factor generation and interactions. Fast and flexible functions for data manipulation, data object conversions, and memory efficient R programming.

(2) Advanced aggregation: Fast and easy multi-data-type, multi-function, weighted and parallelized data aggregation.

(3) Advanced transformations: Fast row/column arithmetic, (grouped) replacing and sweeping out of statistics (by reference), (grouped, weighted) scaling/standardizing, (higher-dimensional) between (averaging) and (quasi-)within (demeaning) transformations, linear prediction, model fitting and testing exclusion restrictions.

(4) Advanced time-computations: Fast and flexible indexed time series and panel data classes. Fast (sequences of) lags/leads, and (lagged/leaded, iterated, quasi-, log-) differences and (compounded) growth rates on (irregular) time series and panels. Multivariate auto-, partial- and cross-correlation functions for panel data. Panel data to (ts-)array conversions.

(5) List processing: Recursive list search, splitting, extraction/subsetting, apply, and generalized row-binding / unlisting to data frame.

(6) Advanced data exploration: Fast (grouped, weighted, panel-decomposed) summary statistics and descriptive tools.
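To make a few of the features above concrete, here is a minimal sketch of grouped and weighted statistics, aggregation, within-group centering, and lags. It assumes the package is already installed and uses the base R `mtcars` dataset purely for illustration; the choice of `wt` as a weight variable is an arbitrary assumption, not something the package prescribes.

```r
# Assumes the package is installed: install.packages("collapse")
library(collapse)

# Grouped mean of mpg by number of cylinders (mtcars ships with base R)
grouped_mean <- fmean(mtcars$mpg, g = mtcars$cyl)

# Grouped and weighted mean; using car weight (wt) as weights is an
# illustrative assumption only
weighted_mean <- fmean(mtcars$mpg, g = mtcars$cyl, w = mtcars$wt)

# Aggregate two columns by group with collap()
agg <- collap(mtcars, mpg + hp ~ cyl, FUN = fmean)

# Within-group centering (demeaning) of mpg
mpg_centered <- fwithin(mtcars$mpg, g = mtcars$cyl)

# First lag of a plain numeric vector
lagged <- flag(c(1, 2, 3, 4), n = 1)

print(grouped_mean)
print(agg)
```

The same functions also have methods for matrices and data frames, so the vector calls shown here are only the simplest case.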

Install

install.packages('collapse')

Monthly Downloads

37,328

Version

1.8.6

License

GPL (>= 2) | file LICENSE

Maintainer

Sebastian Krantz

Last Published

June 14th, 2022

Functions in collapse (1.8.6)

GGDC10S: Groningen Growth and Development Centre 10-Sector Database
TRA: Transform Data by (Grouped) Replacing or Sweeping out Statistics
across: Apply Functions Across Multiple Columns
GRP: Fast Grouping / collapse Grouping Objects
collap: Advanced Data Aggregation
BY: Split-Apply-Combine Computing
collapse-options: collapse Package Options
arithmetic: Fast Row/Column Arithmetic for Matrix-Like Objects
collapse-documentation: Collapse Documentation & Overview
collapse-package: Advanced and Fast Data Transformation
dapply: Data Apply
get_elem: Find and Extract / Subset List Elements
fFtest: Fast (Weighted) F-test for Linear Models (with Factors)
fast-data-manipulation: Fast Data Manipulation
fast-grouping-ordering: Fast Grouping and Ordering
collapse-renamed: Renamed Functions
colorder: Fast Reordering of Data Frame Columns
fdroplevels: Fast Removal of Unused Factor Levels
descr: Detailed Statistical Description of Data Frame
efficient-programming: Small Functions to Make R Programming More Efficient
fast-statistical-functions: Fast (Grouped, Weighted) Statistical Functions for Matrix-Like Objects
fgrowth: Fast Growth Rates for Time Series and Panel Data
flag: Fast Lags and Leads for Time Series and Panel Data
fbetween-fwithin: Fast Between (Averaging) and (Quasi-)Within (Centering) Transformations
flm: Fast (Weighted) Linear Model Fitting
frename: Fast Renaming and Relabelling Objects
fmin-fmax: Fast (Grouped) Maxima and Minima for Matrix-Like Objects
fscale: Fast (Grouped, Weighted) Scaling and Centering of Matrix-like Objects
fhdbetween-fhdwithin: Higher-Dimensional Centering and Linear Prediction
fmode: Fast (Grouped, Weighted) Statistical Mode for Matrix-Like Objects
data-transformations: Data Transformations
fcumsum: Fast (Grouped, Ordered) Cumulative Sum for Matrix-Like Objects
fdiff: Fast (Quasi-, Log-) Differences for Time Series and Panel Data
fndistinct: Fast (Grouped) Distinct Value Count for Matrix-Like Objects
ffirst-flast: Fast (Grouped) First and Last Value for Matrix-Like Objects
fnth: Fast (Grouped, Weighted) N'th Element/Quantile for Matrix-Like Objects
fprod: Fast (Grouped, Weighted) Product for Matrix-Like Objects
group: Fast Hash-Based Grouping
groupid: Generate Run-Length Type Group-Id
list-processing: List Processing
fvar-fsd: Fast (Grouped, Weighted) Variance and Standard Deviation for Matrix-Like Objects
fnobs: Fast (Grouped) Observation Count for Matrix-Like Objects
funique: Fast Unique Elements / Rows
ldepth: Determine the Depth / Level of Nesting of a List
seqid: Generate Group-Id from Integer Sequences
fsubset: Fast Subsetting Matrix-Like Objects
qF-qG-finteraction: Fast Factor Generation, Interactions and Vector Grouping
psmat: Matrix / Array from Panel Series
small-helpers: Small (Helper) Functions
fsum: Fast (Grouped, Weighted) Sum for Matrix-Like Objects
pwcor-pwcov-pwnobs: (Pairwise, Weighted) Correlations, Covariances and Observation Counts
unlist2d: Recursive Row-Binding / Unlisting in 2D - to Data Frame
varying: Fast Check of Variation in Data
indexing: Fast Indexed Time Series and Panels
radixorder: Fast Radix-Based Ordering
qsu: Fast (Grouped, Weighted) Summary Statistics for Cross-Sectional and Panel Data
is_unlistable: Unlistable Lists
fmedian: Fast (Grouped, Weighted) Median Value for Matrix-Like Objects
fmean: Fast (Grouped, Weighted) Mean for Matrix-Like Objects
recode-replace: Recode and Replace Values in Matrix-Like Objects
quick-conversion: Quick Data Conversion
qtab: Fast (Weighted) Cross Tabulation
roworder: Fast Reordering of Data Frame Rows
rapply2d: Recursively Apply a Function to a List of Data Objects
fsummarise: Fast Summarise
wlddev: World Development Dataset
pad: Pad Matrix-Like Objects with a Value
ftransform: Fast Transform and Compute Columns on a Data Frame
psacf: Auto- and Cross- Covariance and Correlation Function Estimation for Panel Series
summary-statistics: Summary Statistics
t_list: Efficient List Transpose
rsplit: Recursive Splitting
fselect-get_vars-add_vars: Fast Select, Replace or Add Data Frame Columns
time-series-panel-series: Time Series and Panel Series
timeid: Generate Integer-Id From Time/Date Sequences