statar v0.1.1


by Matthieu Gomez

Tools Inspired by Stata to Clean, Explore and Join Datasets

statar makes it easier to work with tabular datasets. It includes functions to clean and summarize variables, to join datasets with an SQL-like syntax, and to manipulate datasets with a panel structure (elapsed dates, leads and lags, adding rows for missing dates, filling in missing values). statar is based on the data.table package and is inspired by Stata.
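The panel operations mentioned above (adding rows for missing dates, and lagging with respect to a time variable rather than row position) can be illustrated with a minimal sketch. It is written in plain Python, not R, and does not use statar's API: the helper names `fill_gap_sketch` and `lag_along_time`, the `year` key, and the integer time unit are all assumptions made for illustration.

```python
def fill_gap_sketch(rows, time_key="year"):
    """Return rows sorted by time, adding a row of missing values
    for every period absent between the first and last period.

    Hypothetical helper: illustrates the idea only, not statar's fill_gap.
    """
    by_time = {row[time_key]: row for row in rows}
    other_columns = [k for k in rows[0] if k != time_key]
    filled = []
    for t in range(min(by_time), max(by_time) + 1):
        if t in by_time:
            filled.append(dict(by_time[t]))
        else:
            # Gap: keep the period, mark every other column as missing.
            filled.append({time_key: t, **{c: None for c in other_columns}})
    return filled


def lag_along_time(values, times, n=1):
    """Return, for each observation, the value observed n periods earlier.

    The lookup uses the time variable, so a gap in the dates yields None
    instead of silently returning the previous row.
    """
    lookup = dict(zip(times, values))
    return [lookup.get(t - n) for t in times]


panel = [{"year": 1989, "value": 10}, {"year": 1991, "value": 20}]
filled_panel = fill_gap_sketch(panel)
lagged = lag_along_time([10, 20, 30], [1989, 1991, 1992])
```

Here `lagged` is `[None, None, 20]`: the 1991 observation has no lag because 1990 is missing, which is the difference between a time-based lag and a purely positional one.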

Functions in statar

Name Description
tempname Create unique names within a list, a data.frame, or an environment
bin Bin a numeric vector and return integer codes for the binning (corresponds to Stata command xtile)
fill_gap Add rows corresponding to gaps in some variable
elapsed Elapsed dates (weekly, monthly, quarterly)
sum_up Gives summary statistics (corresponds to Stata command summarize)
tag Creates a vector of zeros except for one subscript
lead-lag Lead and lag
setmutate_each Version of mutate_each that (i) transforms data.table in place (ii) allows by, i condition (iii) when there is only one fun, creates new variables, except when replace = TRUE
winsorize Winsorize a numeric vector
statar A package for applied research
setkeep Keep only certain columns in place
join Join two data.tables together
floor_date Round date-times down
demean Demean a vector
graph Experimental function to graph a dataset
sample_mode Statistical mode
setmutate Version of mutate that (i) transforms data.table in place (ii) allows by, i condition
duplicates Returns duplicated rows
setna Fill NA in place based on non-missing observations
setcols Retain certain columns of a data.table in place (= Stata keep).
setdrop Drop certain columns in place
find_duplicates Returns duplicated rows
roll_lag Apply rolling functions with respect to a time variable
pastem String and expression interpolation
