sjmisc - Data and Variable Transformation Functions

Data preparation is a common task in research and usually takes up most of the time in the analytical process. Packages for data preparation have recently been released as part of the tidyverse, focussing on the transformation of data sets. Packages with a special focus on the transformation of variables, which fit into the workflow and design philosophy of the tidyverse, are missing.

sjmisc tries to fill this gap. Basically, this package complements the dplyr package in that sjmisc takes over data transformation tasks on variables, like recoding, dichotomizing or grouping variables, setting and replacing missing values, etc. A distinctive feature of sjmisc is the support for labelled data, which is especially useful for users who often work with data sets from other statistical software packages like SPSS or Stata.
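As a rough illustration of these variable-level tasks, the following sketch recodes, dichotomizes and fills missing values in plain numeric vectors. The vectors and their values are made up for this example; the argument names (rec, dich.by, as.num, value) follow the sjmisc help pages, which remain the authoritative reference.

```r
library(sjmisc)

# Toy vectors, made up for illustration
x <- c(1, 2, 3, 4, 1, 2, 3, 4)
y <- c(1, NA, 3)

# Recode: collapse 1 and 2 into 1, and 3 and 4 into 2
x_rec <- rec(x, rec = "1,2=1; 3,4=2")

# Dichotomize at a fixed cut point: values up to 2 become 0, larger values 1
x_dicho <- dicho(x, dich.by = 2, as.num = TRUE)

# Replace missing values with a specific value
y_filled <- replace_na(y, value = 0)
```

Each call takes a vector and returns a vector of the same length, so the results can be assigned directly to new variables.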

The functions of sjmisc are designed to work together seamlessly with other packages from the tidyverse, like dplyr. For instance, you can use the functions from sjmisc both within a pipe-workflow to manipulate data frames and inside mutate() to create new variables. See vignette("design_philosophy", "sjmisc") for more details.
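The two usage styles can be sketched as follows. This is a minimal example on a made-up data frame (the column names age and score, and the new name age_high, are hypothetical), assuming the default behaviour in which a transformed column is appended to the data frame with the suffix "_d".

```r
library(sjmisc)
library(dplyr)

# Hypothetical toy data; column names are made up for illustration
d <- data.frame(age = c(20, 35, 50, 65), score = c(1, 2, 3, 4))

# Pipe workflow: a data frame goes in and a data frame comes out.
# The dichotomized column is appended (here assumed to be named "age_d").
d_piped <- d %>% dicho(age, dich.by = 40)

# Inside mutate(): a vector goes in and a vector comes out,
# so the result can be assigned to a column name of your choice.
d_mutated <- d %>%
  mutate(age_high = dicho(age, dich.by = 40, as.num = TRUE))
```

In the piped form the package chooses the new column name; in the mutate() form the name is set explicitly, which can be preferable when several transformations of the same variable are needed.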

Contributing to the package

Please follow this guide if you would like to contribute to this package.

Installation

Latest development build

To install the latest development snapshot (see latest changes below), type the following commands into the R console:

library(devtools)
devtools::install_github("strengejacke/sjmisc")

Official, stable release

To install the latest stable release from CRAN, type the following command into the R console:

install.packages("sjmisc")

References, documentation and examples

A cheatsheet can be downloaded from here (PDF) or from the RStudio cheatsheet collection.

For more examples, see the package vignettes (browseVignettes("sjmisc")).

Please visit https://strengejacke.github.io/sjmisc/ for documentation and vignettes.

Citation

If you want or have to cite my package, please cite it as follows (see also citation('sjmisc')):

Lüdecke D (2018). sjmisc: Data and Variable Transformation Functions. Journal of Open Source Software, 3(26), 754. doi: 10.21105/joss.00754

Monthly Downloads: 34,037
Version: 2.8.3
License: GPL-3
Maintainer: Daniel Lüdecke
Last Published: January 10th, 2020

Functions in sjmisc (2.8.3)

big_mark - Format numbers
all_na - Check if vector only has NA values
de_mean - Compute group-meaned and de-meaned variables
dicho - Dichotomize variables
efc - Sample dataset from the EUROFAMCARE project
count_na - Frequency table of tagged NA values
add_variables - Add variables or cases to data frames
add_columns - Add or replace data frame columns
add_rows - Merge labelled data frames
descr - Basic descriptive statistics
is_crossed - Check whether two factors are crossed or nested
frq - Frequency table of labelled variables
flat_table - Flat (proportional) tables
find_var - Find variable by name or label
empty_cols - Return or remove variables or observations that are completely missing
is_empty - Check whether string, list or vector is empty
group_var - Recode numeric variables into equal-ranged groups
has_na - Check if variables or cases have missing / infinite values
is_num_fac - Check whether a factor has numeric levels only
merge_imputations - Merges multiple imputed data frames into a single data frame
rec_pattern - Create recode pattern for 'rec' function
rec - Recode variables
recode_to - Recode variable categories into new values
reexports - Objects exported from other packages
group_str - Group near elements of string vectors
is_float - Check if a variable is of (non-integer) double type or a whole number
is_even - Check whether value is even or odd
%nin% - Value matching
move_columns - Move columns to other positions in a data frame
rotate_df - Rotate a data frame
row_count - Count row or column indices
numeric_to_factor - Convert numeric vectors into factors with associated value labels
reshape_longer - Reshape data into long format
replace_na - Replace NA with specific values
remove_var - Remove variables from a data frame
ref_lvl - Change reference level of (numeric) factors
shorten_string - Shorten character strings
typical_value - Return the typical value of a vector
split_var - Split numeric variables into smaller groups
set_na_if - Replace specific values in vector with NA
sjmisc-package - Data and Variable Transformation Functions
seq_col - Sequence generation for column or row counts of data frames
row_sums - Row sums and means for data frames
spread_coef - Spread model coefficients of list-variables into columns
var_rename - Rename variables
round_num - Round numeric variables in a data frame
std - Standardize and center variables
to_dummy - Split (categorical) vectors into dummy variables
to_factor - Convert variable into factor and keep value labels
str_contains - Check if string contains pattern
to_label - Convert variable into factor with associated value labels
tidy_values - Clean values of character vectors
to_long - Convert wide data to long format
to_character - Convert variable into character vector and replace values with associated value labels
zap_inf - Convert infinite or NaN values into regular NA
to_value - Convert factors to numeric variables
str_find - Find partial matching and close distance elements in strings
trim - Trim leading and trailing whitespaces from strings
str_start - Find start and end index of pattern in string
var_type - Determine variable type
word_wrap - Insert line breaks in long labels
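
To give a feel for some of the smaller helpers listed above, here is a short sketch that calls a handful of them on made-up inputs. It only uses functions named in the list; the input values are invented for illustration, and the help page of each function describes its full set of arguments.

```r
library(sjmisc)

# Check whether a value is even
even_check <- is_even(4)

# Negated value matching: which elements of the left side are NOT in the right side?
not_in <- c("a", "d") %nin% c("a", "b")

# Check whether a string is empty
empty_check <- is_empty("")

# Trim leading and trailing whitespace
trimmed <- trim("  some text  ")

# Check whether a string contains a pattern
has_world <- str_contains("Hello world", "world")

# Convert infinite and NaN values into regular NA
cleaned <- zap_inf(c(1, Inf, NaN, 4))
```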