
fastDummies (version 0.1.2)

dummy_cols: Fast creation of dummy variables

Description

Quickly creates dummy (binary indicator) variables from the character and factor columns of a data.table or data.frame.

Usage

dummy_cols(dataset, select_columns = NULL, ignore_columns = NULL,
  remove_original = TRUE, dummy_columns_only = FALSE,
  remove_first_dummy = FALSE, conditional_columns = NULL,
  return_type = "data.table")

Arguments

dataset

data.table or data.frame

select_columns

Vector of column names that you want to create dummy variables from. Default uses all character or factor columns.

ignore_columns

Vector of column names to ignore. Default ignores all numeric columns.

remove_original

Removes the columns used to make dummy variables. Columns that are not used to make dummy variables are not affected.

dummy_columns_only

Removes all columns that didn't create dummy columns (i.e. numeric columns).

remove_first_dummy

Removes the first dummy of every variable so that only n-1 dummies remain.

conditional_columns

Select column(s) to multiply the other created dummy columns by. Useful for getting subcategories of data. For example, if the conditional column is gender and the other columns are crimes, this creates columns showing the number of each crime for each gender.

return_type

Type of data you want back. Default is data.table (better for use with large data). Other options are data.frame or matrix.
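
The behaviour described for select_columns, ignore_columns and remove_original can be pictured with a small base R sketch. This is not the package's implementation: the data frame is hypothetical (it is not dummies_example), and the `<column>_<value>` naming of the new columns is an assumption made for illustration.

```r
# Hypothetical data; not the package's dummies_example dataset.
df <- data.frame(
  SEX  = c("male", "female", "female"),
  RACE = factor(c("a", "b", "a")),
  AGE  = c(31, 45, 27),
  stringsAsFactors = FALSE
)

# Pick character and factor columns, mirroring the documented default
# of select_columns; the numeric AGE column is left alone.
is_categorical <- vapply(
  df,
  function(x) is.character(x) || is.factor(x),
  logical(1)
)
dummy_source <- names(df)[is_categorical]

# Add one 0/1 column per distinct value of each selected column.
# The <column>_<value> naming is an assumption for this sketch.
for (col in dummy_source) {
  for (value in sort(unique(as.character(df[[col]])))) {
    df[[paste(col, value, sep = "_")]] <- as.integer(df[[col]] == value)
  }
}

# Drop the source columns, as remove_original = TRUE is described to do.
df <- df[, !(names(df) %in% dummy_source), drop = FALSE]
print(df)
```

Columns that were not used to build dummies (here AGE) pass through unchanged, which matches the description of remove_original above.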

Value

data.table, data.frame, or matrix, depending on the value of return_type. The default is data.table.

Examples

# NOT RUN {
data(dummies_example)
example <- dummy_cols(dummies_example)

# Return data.frame
example <- dummy_cols(dummies_example, return_type = "data.frame")

# Only keep created dummy columns
example <- dummy_cols(dummies_example, dummy_columns_only = TRUE)

# Only create dummy variables from the SEX and RACE columns
example <- dummy_cols(dummies_example, select_columns = c("SEX", "RACE"))

# Create dummy variables from all character or factor columns except SEX
example <- dummy_cols(dummies_example, ignore_columns = "SEX")

# Removes the first dummy from every category. Avoids perfect
# multicollinearity issues in models.
example <- dummy_cols(dummies_example, remove_first_dummy = TRUE)
# }
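
To see why dropping the first dummy avoids perfect multicollinearity, the following base R sketch compares the rank of a design matrix built from all n dummies with one built from n-1 dummies. It does not call dummy_cols; the `race` vector and all variable names are hypothetical and chosen only for illustration.

```r
# Hypothetical three-level category; not taken from the package's data.
race <- c("a", "b", "c", "a", "b", "c")
levels_found <- sort(unique(race))

# Full set of n dummies: one 0/1 column per level.
full <- sapply(levels_found, function(v) as.integer(race == v))

# Design matrices that include an intercept column of ones.
with_all <- cbind(intercept = 1, full)
with_n_1 <- cbind(intercept = 1, full[, -1, drop = FALSE])

# With all n dummies, the dummy columns sum to the intercept column,
# so the matrix has more columns than its rank.
rank_all <- qr(with_all)$rank
rank_n_1 <- qr(with_n_1)$rank

cat("all dummies: columns =", ncol(with_all), "rank =", rank_all, "\n")
cat("n-1 dummies: columns =", ncol(with_n_1), "rank =", rank_n_1, "\n")
```

When the column count exceeds the rank, a linear model with an intercept cannot estimate a separate coefficient for every column. Removing one dummy per variable restores full column rank.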
