
Convert dichotomy data.frame/matrix to data.frame with category encoding
as.category(x, prefix = NULL, counted_value = 1, compress = FALSE)
is.category(x)
x: Dichotomy data.frame/matrix (usually with 0,1 coding).
prefix: If not NULL, column names are set in the form prefix + column number.
counted_value: Vector. Values that are treated as indicators of category presence. Defaults to 1.
compress: Logical. Should columns that contain only NA be dropped? FALSE by default. Setting it to TRUE significantly slows the function down.
data.frame of class category with numeric values that correspond to column numbers of counted values. Column names of x or variable labels are added as value labels.
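The encoding described above can be illustrated with a few lines of base R. This is only a sketch of the idea, not the package's implementation: encode_category_sketch is a hypothetical helper, it does not set the category class or value labels, and it ignores compress.

```r
# Hypothetical helper (not part of the package): base-R illustration of
# category encoding for a dichotomy matrix or data.frame.
encode_category_sketch = function(x, prefix = NULL, counted_value = 1) {
    x = as.matrix(x)
    # matrix of the same shape as x where each cell holds its column number
    col_numbers = col(x)
    # keep the column number where the cell holds a counted value, NA elsewhere
    col_numbers[!(x %in% counted_value)] = NA
    res = as.data.frame(col_numbers)
    # optional names in the form prefix + column number
    if (!is.null(prefix)) {
        colnames(res) = paste0(prefix, seq_len(ncol(res)))
    }
    res
}

# two respondents, three products
m = matrix(c(1, 0, 1,
             0, 0, 1),
           nrow = 2, byrow = TRUE,
           dimnames = list(NULL, c("Milk", "Sugar", "Tea")))
res = encode_category_sketch(m, prefix = "products_")
print(res)
```

In this sketch the first respondent (Milk and Tea) is encoded as 1, NA, 3 and the second (Tea only) as NA, NA, 3, so each non-missing value is the number of the column in which it was counted.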
See also as.dichotomy for reverse conversion, and mrset and mdset for using multiple-response variables with tables.
# NOT RUN {
library(expss) # provides as.category, as.dichotomy, val_lab and var_lab
set.seed(123)
# Let's imagine it's matrix of consumed products
dichotomy_matrix = matrix(sample(0:1,40,replace = TRUE,prob=c(.6,.4)),nrow=10)
colnames(dichotomy_matrix) = c("Milk","Sugar","Tea","Coffee")
as.category(dichotomy_matrix, compress = TRUE) # compressed version
category_encoding = as.category(dichotomy_matrix)
# both checks below should return TRUE
identical(val_lab(category_encoding), c(Milk = 1L, Sugar = 2L, Tea = 3L, Coffee = 4L))
all(as.dichotomy(category_encoding, use_na = FALSE) == dichotomy_matrix)
# with prefix
as.category(dichotomy_matrix, prefix = "products_")
# data.frame with variable labels
dichotomy_dataframe = as.data.frame(dichotomy_matrix)
colnames(dichotomy_dataframe) = paste0("product_", 1:4)
var_lab(dichotomy_dataframe[[1]]) = "Milk"
var_lab(dichotomy_dataframe[[2]]) = "Sugar"
var_lab(dichotomy_dataframe[[3]]) = "Tea"
var_lab(dichotomy_dataframe[[4]]) = "Coffee"
as.category(dichotomy_dataframe, prefix = "products_")
# }