
rfriend (version 1.0.0)

f_factors: Convert multiple columns to Factors in a data frame

Description

Converts the specified columns of a data frame into factors. If no columns are specified, it automatically detects and converts columns that are suitable to be factors. The function returns the entire data frame, including the columns that were not converted, and can optionally report the properties of the new data frame in the console (console = TRUE).

Usage

f_factors(
  data,
  select = NULL,
  exclude = NULL,
  console = FALSE,
  force_factors = FALSE,
  unique_num_treshold = 8,
  repeats_threshold = 2,
  ...
)

Value

Returns the modified data frame with the specified (or all suitable) columns converted to factors. If console = TRUE, a summary of the data frame's structure is also printed to the console.

Arguments

data

A data frame containing the columns to be converted.

select

A character vector specifying the names of the columns to convert into factors. If NULL, the function automatically detects columns that should be factors based on their data type and unique value count. Default is NULL.

exclude

A character vector specifying the names of the columns NOT to convert into factors. If NULL, no columns are excluded. Default is NULL.

console

Logical. If TRUE, prints a detailed table about the properties of the new data frame to the console. If FALSE, no property table is printed. Default is FALSE.

force_factors

Logical. If TRUE, all columns in the data frame are converted to factors, except for the columns listed in exclude. Default is FALSE.

unique_num_treshold

Numeric. The number of unique values at or above which a numeric column is kept numeric, i.e. is not converted to a factor. Default is 8.

repeats_threshold

Numeric. The minimum number of repeats a numeric column must have to be converted to a factor. Default is 2.

...

Additional arguments passed to the factor() function of base R.
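As a rough illustration of what arguments forwarded through ... can do, the sketch below calls base R's factor() directly on a single vector. It only uses base R. How f_factors applies such arguments across several columns is not shown here and should be checked against the package itself.

```r
# Base R only: factor() arguments of the kind that could be passed via `...`.
x <- c("low", "high", "medium", "low", "high")

# Default behaviour: levels are the sorted unique values.
f_default <- factor(x)
levels(f_default)

# Explicit level order, stored as an ordered factor.
f_ordered <- factor(x, levels = c("low", "medium", "high"), ordered = TRUE)
levels(f_ordered)
is.ordered(f_ordered)
```

Setting levels explicitly is mainly useful when the alphabetical default does not match the natural order of the categories.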

Author

Sander H. van Delden plantmind@proton.me

Details

  • If select is NULL, the function identifies columns with character data, or numeric data with fewer unique values than unique_num_treshold (8 by default), as candidates for conversion to factors.

  • The function checks if all specified columns exist in the data frame and stops execution if any are missing.

  • Converts specified columns into factors, applying any additional arguments provided.

  • If console = TRUE, prints a summary table with details about each column, including its type, class, number of observations, missing values, factor levels, and labels.
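The steps above can be sketched in base R. The helper below, detect_factor_candidates, is hypothetical and not part of rfriend. It is a simplified approximation that ignores repeats_threshold, exclude and force_factors, and its argument unique_threshold is an illustrative stand-in for unique_num_treshold.

```r
# Hypothetical helper, NOT the rfriend implementation: a simplified version
# of the column check and the auto-detection rule described above.
detect_factor_candidates <- function(data, select = NULL, unique_threshold = 8) {
  if (!is.null(select)) {
    # Stop if any requested column is missing from the data frame.
    missing_cols <- setdiff(select, names(data))
    if (length(missing_cols) > 0) {
      stop("Columns not found in data: ", paste(missing_cols, collapse = ", "))
    }
    return(select)
  }
  # Candidates: character columns, or numeric columns with few unique values.
  is_candidate <- vapply(data, function(col) {
    is.character(col) ||
      (is.numeric(col) && length(unique(col)) < unique_threshold)
  }, logical(1))
  names(data)[is_candidate]
}

df <- data.frame(a = c("yes", "no", "yes", "no"),
                 b = c(1, 2, 1, 2),
                 d = c(1.1, 2.2, 3.3, 4.4),
                 stringsAsFactors = FALSE)

# With a threshold of 3, "a" and "b" qualify, while "d" (4 unique values) does not.
candidates <- detect_factor_candidates(df, unique_threshold = 3)
df[candidates] <- lapply(df[candidates], factor)
str(df)
```

Assigning back with df[candidates] <- lapply(...) keeps the untouched columns in place, which mirrors the documented behaviour of returning the entire data frame.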

Examples

# Make a data.frame:
df <- data.frame(a = c("yes", "no", "yes", "yes", "no",
                       "yes", "yes", "no", "yes"),
                 b = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
                 c = c("apple", "kiwi", "banana", "apple", "kiwi",
                        "banana", "apple", "kiwi", "banana"),
                 d = c(1.1, 1.1, 3.4, 4.5, 5.4, 6.7, 7.8, 8.1, 9.8)
)
str(df)

# Convert specified columns to factors:
df1 <- f_factors(df, select = c("a", "c"))
str(df1)


# Convert all potential factor columns to factor but exclude column "b":
df2 <- f_factors(df, exclude = c("b"))
str(df2)

# Convert all columns to factor but exclude column "b":
df3 <- f_factors(df, exclude = c("b"), force_factors = TRUE)
str(df3)

# Or automatically detect and convert all suitable columns to factors,
# storing the result in df4:
df4 <- f_factors(df)
str(df4)

# In the example above, column b was converted to a factor because it meets the
# repeats threshold (2) and has fewer than 8 unique numbers. To keep b numeric,
# adjust unique_num_treshold and/or repeats_threshold:
df5 <- f_factors(df, unique_num_treshold = 2)
str(df5)
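
When choosing threshold values, it can help to look at how many distinct values each numeric column has and how often values recur. The base R sketch below computes both for columns b and d of the example data. The max_count summary is an illustrative measure only and may not match how f_factors counts repeats internally.

```r
# Base R only: inspect numeric columns before choosing thresholds.
df_num <- data.frame(b = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
                     d = c(1.1, 1.1, 3.4, 4.5, 5.4, 6.7, 7.8, 8.1, 9.8))

# Number of distinct values per column.
n_unique <- vapply(df_num, function(col) length(unique(col)), integer(1))

# Largest number of times any single value occurs in each column.
max_count <- vapply(df_num, function(col) max(table(col)), integer(1))

data.frame(column = names(df_num), n_unique = n_unique, max_count = max_count)
```

Here b has 3 distinct values and d has 8, which is consistent with d staying numeric under the default unique_num_treshold of 8.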
