bdsm (version 0.3.0)

feature_standardization: Perform feature standardization

Description

This function performs feature standardization (also known as z-score normalization) by centering the features around their mean and scaling by their standard deviation.
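As a point of reference, the z-score computation described above can be sketched in base R. This is only an illustration of the arithmetic, not the package's implementation; the vector `x` and the names `z_manual` and `z_scale` are made up for the example.

```r
# Illustrative values (not tied to any particular data set)
x <- c(1, 2, 3, 4, 5)

# Manual z-score: subtract the mean, then divide by the sample standard deviation
z_manual <- (x - mean(x)) / sd(x)

# Base R's scale() performs the same centering and scaling;
# as.numeric() drops the matrix shape and attributes it returns
z_scale <- as.numeric(scale(x))

# Both approaches agree; the result has mean 0 and standard deviation 1
all.equal(z_manual, z_scale)
```

Note that `sd()` uses the sample (n - 1) denominator, so the standardized values have a sample standard deviation of exactly 1.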

Usage

feature_standardization(df, excluded_cols, group_by_col, scale = TRUE)

Value

A data frame with standardized features.

Arguments

df

Data frame with the data.

excluded_cols

Unquoted column names to exclude from standardization. If missing, all columns are standardized.

group_by_col

Unquoted column names to group the data by before applying standardization. If missing, no grouping is performed.

scale

Logical. If TRUE (default), the centered features are also scaled by their standard deviation.
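To make the effect of grouping and of the scale flag concrete, here is a minimal base R sketch. It assumes that grouping means the mean and standard deviation are computed separately within each group; `standardize_vec` is a hypothetical helper written for this illustration and is not part of bdsm.

```r
# Hypothetical helper (not part of bdsm): center a numeric vector and,
# optionally, divide by its sample standard deviation
standardize_vec <- function(x, scale = TRUE) {
  centered <- x - mean(x)
  if (scale) centered / sd(x) else centered
}

toy <- data.frame(
  country = c("A", "A", "B", "B"),
  gdp = c(1, 2, 3, 5)
)

# Group-wise standardization: statistics are computed within each country
toy$gdp_z <- ave(toy$gdp, toy$country, FUN = standardize_vec)

# Centering only, analogous to scale = FALSE
toy$gdp_centered <- ave(
  toy$gdp, toy$country,
  FUN = function(x) standardize_vec(x, scale = FALSE)
)

toy
```

In this sketch a group with a single row would produce NA when scaling, because `sd()` of one value is NA.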

Examples

df <- data.frame(
  year = c(2000, 2001, 2002, 2003, 2004),
  country = c("A", "A", "B", "B", "C"),
  gdp = c(1, 2, 3, 4, 5),
  ish = c(2, 3, 4, 5, 6),
  sed = c(3, 4, 5, 6, 7)
)

# Standardize every column
df_with_only_numeric_values <- df[, setdiff(names(df), "country")]
feature_standardization(df_with_only_numeric_values)

# Standardize all columns except 'country'
feature_standardization(df, excluded_cols = country)

# Standardize within each country (grouped by 'country')
feature_standardization(df, group_by_col = country)

# Standardize, excluding 'country' and group-wise by 'year'
feature_standardization(df, excluded_cols = country, group_by_col = year)
