ubair (version 1.1.1)

scale_data: Standardize Training and Application Data

Description

This function standardizes the numeric columns of train_data and applies the same scaling (the means and standard deviations computed from train_data) to the corresponding columns of apply_data. It returns the standardized data along with the scaling parameters (means and standard deviations). Standardization is particularly important for neural network approaches, which tend to become numerically unstable and perform worse on unscaled inputs.
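The idea described above can be illustrated with a minimal base R sketch. It is not the ubair implementation: the helper name standardize_sketch and the toy data are made up for illustration, and edge cases such as constant columns (standard deviation of zero) are not handled.

```r
# Hypothetical helper, NOT the ubair source: standardize the numeric columns
# of `train` and reuse the training means/sds on `apply`.
standardize_sketch <- function(train, apply) {
  # Names of the numeric columns in the training data
  num_cols <- names(train)[vapply(train, is.numeric, logical(1))]

  # Scaling parameters are computed from the training data only
  means <- vapply(train[num_cols], mean, numeric(1), na.rm = TRUE)
  sds <- vapply(train[num_cols], sd, numeric(1), na.rm = TRUE)

  for (col in num_cols) {
    train[[col]] <- (train[[col]] - means[[col]]) / sds[[col]]
    # Apply the same parameters to the matching column, if present
    if (col %in% names(apply)) {
      apply[[col]] <- (apply[[col]] - means[[col]]) / sds[[col]]
    }
  }

  list(train = train, apply = apply, means = means, sds = sds)
}

# Toy data, made up for illustration
train <- data.frame(temp = c(10, 12, 14, 16), site = c("a", "b", "a", "b"))
apply <- data.frame(temp = c(13, 19), site = c("a", "b"))
res <- standardize_sketch(train, apply)
```

Because the parameters come from the training rows alone, the scaled training column has mean 0 and standard deviation 1, while the scaled application column generally does not. Non-numeric columns such as site pass through unchanged in this sketch.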

Usage

scale_data(train_data, apply_data)

Value

A list containing the following elements:

train

The standardized training data.

apply

The apply_data scaled using the means and standard deviations from the train_data.

means

The means of the numeric columns in train_data.

sds

The standard deviations of the numeric columns in train_data.
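Returning the means and standard deviations makes it possible to map scaled values back to their original units. The sketch below shows the arithmetic only. It assumes the parameters are available as named numeric vectors, which this page does not confirm, and all values are made up.

```r
# Made-up scaling parameters standing in for the returned means and sds;
# the named-vector layout is an assumption, not documented behavior.
means <- c(temp = 13, wind = 4)
sds <- c(temp = 2, wind = 0.5)

# Made-up standardized values
scaled <- data.frame(temp = c(-1, 0, 1.5), wind = c(2, 0, -2))

# Invert z = (x - mean) / sd, i.e. x = z * sd + mean, column by column
original <- scaled
for (col in names(means)) {
  original[[col]] <- scaled[[col]] * sds[[col]] + means[[col]]
}
```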

Arguments

train_data

A data frame containing the training dataset to be standardized. It must contain numeric columns.

apply_data

A data frame containing the dataset to which the scaling from train_data will be applied.

Examples

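library(ubair)

# Load the example dataset
data(mock_env_data)

# Split the rows into a training part (1-80) and an application part (81-100)
detrended_list <- list(
  train = mock_env_data[1:80, ],
  apply = mock_env_data[81:100, ]
)

# Standardize both parts using the means and standard deviations of the training part
scale_result <- scale_data(
  train_data = detrended_list$train,
  apply_data = detrended_list$apply
)
scaled_train <- scale_result$train
scaled_apply <- scale_result$apply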
