parameters (version 0.2.0)

standardize: Standardization (Z-scoring)

Description

Standardizes data (Z-scoring): the data are centred and scaled so that they are expressed in units of standard deviation (mean = 0, SD = 1) or of Median Absolute Deviation (median = 0, MAD = 1). When applied to a statistical model, this function extracts the dataset, standardizes it, and refits the model on the standardized data. To rescale all numeric variables to the 0 - 1 range instead, see the normalize function.
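As a rough illustration of the two schemes described above, the following base-R sketch computes both by hand. It is not the package's implementation: the vector `x` is made up, and the use of `stats::mad()` with its default consistency constant is an assumption about how the MAD is computed.

```r
# Base-R sketch of the two centering/scaling schemes (not the package's code)
x <- c(2, 4, 4, 4, 5, 5, 7, 9)  # made-up example data

# Classical Z-score: subtract the mean, divide by the standard deviation
z <- (x - mean(x)) / sd(x)

# Robust variant: subtract the median, divide by the MAD
# (assumption: stats::mad(), which applies a constant of 1.4826 by default)
z_robust <- (x - median(x)) / mad(x)

c(mean = mean(z), sd = sd(z))                      # approximately 0 and 1
c(median = median(z_robust), mad = mad(z_robust))  # approximately 0 and 1
```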

Usage

standardize(x, ...)

# S3 method for numeric
standardize(x, robust = FALSE, method = "default", verbose = TRUE, ...)

# S3 method for factor
standardize(x, force = FALSE, ...)

# S3 method for data.frame
standardize(x, robust = FALSE, method = "default", select = NULL, exclude = NULL, verbose = TRUE, force = FALSE, ...)

# S3 method for lm
standardize(x, robust = FALSE, method = "default", include_response = TRUE, verbose = TRUE, ...)

Arguments

x

A data frame, a vector, or a statistical model.

...

Arguments passed to or from other methods.

robust

Logical. If TRUE, variables are standardized by subtracting the median and dividing by the median absolute deviation (MAD). If FALSE, variables are standardized by subtracting the mean and dividing by the standard deviation (SD).

method

The method of standardization. For data.frames, can be "default" (variables are centred by mean or median, and divided by SD or MAD, depending on robust) or "2sd", in which case they are divided by two times the deviation (SD or MAD, again depending on robust).

verbose

Toggle warnings on or off.

force

Logical, if TRUE, forces standardization of factors as well. Factors are converted to numerical values, with the lowest level being the value 1 (unless the factor has numeric levels, which are converted to the corresponding numeric value).

select

For a data frame, character vector of column names to be standardized. If NULL (the default), all variables will be standardized.

exclude

For a data frame, character vector of column names to be excluded from standardization.

include_response

For a model, if TRUE (default), the response value will also be standardized. If FALSE, only the predictors will be standardized. Note that for certain models (logistic regression, count models, ...), the response value is never standardized, so that the model can still be refitted.
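The behaviour described for method, force, and select / exclude can be sketched in base R as follows. This is only an illustration of the arithmetic under stated assumptions, not the package's code: the example vectors, the factor levels, and the chosen column names are made up for the purpose.

```r
# Base-R sketch of what several arguments describe (not the package's code)

# method = "2sd": divide by two times the deviation instead of one
x <- c(10, 12, 14, 16, 18)
z_default <- (x - mean(x)) / sd(x)
z_2sd     <- (x - mean(x)) / (2 * sd(x))
sd(z_2sd)  # half the spread of z_default

# force = TRUE: factors are converted to numbers before standardization
f_text <- factor(c("low", "high", "mid", "low"), levels = c("low", "mid", "high"))
as.numeric(f_text)               # 1 3 2 1 (lowest level is 1)
f_num <- factor(c("10", "20", "10", "30"))
as.numeric(as.character(f_num))  # 10 20 10 30 (numeric levels keep their value)

# select / exclude: restrict which columns are standardized
cols <- setdiff(c("Sepal.Length", "Sepal.Width", "Petal.Length"), "Petal.Length")
iris_z <- iris
iris_z[cols] <- lapply(iris[cols], function(v) (v - mean(v)) / sd(v))
```

Columns outside `cols` (here `Petal.Length`, `Petal.Width`, and `Species`) are left unchanged in `iris_z`.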

Value

The standardized object (either a standardized data frame or a statistical model fitted on standardized data).

See Also

normalize, parameters_standardize

Examples

library(parameters)

# Data frames
summary(standardize(iris))

# Models
model <- lm(Sepal.Length ~ Species * Petal.Width, data = iris)
coef(standardize(model))
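To make the "extract, standardize, refit" step for models more concrete, here is a hand-rolled base-R approximation. It is a sketch under assumptions, not the package's implementation: it standardizes every numeric column of iris (response included, mirroring include_response = TRUE) and refits with `lm()`, so its coefficients need not match those returned by standardize() exactly.

```r
# Hand-rolled approximation of standardizing a model's data and refitting
# (assumption: all numeric columns, including the response, are Z-scored)
model <- lm(Sepal.Length ~ Species * Petal.Width, data = iris)

num <- vapply(iris, is.numeric, logical(1))  # flag the numeric columns
iris_z <- iris
iris_z[num] <- lapply(iris[num], function(v) (v - mean(v)) / sd(v))

# Refit the same formula on the standardized copy of the data
model_z <- lm(formula(model), data = iris_z)
coef(model_z)
```

Because standardization is a linear rescaling, the refitted model describes the same fit in different units: its fitted values are the original fitted values expressed on the standardized scale of Sepal.Length.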
