datawizard (version 0.4.1)

winsorize: Winsorize data

Description

Winsorize data

Usage

winsorize(data, ...)

# S3 method for numeric
winsorize(data, threshold = 0.2, verbose = TRUE, ...)

Value

A dataframe with winsorized columns or a winsorized vector.

Arguments

data

Dataframe or vector.

...

Currently not used.

threshold

The amount of winsorization. Defaults to 0.2.

verbose

Toggle warnings.

Details

Winsorizing or winsorization is the transformation of statistics by limiting extreme values in the statistical data to reduce the effect of possibly spurious outliers. The distribution of many statistics can be heavily influenced by outliers. A typical strategy is to set all outliers (values beyond a certain threshold) to a specified percentile of the data; for example, a 90% winsorization would set all data below the 5th percentile to the 5th percentile, and all data above the 95th percentile to the 95th percentile. Winsorized estimators are usually more robust to outliers than their more standard forms.
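
The percentile strategy described above can be sketched in base R. This is only an illustration and not the datawizard implementation: the helper name winsorize_sketch() and its lower and upper arguments are hypothetical, and the cut-offs come from stats::quantile() with its default settings, which may differ from how winsorize() interprets threshold. With lower = 0.05 and upper = 0.95 it corresponds to the 90% winsorization mentioned in the text.

```r
# Hypothetical helper for illustration only; NOT datawizard's winsorize().
# Values below the `lower` percentile are raised to it, and values above
# the `upper` percentile are lowered to it.
winsorize_sketch <- function(x, lower = 0.05, upper = 0.95) {
  # Percentile cut-offs estimated from the observed data
  limits <- stats::quantile(x, probs = c(lower, upper),
                            na.rm = TRUE, names = FALSE)
  # Clamp every value into [limits[1], limits[2]]; NA values stay NA
  pmin(pmax(x, limits[1]), limits[2])
}

# Made-up data: nineteen ordinary values plus one extreme value
x <- c(1:19, 1000)
w <- winsorize_sketch(x)

range(x)  # range of the raw data, including the extreme value
range(w)  # range after clamping to the 5th and 95th percentiles
mean(x)   # mean of the raw data
mean(w)   # mean of the winsorized data
```

Comparing mean(x) with mean(w) shows why winsorized estimators tend to be more robust: the single extreme value no longer dominates the average, while the length of the vector is unchanged.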

See Also

  • Functions to rename stuff: data_rename(), data_rename_rows(), data_addprefix(), data_addsuffix()

  • Functions to reorder or remove columns: data_reorder(), data_relocate(), data_remove()

  • Functions to reshape, pivot or rotate dataframes: data_to_long(), data_to_wide(), data_rotate()

  • Functions to recode data: data_rescale(), data_reverse(), data_cut(), data_recode(), data_shift()

  • Functions to standardize, normalize, rank-transform: center(), standardize(), normalize(), ranktransform(), winsorize()

  • Split and merge dataframes: data_partition(), data_merge()

  • Functions to find or select columns: data_select(), data_find()

  • Functions to filter rows: data_match(), data_filter()

Examples

winsorize(iris$Sepal.Length, threshold = 0.2)
winsorize(iris, threshold = 0.2)