
smartdata (version 1.0.3)

clean_outliers: Outliers cleaning wrapper

Description

Wrapper that cleans outliers from a dataset using either a univariate or a multivariate method.

Usage

clean_outliers(dataset, method, ...)

Arguments

dataset

The dataset to clean of outliers.

method

Method used to clean outliers. Possible values are:

  • "univariate" detects outliers column by column (an outlier is an abnormal value within a column) and replaces them with the mean or median of the corresponding column.

  • "multivariate" detects outliers using a multicolumn approach, so that an outlier is a whole observation (row), and deletes those observations.
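To make the univariate idea concrete, here is a minimal base R sketch of column-wise detection and filling. It is not the smartdata implementation: the helper name `fill_univariate_outliers`, the two-sided normal cutoff derived from `prob`, and the choice to compute the fill value from the non-flagged values are all assumptions made for illustration.

```r
# Hypothetical helper, NOT part of smartdata: flag values whose absolute
# z-score exceeds a normal quantile and replace them with the mean or
# median of the remaining values. NA handling is omitted for brevity.
fill_univariate_outliers <- function(x, prob = 0.9, fill = c("mean", "median")) {
  fill <- match.arg(fill)
  z <- (x - mean(x)) / sd(x)
  # Two-sided cutoff: values outside the central `prob` mass of N(0, 1)
  cutoff <- qnorm(1 - (1 - prob) / 2)
  is_outlier <- abs(z) > cutoff
  # Assumption: the fill value is computed from the non-flagged values only
  keep <- x[!is_outlier]
  x[is_outlier] <- if (fill == "mean") mean(keep) else median(keep)
  x
}

x <- c(10, 11, 9, 10, 12, 10, 50)
filled <- fill_univariate_outliers(x, prob = 0.9, fill = "median")
filled
```

In this toy vector only the value 50 is flagged, so the result has the same length as the input with that single entry replaced by the median of the other values.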

...

Further arguments passed to the selected method (for example type, prob, fill or lim, as used in the examples).

Value

The treated dataset, with outliers either replaced (univariate) or removed (multivariate).
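The row-removal case can be sketched in base R as follows. This uses the classical (non-robust) Mahalanobis distance with a chi-squared cutoff, which is a stand-in technique chosen for illustration and may differ from what the "adj" and "quan" types do inside the package. The helper name `drop_multivariate_outliers` and its `prob` default are hypothetical.

```r
# Hypothetical helper, NOT part of smartdata: drop rows whose squared
# Mahalanobis distance (computed on the numeric columns) exceeds a
# chi-squared quantile.
drop_multivariate_outliers <- function(df, prob = 0.975) {
  num <- df[vapply(df, is.numeric, logical(1))]
  d2 <- mahalanobis(num, colMeans(num), cov(num))
  df[d2 <= qchisq(prob, df = ncol(num)), , drop = FALSE]
}

cleaned <- drop_multivariate_outliers(iris)
nrow(cleaned) <= nrow(iris)   # rows can only be removed, never added
ncol(cleaned) == ncol(iris)   # columns are left untouched
```

The point of the sketch is the shape of the result: a replacement-based method returns a dataset with the same dimensions, while a deletion-based method returns the same columns with possibly fewer rows.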

Examples

# NOT RUN {
library("smartdata")

# Multivariate methods: outlying observations (rows) are deleted
super_iris <- clean_outliers(iris, method = "multivariate", type = "adj")
super_iris <- clean_outliers(iris, method = "multivariate", type = "quan")

# Use mean as method to substitute outliers
super_iris <- clean_outliers(iris, method = "univariate", type = "z",
                             prob = 0.9, fill = "mean")
# Use median as method to substitute outliers
super_iris <- clean_outliers(iris, method = "univariate", type = "z",
                             prob = 0.9, fill = "median")
# Use chi-sq instead of z p-values
super_iris <- clean_outliers(iris, method = "univariate", type = "chisq",
                             prob = 0.9, fill = "median")
# Use interquartile range instead (the lim argument is mandatory with this type)
super_iris <- clean_outliers(iris, method = "univariate", type = "iqr",
                             lim = 0.9, fill = "median")

# }
