Truncate the columns of a dataset based on the interquartile range.
process_truncate_by_iqr(x, truncate_multipliers = NA, only_numeric = TRUE)
Matrix or data.frame of same dimensions as input.
Matrix or data.frame.
Vector of truncation parameters. Either a single value, which is
replicated as necessary, or a vector of length ncol(x).
If any vector entry is NA, the corresponding column will not be
truncated. If named, the names must correspond to column names in x,
and only the specified columns will be processed. See details.
If TRUE and if x is a data.frame, then only columns of type numeric will
be processed. Otherwise all columns will be processed (e.g. also in the
case that x is a matrix).
Truncation is processed as follows:
Compute the 1st and 3rd quartile q1 / q3 of variables in x.
Multiply these quantities by values in truncate_multipliers to obtain
L and U. If a value is NA, the corresponding variable is not truncated.
Set any value smaller / larger than L / U to L / U.
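The steps above can be sketched for a single numeric variable as follows. This is only an illustration: truncate_by_iqr_sketch is a hypothetical helper, not a function of the package, and the bound rule it uses (L = q1 - m * IQR, U = q3 + m * IQR, the conventional IQR fences) is an assumption, since the text above does not spell out the exact formula for L and U.

```r
# Hypothetical helper, not part of the documented package.
truncate_by_iqr_sketch <- function(v, multiplier) {
  # An NA multiplier means the variable is left untouched.
  if (is.na(multiplier)) {
    return(v)
  }
  # 1st and 3rd quartile (default quantile type 7).
  q <- stats::quantile(v, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  iqr <- q[2] - q[1]
  # Assumed bound rule (conventional IQR fences); the package may differ.
  lower <- q[1] - multiplier * iqr
  upper <- q[2] + multiplier * iqr
  # Set values below 'lower' to 'lower' and values above 'upper' to 'upper'.
  pmin(pmax(v, lower), upper)
}

v <- c(1, 2, 3, 4, 100)
truncate_by_iqr_sketch(v, 1.5)  # the outlier 100 is pulled down to the upper bound
truncate_by_iqr_sketch(v, NA)   # returned unchanged
```

Missing values in v stay missing, because pmin and pmax propagate NA by default.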
Truncation multipliers can be specified in three ways (note that whenever
only_numeric is set to TRUE, then only numeric columns are affected):
A single numeric - then all columns will be processed in the same way
A numeric vector without names - it is assumed that the length can be
replicated to the number of columns in x, each column is processed by the
corresponding value in the vector
A numeric vector with names - length can differ from the columns in x
and only the columns for which the names occur in the vector are
processed
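One way to picture the three specification styles is to expand each of them into one multiplier per column, with NA marking columns that are left alone. The sketch below does this with a hypothetical helper, resolve_multipliers_sketch; it is not part of the package and ignores only_numeric for brevity.

```r
# Hypothetical helper illustrating the three specification styles.
resolve_multipliers_sketch <- function(x, truncate_multipliers) {
  cols <- colnames(x)
  if (!is.null(names(truncate_multipliers))) {
    # Named vector: every name must be a column of x; all other columns get NA.
    stopifnot(all(names(truncate_multipliers) %in% cols))
    out <- stats::setNames(rep(NA_real_, length(cols)), cols)
    out[names(truncate_multipliers)] <- truncate_multipliers
    return(out)
  }
  # Single value or unnamed vector: recycle to the number of columns.
  stats::setNames(rep_len(truncate_multipliers, ncol(x)), cols)
}

df <- data.frame(a = 1:3, b = 4:6, c = 7:9)
resolve_multipliers_sketch(df, 1.5)            # same multiplier for every column
resolve_multipliers_sketch(df, c(1.5, NA, 3))  # per column; column b is skipped
resolve_multipliers_sketch(df, c(c = 2))       # only column c is processed
```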