DescTools (version 0.99.19)

Extremes: Kth Smallest/Largest Values

Description

Find the k smallest or k largest values of a vector x and return them, optionally together with their frequencies.

Usage

Small(x, k = 5, unique = FALSE, na.last = NA)
Large(x, k = 5, unique = FALSE, na.last = NA)
HighLow(x, nlow = 5, nhigh = nlow, na.last = NA)

Arguments

x
a numeric vector

k
an integer >0 defining how many extreme values should be returned. Default is k = 5. If k > length(x), all values will be returned.

unique
logical, defining if unique values should be considered or not. If this is set to TRUE, a list with the k extreme values and their frequencies is returned. Default is FALSE (as unique is a rather expensive function).

na.last
for controlling the treatment of NAs. If TRUE, missing values in the data are put last; if FALSE, they are put first; if NA, they are removed.

nlow
a single integer. The number of the smallest elements of a vector to be printed. Defaults to 5.

nhigh
a single integer. The number of the greatest elements of a vector to be printed. Defaults to the value of nlow.
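
The description of na.last reads the same as that of the na.last argument of base R's sort. Assuming the behaviour is analogous, the three settings can be illustrated with sort alone, without DescTools. This is a minimal sketch; the vector v is made up for illustration.

```r
# Illustrative vector with one missing value (made-up data)
v <- c(3, NA, 1, 2)

sorted_drop  <- sort(v, na.last = NA)     # NAs are removed
sorted_last  <- sort(v, na.last = TRUE)   # NAs are placed at the end
sorted_first <- sort(v, na.last = FALSE)  # NAs are placed at the start

print(sorted_drop)
print(sorted_last)
print(sorted_first)
```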

Value

If unique is FALSE: a vector with the k most extreme values. Otherwise: a list containing the k most extreme values and their frequencies.
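
To make the two return shapes concrete, here is a rough base R sketch of the same idea for the smallest values. It is not the DescTools implementation (which is C++ based); the helper name small_sketch and the list element names values and freq are assumptions chosen for illustration only.

```r
# Hypothetical base-R helper, NOT the DescTools implementation
small_sketch <- function(x, k = 5, unique = FALSE) {
  x <- sort(x)                       # ascending; sort() drops NAs by default
  if (!unique) {
    return(head(x, k))               # plain vector; all of x if k > length(x)
  }
  runs <- rle(x)                     # distinct values and how often each occurs
  keep <- seq_len(min(k, length(runs$values)))
  list(values = runs$values[keep], freq = runs$lengths[keep])
}

x_demo <- c(4, 1, 1, 2, 9, 2, 2, 7)  # made-up data

res_vec  <- small_sketch(x_demo, k = 3)                 # 1 1 2
res_list <- small_sketch(x_demo, k = 3, unique = TRUE)  # values 1 2 4, freq 2 3 1

print(res_vec)
print(res_list)
```

With unique = FALSE ties are repeated in the result, whereas with unique = TRUE each distinct value appears once alongside its count.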

Details

Several approaches to this problem have been discussed elsewhere (see References). This implementation is based on efficient C++ code and is quite fast.

HighLow enumerates the nlow smallest and nhigh largest values together with their frequencies (in brackets). It is intended for describing univariate variables and is useful for checking the ends of a vector, where erroneous values often accumulate in real data. It is merely a printing routine for the highest and the lowest values of x.
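
The idea of listing both ends with frequencies in brackets can be sketched in base R as follows. The helper high_low_sketch and its output layout are hypothetical and only approximate what HighLow prints; the data are made up.

```r
# Hypothetical helper; the layout only approximates what HighLow prints
high_low_sketch <- function(x, nlow = 5, nhigh = nlow) {
  runs   <- rle(sort(x))                                  # distinct values with counts, ascending
  labels <- paste0(runs$values, " (", runs$lengths, ")")  # e.g. "21 (2)"
  low    <- head(labels, nlow)                            # smallest distinct values
  high   <- tail(labels, nhigh)                           # largest distinct values
  paste0("lowest : ", paste(low, collapse = ", "), "\n",
         "highest: ", paste(high, collapse = ", "), "\n")
}

temps <- c(20, 21, 21, 55, 56, 56, 56, 99)  # made-up readings; 99 looks suspicious
out <- high_low_sketch(temps, nlow = 2, nhigh = 2)
cat(out)
```

An isolated value with frequency 1 at either end, such as 99 here, is the kind of entry worth checking against the data source.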

References

http://stackoverflow.com/questions/36993935/find-the-largest-n-unique-values-and-their-frequencies-in-r-and-rcpp/

http://gallery.rcpp.org/articles/top-elements-from-vectors-using-priority-queue/

See Also

min, max, sort, rank

Examples

library(DescTools)

x <- sample(1:10, 1000, replace = TRUE)
Large(x, 3)
Large(x, k=3, unique=TRUE)

# works fine up to length(x) ~ 1e6
x <- runif(1000000)
Small(x, 3, unique=TRUE)
Small(x, 3, unique=FALSE)

# Both ends
cat(HighLow(d.pizza$temperature, na.last=NA))
