unique returns a vector, data frame or array like x but with duplicate elements/rows removed.
unique(x, incomparables = FALSE, ...)

# S3 method for default
unique(x, incomparables = FALSE, fromLast = FALSE, nmax = NA, ...)

# S3 method for matrix
unique(x, incomparables = FALSE, MARGIN = 1, fromLast = FALSE, ...)

# S3 method for array
unique(x, incomparables = FALSE, MARGIN = 1, fromLast = FALSE, ...)
For incomparables, FALSE is a special value, meaning that all values can be compared, and may be the only value accepted for methods other than the default. It will be coerced internally to the same type as x.
For a vector, an object of the same type as x, but with only one copy of each duplicated element. No attributes are copied (so the result has no names). For a data frame, a data frame is returned with the same columns but possibly fewer rows (and with row names from the first occurrences of the unique rows). A matrix or array is subsetted by [, drop = FALSE], so dimensions and dimnames are copied appropriately, and the result always has the same number of dimensions as x.
Using this for lists is potentially slow, especially if the elements are not atomic vectors (see vector) or differ only in their attributes. In the worst case it is \(O(n^2)\).
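The return-value rules above can be checked with a small sketch. The objects v, df and m are made up for illustration and are not part of the original page.

```r
# Named vector: attributes, including names, are not copied to the result.
v <- c(a = 1, b = 2, c = 1)
uv <- unique(v)
names(uv)      # NULL

# Data frame: surviving rows keep the row names of their first occurrence.
df <- data.frame(id = c(1, 2, 1, 3), grp = c("x", "y", "x", "z"))
udf <- unique(df)
rownames(udf)  # "1" "2" "4"

# Matrix: subsetting uses drop = FALSE, so a single surviving row
# is still returned as a matrix.
m <- matrix(c(1, 2, 1, 2), nrow = 2, byrow = TRUE)
um <- unique(m)
dim(um)        # 1 2
```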
The array method calculates for each element of the dimension specified by MARGIN if the remaining dimensions are identical to those for an earlier element (in row-major order). This would most commonly be used for matrices to find unique rows (the default) or columns (with MARGIN = 2).
Note that unlike the Unix command uniq this omits duplicated and not just repeated elements/rows. That is, an element is omitted if it is equal to any previous element and not just if it is equal to the immediately previous one. (For the latter, see rle.)
Missing values ("NA") are regarded as equal, numeric and complex ones differing from NaN; character strings will be compared in a "common encoding"; for details, see match (and duplicated) which use the same concept.
Values in incomparables will never be marked as duplicated. This is intended to be used for a fairly small set of values and will not be efficient for a very large set.
When used on a data frame with more than one column, or an array or matrix when comparing dimensions of length greater than one, this tests for identity of character representations. This will catch people who unwisely rely on exact equality of floating-point numbers!
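A minimal sketch of the behaviours described above, using small made-up vectors (x, y, z and m2 are illustrative names, not from the original page):

```r
# unique() drops every later duplicate; rle() only collapses adjacent runs,
# which is what the Unix uniq command does.
x <- c(1, 1, 2, 1, 2, 2)
ux <- unique(x)        # 1 2
rx <- rle(x)$values    # 1 2 1 2

# All NA values count as equal to each other, and NaN is kept distinct from NA.
y <- c(NA, 1, NaN, NA, NaN)
uy <- unique(y)        # NA 1 NaN

# Values listed in incomparables are never treated as duplicates,
# so both copies of 2 survive here.
z <- c(1, 1, 2, 2)
uz <- unique(z, incomparables = 2)   # 1 2 2

# MARGIN = 2 compares columns instead of rows.
m2 <- cbind(c(1, 2), c(1, 2), c(3, 4))
um2 <- unique(m2, MARGIN = 2)        # 2 x 2: columns (1, 2) and (3, 4)
```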
duplicated, which gives the indices of duplicated elements.
rle, which is the equivalent of the Unix uniq -c command.
x <- c(3:5, 11:8, 8 + 0:5)
(ux <- unique(x))
(u2 <- unique(x, fromLast = TRUE)) # different order
stopifnot(identical(sort(ux), sort(u2)))

length(unique(sample(100, 100, replace = TRUE)))
## approximately 100(1 - 1/e) = 63.21

unique(iris)