Select distinct/unique rows.

Retain only unique/distinct rows from an input tbl. This is similar to unique.data.frame, but considerably faster.
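As a rough illustration of the comparison with unique.data.frame, the sketch below runs both functions on a small hand-made data frame. The data and the variable names via_distinct and via_unique are made up for this example, and it makes no attempt to measure speed.

```r
library(dplyr)

# Illustrative data: rows 1 and 2 are exact duplicates.
df <- data.frame(
  x = c(1, 1, 2),
  y = c("a", "a", "b"),
  stringsAsFactors = FALSE
)

via_distinct <- distinct(df)  # dplyr
via_unique <- unique(df)      # base R, dispatches to unique.data.frame

# Both keep the first occurrence of each duplicated row,
# so each result has two rows.
nrow(via_distinct)
nrow(via_unique)
```

The two results hold the same values, although row names may differ between them.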

distinct(.data, ..., .keep_all = FALSE)

distinct_(.data, ..., .dots, .keep_all = FALSE)


.data
A tbl.


...
Optional variables to use when determining uniqueness. If there are multiple rows for a given combination of inputs, only the first row will be preserved. If omitted, all variables are used.


.keep_all
If TRUE, keep all variables in .data. If a combination of ... is not distinct, this keeps the first row of values.


.dots
Used to work around non-standard evaluation. See vignette("nse") for details.
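The argument descriptions above can be sketched on a small deterministic data frame, where it is easy to see which row counts as "the first". The data and the names by_x, first_rows and all_vars are illustrative assumptions, not part of the package.

```r
library(dplyr)

# Illustrative data: each value of x appears twice with a different y.
df <- data.frame(
  x = c(1, 1, 2, 2),
  y = c("a", "b", "c", "d"),
  stringsAsFactors = FALSE
)

# Only x decides uniqueness, and only x is returned.
by_x <- distinct(df, x)

# With .keep_all = TRUE, y is kept as well, taken from the first
# row seen for each value of x (rows 1 and 3 here).
first_rows <- distinct(df, x, .keep_all = TRUE)

# Omitting ... uses all variables, so all four rows count as distinct.
all_vars <- distinct(df)
```

Here by_x has two rows and a single column, first_rows has y equal to "a" and "c", and all_vars keeps all four rows.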

  • distinct
  • distinct_
library(dplyr)
df <- data.frame(
  x = sample(10, 100, rep = TRUE),
  y = sample(10, 100, rep = TRUE)
)
nrow(df)
nrow(distinct(df))
nrow(distinct(df, x, y))

distinct(df, x)
distinct(df, y)

# Can choose to keep all other variables as well
distinct(df, x, .keep_all = TRUE)
distinct(df, y, .keep_all = TRUE)

# You can also use distinct on computed variables
distinct(df, diff = abs(x - y))
Documentation reproduced from package dplyr, version 0.5.0, License: MIT + file LICENSE
