SparkR (version 2.1.2)

approxCountDistinct: Returns the approximate number of distinct items in a group

Description

Returns the approximate number of distinct items in a group. This is a column aggregate function.

Usage

approxCountDistinct(x, ...)

# S4 method for Column
approxCountDistinct(x, rsd = 0.05)

Arguments

x

Column to compute on.

...

further arguments to be passed to or from other methods.

rsd

maximum estimation error allowed, expressed as a relative standard deviation (default = 0.05)

Value

the approximate number of distinct items in a group.

Examples

# NOT RUN {
approxCountDistinct(df$c)
# }
# NOT RUN {
approxCountDistinct(df$c, 0.02)
# }
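
The snippets above assume an existing SparkDataFrame named df with a column c. As a fuller illustration, the following is a minimal sketch of using approxCountDistinct inside agg, both over a whole SparkDataFrame and per group. It assumes a local Spark installation that sparkR.session() can find; the toy data, the column names g and c, and the output column name n_distinct are hypothetical and chosen only for this example.

```r
library(SparkR)

# Assumption: Spark is installed locally and discoverable (e.g. via SPARK_HOME).
sparkR.session(master = "local[1]")

# Hypothetical toy data: a grouping column `g` and a value column `c`.
local_df <- data.frame(g = c("a", "a", "a", "b", "b"),
                       c = c(1, 2, 2, 3, 3),
                       stringsAsFactors = FALSE)
df <- createDataFrame(local_df)

# Approximate distinct count of `c` over the whole SparkDataFrame (default rsd = 0.05).
overall <- collect(agg(df, n_distinct = approxCountDistinct(df$c)))

# The same aggregate computed per group, with a tighter error bound (rsd = 0.02).
per_group <- collect(agg(groupBy(df, "g"),
                         n_distinct = approxCountDistinct(df$c, 0.02)))

print(overall)
print(per_group)

sparkR.session.stop()
```

A smaller rsd asks for a more accurate estimate, generally at the cost of more memory for the underlying sketch.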
