fmode is a generic function that returns the (column-wise) statistical mode, i.e. the most frequent value of x, (optionally) grouped by g and/or weighted by w. The TRA argument can further be used to transform x using its (grouped, weighted) mode.
fmode(x, ...)
# S3 method for default
fmode(x, g = NULL, w = NULL, TRA = NULL, na.rm = TRUE,
use.g.names = TRUE, ...)
# S3 method for matrix
fmode(x, g = NULL, w = NULL, TRA = NULL, na.rm = TRUE,
use.g.names = TRUE, drop = TRUE, ...)
# S3 method for data.frame
fmode(x, g = NULL, w = NULL, TRA = NULL, na.rm = TRUE,
use.g.names = TRUE, drop = TRUE, ...)
# S3 method for grouped_df
fmode(x, w = NULL, TRA = NULL, na.rm = TRUE,
use.g.names = FALSE, keep.group_vars = TRUE, keep.w = TRUE, ...)
x: a vector, matrix, data.frame or grouped tibble (dplyr::grouped_df).
w: a numeric vector of (non-negative) weights, may contain missing values.
TRA: an integer or quoted operator indicating the transformation to perform: 1 - "replace_fill" | 2 - "replace" | 3 - "-" | 4 - "-+" | 5 - "/" | 6 - "%" | 7 - "+" | 8 - "*" | 9 - "%%" | 10 - "-%%". See TRA.
na.rm: logical. Skip missing values in x. Defaults to TRUE and implemented at very little computational cost. If na.rm = FALSE, NA is treated as any other value.
use.g.names: make group-names and add to the result as names (vector method) or row-names (matrix and data.frame method). No row-names are generated for data.tables and grouped tibbles.
drop: matrix and data.frame method: drop dimensions and return an atomic vector if g = NULL and TRA = NULL.
keep.group_vars: grouped_df method: Logical. FALSE removes grouping variables after computation.
keep.w: grouped_df method: Logical. Retain sum of weighting variable after computation (if contained in grouped_df).
...: arguments to be passed to or from other methods.
The statistical mode of x, grouped by g, or (if TRA is used) x transformed by its mode, grouped by g. See also Details.
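The transformed return value can be pictured with a rough base-R illustration. It assumes that "replace_fill" overwrites every element of x with its group mode and that "-" subtracts the group mode from x. The helper names simple_mode and tra_sketch are hypothetical and not part of collapse, and the tie rule used here (the earliest-appearing value wins) is a simplification.

```r
# Hypothetical helpers for illustration only; not collapse's implementation.
simple_mode <- function(v) {
  v <- v[!is.na(v)]
  if (length(v) == 0L) return(v[NA_integer_])
  # Count each distinct value at the position of its first occurrence.
  counts <- tabulate(match(v, v), nbins = length(v))
  v[which.max(counts)]           # simplified tie rule: earliest-appearing value wins
}

tra_sketch <- function(x, g, TRA = c("replace_fill", "-")) {
  TRA <- match.arg(TRA)
  # ave() computes the statistic per group and broadcasts it back to length(x).
  group_mode <- ave(x, g, FUN = simple_mode)
  if (TRA == "replace_fill") group_mode else x - group_mode
}

x <- c(1, 1, 2, 5, 5, 7)
g <- c(1, 1, 1, 2, 2, 2)
tra_sketch(x, g, "replace_fill") # every element replaced by its group mode
tra_sketch(x, g, "-")            # deviations from the group mode
```

The result has the same length as x, which is the point of TRA: the group-level statistic is expanded back to the shape of the input.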
fmode
implements a pretty fast algorithm to find the statistical mode utilizing index- hashing implemented in the Rcpp::sugar::IndexHash
class.
If all values are distinct, the first value is returned. If there are multiple distinct values having the top frequency, the first value established as having the top frequency when passing through the data from element 1 to element n is returned. If na.rm = FALSE, NA is not removed but treated as any other value (i.e. its frequency is counted). If all values are NA, NA is always returned.
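The tie-breaking and NA rules above can be read as a single pass over the data. The following is a plain-R sketch of that reading, not the hashed C++ implementation. The function name mode_sketch is hypothetical, and match() stands in for the hash table.

```r
# Hypothetical helper for illustration only; not part of collapse.
mode_sketch <- function(x, na.rm = TRUE) {
  if (na.rm) x <- x[!is.na(x)]
  if (length(x) == 0L) return(x[NA_integer_])   # nothing left: return a typed NA
  ids <- match(x, x)             # position of first occurrence; NA matches NA
  counts <- integer(length(x))
  best <- 0L
  best_pos <- 1L
  for (i in seq_along(ids)) {
    k <- ids[i]
    counts[k] <- counts[k] + 1L
    if (counts[k] > best) {      # strict '>' keeps the first value to reach the top count
      best <- counts[k]
      best_pos <- k
    }
  }
  x[best_pos]
}

mode_sketch(c(3, 1, 2))                  # all distinct: first value
mode_sketch(c(1, 2, 2, 1))               # tie: 2 reaches a count of two first
mode_sketch(c(NA, NA, 5))                # NA skipped
mode_sketch(c(NA, NA, 5), na.rm = FALSE) # NA counted like any other value
```

Note that under this reading a tie is not resolved in favour of the value that appears first in the data, but in favour of the value whose count first reaches the maximum.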
The weighted mode is computed by summing up the weights for all distinct values and choosing the value with the largest sum. If na.rm = TRUE, missing values will be removed from both x and w, i.e. utilizing only x[complete.cases(x, w)] and w[complete.cases(x, w)].
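A minimal base-R sketch of the weighted mode as described, covering only the na.rm = TRUE case. The function name wmode_sketch is hypothetical, and resolving ties in favour of the earliest-appearing value is an assumption, since the text does not specify tie handling for weights.

```r
# Hypothetical helper for illustration only; not part of collapse.
wmode_sketch <- function(x, w) {
  keep <- complete.cases(x, w)   # drop pairs where x or w is missing
  x <- x[keep]
  w <- w[keep]
  if (length(x) == 0L) return(x[NA_integer_])
  first_pos <- match(x, x)       # label each distinct value by its first position
  sums <- tapply(w, first_pos, sum)             # summed weight per distinct value
  x[as.integer(names(sums)[which.max(sums)])]   # assumed tie rule: earliest value wins
}

x <- c("a", "b", "b", "c")
w <- c(5, 1, 2, NA)
wmode_sketch(x, w)               # "a" carries weight 5, "b" only 1 + 2 = 3
```

In this example the unweighted mode would be "b", so the weights change the answer.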
This all generalizes to grouped computations, which are currently performed by mapping the data to a sparse array directed by g and then going group by group.
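Conceptually, the grouped computation gives the same result as splitting x by g and taking the mode within each piece. The sketch below shows that equivalence in base R for numeric x. It does not reproduce the sparse-array mapping used internally. The helper names are hypothetical, and ties go to the earliest-appearing value as a simplification.

```r
# Hypothetical helpers for illustration only; not collapse's implementation.
simple_mode <- function(v) {
  v <- v[!is.na(v)]
  if (length(v) == 0L) return(v[NA_integer_])
  counts <- tabulate(match(v, v), nbins = length(v))
  v[which.max(counts)]           # simplified tie rule: earliest-appearing value wins
}

grouped_mode_sketch <- function(x, g) {
  # One mode per group; the result is named by group, similar to use.g.names = TRUE.
  vapply(split(x, g), simple_mode, FUN.VALUE = numeric(1))
}

x <- c(1, 1, 2, 3, 3, 3, NA)
g <- c("a", "a", "a", "b", "b", "b", "b")
grouped_mode_sketch(x, g)
```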
fmode preserves all the attributes of the objects it is applied to (apart from names or row-names which are adjusted as necessary). If a data frame is passed to fmode and drop = TRUE, base::unlist will be called on the result, which might or might not be sensible depending on the data at hand.
fmean, fmedian, Fast Statistical Functions, Collapse Overview
# NOT RUN {
## World Development Data
library(collapse) # provides fmode, qM and the wlddev data
attach(wlddev)
## default vector method
fmode(PCGDP) # Numeric mode
fmode(PCGDP, iso3c) # Grouped numeric mode
fmode(PCGDP, iso3c, LIFEEX) # Grouped and weighted numeric mode
fmode(region) # Factor mode
fmode(date) # Date mode (defaults to first value since panel is balanced)
fmode(country) # Character mode (also defaults to first value)
fmode(OECD) # Logical mode
# ...all the above can also be performed grouped and weighted
## matrix method
m <- qM(airquality)
fmode(m)
fmode(m, na.rm = FALSE) # NA frequency is also counted
fmode(m, airquality$Month) # Groupwise
fmode(m, w = airquality$Day) # Weighted: Later days in the month are given more weight
fmode(m>50, airquality$Month) # Groupwise logical mode
# etc ...
## data.frame method
fmode(wlddev, drop = FALSE) # Gives one row
fmode(wlddev) # drop = TRUE (default): calling unlist -> coerce to character vector
fmode(wlddev, iso3c) # Grouped mode
fmode(wlddev, iso3c, LIFEEX) # Grouped and weighted mode
detach(wlddev)
# }