data_match: Return filtered data frame or row indices

Description

Return a filtered data frame or row indices of a data frame that match a specific condition. data_filter() works like data_match(), but works with logical expressions instead of data frame to specify matching conditions.

Usage

data_match(x, to, match = "and", return_indices = FALSE, ...)
data_filter(x, filter, ...)

Arguments

A data frame.

A data frame matching the specified conditions. Note that if match is a value other than "and", the original row order might be changed. See 'Details'.

match

String, indicating with which logical operation matching conditions should be combined. Can be "and" (or "&"), "or" (or "|") or "not" (or "!").

return_indices

Logical, if FALSE, return the vector of rows that can be used to filter the original data frame. If FALSE (default), returns directly the filtered data frame instead of the row indices.

...

Not used.

filter

A logical expression indicating which rows to keep.

Value

A filtered data frame, or the row indices that match the specified configuration.

Details

For data_match(), if match is either "or" or "not", the original row order from x might be changed. If preserving row order is required, use data_filter() instead.

# mimics subset() behaviour, preserving original row order
head(data_filter(mtcars[c("mpg", "vs", "am")], vs == 0 | am == 1))
#>                    mpg vs am
#> Mazda RX4         21.0  0  1
#> Mazda RX4 Wag     21.0  0  1
#> Datsun 710        22.8  1  1
#> Hornet Sportabout 18.7  0  0
#> Duster 360        14.3  0  0
#> Merc 450SE        16.4  0  0
# re-sorting rows
head(data_match(mtcars[c("mpg", "vs", "am")],
                data.frame(vs = 0, am = 1),
                match = "or"))
#>                    mpg vs am
#> Mazda RX4         21.0  0  1
#> Mazda RX4 Wag     21.0  0  1
#> Hornet Sportabout 18.7  0  0
#> Duster 360        14.3  0  0
#> Merc 450SE        16.4  0  0
#> Merc 450SL        17.3  0  0

While data_match() works with data frames to match conditions against, data_filter() is basically a wrapper around subset(subset = <filter>). However, unlike subset(), it preserves label attributes and is useful when working with labelled data.

Examples

Run this code

# NOT RUN {
data_match(mtcars, data.frame(vs = 0, am = 1))
data_match(mtcars, data.frame(vs = 0, am = c(0, 1)))

# observations where "vs" is NOT 0 AND "am" is NOT 1
data_match(mtcars, data.frame(vs = 0, am = 1), match = "not")
# equivalent to
data_filter(mtcars, vs != 0 & am != 1)

# observations where EITHER "vs" is 0 OR "am" is 1
data_match(mtcars, data.frame(vs = 0, am = 1), match = "or")
# equivalent to
data_filter(mtcars, vs == 0 | am == 1)

# }

Run the code above in your browser using DataLab