criteria: Criteria functions

Description

Produce criteria which can be used in different situations; see 'recode', 'na_if', 'count_if', 'match_row', '%i%', etc. For example, 'greater(5)' returns a function which tests whether its argument is greater than five. 'fixed("apple")' returns a function which tests whether its argument contains "apple". Logical operations (|, &, !, xor) are defined for criteria, e.g. you can write something like 'greater(5) | equals(1)'. List of functions:

comparison criteria - 'equals', 'greater', etc. return functions which compare their argument against a value.
'thru' checks whether a value is inside an interval. 'thru(0,1)' is equivalent to 'x>=0 & x<=1'.
'%thru%' is infix version of 'thru', e. g. '0 %thru% 1'
'is_max' and 'is_min' return TRUE where the vector value equals the maximum or minimum.
'contains' searches for the pattern in strings. By default, it works with fixed patterns rather than regular expressions. For details about its arguments, see grepl.
'like' searches for an Excel-style pattern in strings. You can use wildcards: '*' means any number of symbols, '?' means a single symbol. The match is case-insensitive.
'fixed' is an alias for 'contains'.
'perl' works like 'contains', but the pattern is a Perl-compatible regular expression ('perl = TRUE'). For details see grepl.
'regex' uses POSIX 1003.2 extended regular expressions ('fixed = FALSE'). For details see grepl.
'has_label' searches for values which have the supplied label(s). Criteria can be used as an argument for 'has_label'.
'to' returns a function which gives TRUE for all elements of a vector before the first occurrence of 'x' and for 'x' itself.
'from' returns a function which gives TRUE for all elements of a vector after the first occurrence of 'x' and for 'x' itself.
'not_na' returns TRUE for all non-NA vector elements.
'other' returns TRUE for all vector elements. It is intended for use with 'recode'.
'items' returns TRUE for the vector elements at the given positions (sequential numbers).
'and', 'or', 'not' are spreadsheet-style boolean functions.
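
The difference between the pattern criteria above can be sketched by calling each criterion directly on a character vector. This is a minimal illustration, assuming the expss package is installed; the vector 'fruits' and the result names are hypothetical and not part of the package.

```r
library(expss)

# hypothetical sample data, for illustration only
fruits = c("apple", "Apricot", "banana", "a.b")

# fixed substring match: "." is a literal dot, not "any character"
dot_fixed = contains(".")(fruits)      # FALSE FALSE FALSE TRUE

# the same pattern as a regular expression matches any non-empty string
dot_regex = regex(".")(fruits)         # TRUE TRUE TRUE TRUE

# Excel-style wildcard, case-insensitive: starts with "a" or "A"
starts_with_a = like("a*")(fruits)     # TRUE TRUE FALSE TRUE

# anchored regular expression, case-sensitive: starts with lowercase "a"
starts_with_lower_a = regex("^a")(fruits)  # TRUE FALSE FALSE TRUE
```

A criterion is an ordinary function, so calling it on a vector returns one logical value per element; the same criteria can be passed to 'count_if', '%i%' and similar functions.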

Shortcuts for comparison criteria:

'equals' - 'eq'
'not_equals' - 'neq', 'ne'
'greater' - 'gt'
'greater_or_equal' - 'gte', 'ge'
'less' - 'lt'
'less_or_equal' - 'lte', 'le'
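
As a short sketch of how the shortcuts relate to the full names, the following compares both spellings and combines criteria with logical operators. It assumes the expss package is installed; the vector 'x' and the result names are illustrative only.

```r
library(expss)

# hypothetical sample data, for illustration only
x = c(1, 4, 6, 9)

# full names and shortcuts build the same test: 4 <= x < 9
long_form  = (greater_or_equal(4) & less(9))(x)  # FALSE TRUE TRUE FALSE
short_form = (gte(4) & lt(9))(x)                 # FALSE TRUE TRUE FALSE

# criteria can also be combined with '|' and negated with '!'
one_or_big = (eq(1) | gt(6))(x)                  # TRUE FALSE FALSE TRUE
not_above_4 = (!gt(4))(x)                        # TRUE TRUE FALSE FALSE
```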

Usage

as.criterion(crit)
is.criterion(x)
equals(x)
not_equals(x)
less(x)
less_or_equal(x)
greater(x)
greater_or_equal(x)
thru(lower, upper)
lower %thru% upper
when(x)
is_max(x)
is_min(x)
contains(
  pattern,
  ignore.case = FALSE,
  perl = FALSE,
  fixed = TRUE,
  useBytes = FALSE
)
like(pattern)
fixed(
  pattern,
  ignore.case = FALSE,
  perl = FALSE,
  fixed = TRUE,
  useBytes = FALSE
)
perl(
  pattern,
  ignore.case = FALSE,
  perl = TRUE,
  fixed = FALSE,
  useBytes = FALSE
)
regex(
  pattern,
  ignore.case = FALSE,
  perl = FALSE,
  fixed = FALSE,
  useBytes = FALSE
)
has_label(x)
from(x)
to(x)
items(...)
not_na(x)
is_na(x)
other(x)
and(...)
or(...)
not(x)

Value

A function of class 'criterion' which tests its argument against a condition and returns a logical value.

Arguments

crit: vector of values, or a function which returns a logical value or logical vector. It will be converted to a function of class 'criterion'.
x: vector
lower: vector/single value - lower bound of interval
upper: vector/single value - upper bound of interval
pattern: character string containing a regular expression (or character string for 'fixed') to be matched in the given character vector. Coerced by as.character to a character string if possible.
ignore.case: logical; see grepl
perl: logical; see grepl
fixed: logical; see grepl
useBytes: logical; see grepl
...: numeric indexes of the desired items for 'items'; logical vectors or criteria for the boolean functions.
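
The 'crit' argument can be illustrated with a user-defined predicate. The sketch below assumes the expss package is installed; 'is_even' and 'combined' are hypothetical names introduced for this example, not part of the package.

```r
library(expss)

# wrap an ordinary predicate so that it becomes a criterion
# ('is_even' is a hypothetical name used only in this sketch)
is_even = as.criterion(function(x) x %% 2 == 0)

is.criterion(is_even)                 # TRUE
is.criterion(function(x) x > 1)       # FALSE: a plain function, not a criterion

# once converted, it combines with built-in criteria via logical operators
combined = is_even & greater(4)
which(combined(1:10))                 # 6 8 10
```

Converting a predicate this way is what makes the '&', '|' and '!' operators for criteria available to it.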

Examples

# operations on vector, '%d%' means 'diff'
1:6 %d% greater(4) # 1:4
1:6 %d% (1 | greater(4)) # 2:4
# '%i%' means 'intersect'
1:6 %i% (is_min() | is_max()) # 1, 6
# with Excel-style boolean operators
1:6 %i% or(is_min(), is_max()) # 1, 6

letters %i% (contains("a") | contains("z")) # a, z

letters %i% perl("a|z") # a, z

letters %i% from("w")  # w, x, y, z

letters %i% to("c")  # a, b, c

letters %i% (from("b") & to("e"))  # b, c, d, e

c(1, 2, NA, 3) %i% not_na() # c(1, 2, 3)

# examples with count_if
df1 = data.frame(
    a=c("apples", "oranges", "peaches", "apples"),
    b = c(32, 54, 75, 86)
)

count_if(greater(55), df1$b) # greater than 55 = 2

count_if(not_equals(75), df1$b) # not equal to 75 = 3

count_if(greater(32) & less(86), df1$b) # greater than 32 and less than 86 = 2
count_if(and(greater(32), less(86)), df1$b) # the same result

# infix version
count_if(35 %thru% 80, df1$b) # greater than or equal to 35 and less than or equal to 80 = 2

# values that start with 'a'
count_if(like("a*"), df1) # 2

# the same with Perl-style regular expression
count_if(perl("^a"), df1) # 2

# count_row_if
count_row_if(perl("^a"), df1) # c(1,0,0,1)

# examples with 'n_intersect' and 'n_diff'
data(iris)
iris %>% n_intersect(to("Petal.Width")) # all columns up to and including 'Petal.Width', i.e. all except 'Species'

# only 'Sepal.Length' and 'Sepal.Width' will be left
iris %>% n_diff(from("Petal.Length"))

# except first column
iris %n_d% items(1)

# 'recode' examples
qvar = c(1:20, 97, NA, NA)
recode(qvar, 1 %thru% 5 ~ 1, 6 %thru% 10 ~ 2, 11 %thru% hi ~ 3, other ~ 0)
# the same result
recode(qvar, 1 %thru% 5 ~ 1, 6 %thru% 10 ~ 2, greater_or_equal(11) ~ 3, other ~ 0)
