expss (version 0.5.1)

if_val: Change, rearrange or consolidate the values of an existing/new variable. Inspired by RECODE command from SPSS.

Description

if_val changes, rearranges or consolidates the values of an existing variable based on conditions. The design of this function is inspired by the RECODE command from SPSS. The sequence of recodings is provided in the form of formulas. For example, 1:2 ~ 1 means that all values 1 and 2 will be replaced with 1. Each value is recoded only once, and values which don't meet any condition remain unchanged. A condition can be a plain value, a logical value or a function. There are several special functions for use as criteria; for details see criteria. Simple common usage looks like: if_val(x, 1:2 ~ -1, 3 ~ 0, 4:5 ~ 1, 99 ~ NA). For more information, see details and examples.

The ifs function checks whether one or more conditions are met and returns the value that corresponds to the first TRUE condition. ifs can take the place of multiple nested ifelse statements and is much easier to read when there are several conditions. ifs works in the same manner as if_val, e.g. with formula or from/to notation, but its conditions must be logical and it doesn't operate on multicolumn objects.
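The "recoded only once" rule can be illustrated with a minimal base R sketch. This is not the expss implementation: recode_once is a hypothetical helper written only for this illustration, and it handles just the simplest case where every condition is a vector of values.

```r
# Hypothetical helper (not part of expss): applies the rules in order and
# recodes each element at most once, always matching against the ORIGINAL values.
recode_once <- function(x, conditions, replacements) {
  result <- x
  done <- rep(FALSE, length(x))            # elements that were already recoded
  for (i in seq_along(conditions)) {
    hit <- !done & (x %in% conditions[[i]])
    result[hit] <- replacements[[i]]
    done <- done | hit
  }
  result
}

x <- c(1, 2, 3, 4, 5, 99)
recoded <- recode_once(x, list(1:2, 3, 4:5, 99), list(-1, 0, 1, NA))
recoded   # -1 -1 0 1 1 NA

# 1 becomes 2, but that new 2 is not recoded again to 3
chained <- recode_once(c(1, 2), list(1, 2), list(2, 3))
chained   # 2 3
```

Because matching is done against the original values, the order of the rules matters only when conditions overlap; in that case the earliest matching rule wins.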

Usage

if_val(x, ..., from = NULL, to = NULL)
if_val(x, from = NULL) <- value
ifs(..., from = NULL, to = NULL, default = NA)
lo
hi

Arguments

x
vector/matrix/data.frame/list
...
sequence of formulas which describe recodings. Only used when from/to arguments are not provided.
from
list of conditions for values which should be recoded (in the same format as LHS of formulas).
to
list of values into which old values should be recoded (in the same format as RHS of formulas).
value
list of formulas which describe recodings when the assignment form of the function is used, or the to list if from/to notation is used.
default
single value or vector; NA by default. This value is used for elements of the result for which all conditions are FALSE/NA.
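To make "first TRUE condition" and default concrete, here is a base R sketch of the nested ifelse call that a two-condition ifs replaces. It does not call ifs itself; the vectors a and b are the ones used in the examples section, and the comparison assumes no missing values in the conditions.

```r
a <- 1:5
b <- 5:1

# Roughly what ifs(b > 3 ~ 1, a > 4 ~ 7, default = 3) expresses:
# take 1 where b > 3, otherwise 7 where a > 4, otherwise the default 3.
nested <- ifelse(b > 3, 1, ifelse(a > 4, 7, 3))
nested   # 1 1 3 3 7
```

Each additional condition adds another level of nesting to the ifelse version, which is the readability problem the formula notation avoids.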

Value

object of same form as x with recoded values

Format

An object of class numeric of length 1.

Details

Input conditions: possible values for the left-hand side (LHS) of a formula or an element of the from list:
  • vector/single value: all values in x which are equal to elements of the vector in LHS will be replaced with RHS.
  • function: values for which the function returns TRUE will be replaced with RHS. There are some special functions for convenience; see criteria.
  • logical vector/matrix/data.frame: values for which LHS is TRUE will be recoded. A logical vector will be recycled across all columns of x. If LHS is a matrix/data.frame, then each column of this matrix/data.frame will be used for the corresponding column/element of x.
  • . (dot): a dot in LHS/from means all other unrecoded values (ELSE in SPSS RECODE). All values not recoded by earlier conditions will be changed to the RHS of the formula or the appropriate element of to.

Output values: possible values for the right-hand side (RHS) of a formula or an element of the to list:

  • value: replaces elements of x. This value will be recycled across rows and columns of x.
  • vector: values of this vector will replace values in the corresponding positions in rows of x. The vector will be recycled across columns of x.
  • list/matrix/data.frame: each element of the list/column of the matrix/data.frame will be used as the replacement value for the corresponding column/element of x.
  • function: this function will be applied to the values of x which satisfy the recoding condition.
  • . (dot): a dot in RHS/to means copy the old value (COPY in SPSS RECODE). In most cases there is no need for this option because by default if_val doesn't modify values which don't satisfy any of the conditions.
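Two of the output forms above, a function and a per-column replacement, can be sketched in base R as follows. This only mirrors the described behaviour on small made-up data; it is not how expss implements it, and the variable names are chosen for this illustration.

```r
# Function as replacement: apply abs() only to the values that satisfy the condition.
a <- c(-1, -0.2, 0.7, -2)
hit <- a < -0.5
a_abs <- a
a_abs[hit] <- abs(a[hit])
a_abs   # 1.0 -0.2 0.7 2.0

# Per-column replacement: each column gets its own replacement value
# (here NA is replaced with the mean of the same column).
dfs <- data.frame(x1 = c(1, NA, 3), x2 = c(NA, 4, 6))
col_means <- colMeans(dfs, na.rm = TRUE)
filled <- dfs
for (j in seq_along(filled)) {
  filled[[j]][is.na(filled[[j]])] <- col_means[[j]]
}
filled   # x1: 1 2 3, x2: 5 4 6
```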

lo and hi are shortcuts for -Inf and Inf. They can be useful in expressions with %thru%, e.g. 1 %thru% hi.
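Why infinite bounds are convenient can be seen in a small base R sketch. in_range is a hypothetical stand-in written for this illustration, not the expss %thru% operator, and it assumes that both ends of the range are inclusive, as in SPSS THRU.

```r
# Hypothetical inclusive range check; NA is treated as "not in range".
in_range <- function(x, lower, upper) !is.na(x) & x >= lower & x <= upper

age <- c(5, 17, 18, 30, NA)
adult <- in_range(age, 18, Inf)    # analogous to 18 %thru% hi
minor <- in_range(age, -Inf, 17)   # analogous to lo %thru% 17
adult   # FALSE FALSE TRUE TRUE FALSE
minor   # TRUE TRUE FALSE FALSE FALSE
```

With an infinite bound, an open-ended range needs no special case: the same two comparisons cover it.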

Examples

# `ifs` examples
a = 1:5
b = 5:1
ifs(b>3 ~ 1)                     # c(1, 1, NA, NA, NA)
ifs(b>3 ~ 1, default = 3)          # c(1, 1, 3, 3, 3)
ifs(b>3 ~ 1, a>4 ~ 7, default = 3) # c(1, 1, 3, 3, 7)
ifs(b>3 ~ a, default = 42)         # c(1, 2, 42, 42, 42)
# some examples from SPSS manual
# RECODE V1 TO V3 (0=1) (1=0) (2, 3=-1) (9=9) (ELSE=SYSMIS)
set.seed(123)
v1  = sample(c(0:3, 9, 10), 20, replace = TRUE)
if_val(v1) = c(0 ~ 1, 1 ~ 0, 2:3 ~ -1, 9 ~ 9, . ~ NA)
v1

# RECODE QVAR(1 THRU 5=1)(6 THRU 10=2)(11 THRU HI=3)(ELSE=0).
set.seed(123)
qvar = sample((-5):20, 50, replace = TRUE)
if_val(qvar, 1 %thru% 5 ~ 1, 6 %thru% 10 ~ 2, 11 %thru% hi ~ 3, . ~ 0)
# the same result
if_val(qvar, 1 %thru% 5 ~ 1, 6 %thru% 10 ~ 2, gte(11) ~ 3, . ~ 0)

# RECODE STRNGVAR ('A', 'B', 'C'='A')('D', 'E', 'F'='B')(ELSE=' '). 
strngvar = LETTERS
if_val(strngvar, c('A', 'B', 'C') ~ 'A', c('D', 'E', 'F') ~ 'B', . ~ ' ')

# RECODE AGE (MISSING=9) (18 THRU HI=1) (0 THRU 18=0) INTO VOTER. 
set.seed(123)
age = sample(c(sample(5:30, 40, replace = TRUE), rep(9, 10)))
voter = if_val(age, NA ~ 9, 18 %thru% hi ~ 1, 0 %thru% 18 ~ 0)
voter

# example with function in RHS
set.seed(123)
a = rnorm(20)
# if a<(-0.5) we change it to absolute value of a (abs function)
if_val(a, lt(-0.5) ~ abs) 

# the same example with logical criteria
if_val(a, a<(-.5) ~ abs) 

# replace with specific value for each column
# we replace values greater than 0.75 with column max and values less than 0.25 with column min
# and NA with column means
# make data.frame
set.seed(123)
x1 = runif(30)
x2 = runif(30)
x3 = runif(30)
x1[sample(30, 10)] = NA # place 10 NA's
x2[sample(30, 10)] = NA # place 10 NA's
x3[sample(30, 10)] = NA # place 10 NA's
dfs = data.frame(x1, x2, x3)

#replacement. Note the necessary transpose operation
if_val(dfs, lt(0.25) ~ t(min_col(dfs)), gt(0.75) ~ t(max_col(dfs)), NA ~ t(mean_col(dfs)))

# replace NA with row means
# rows that contain only NA stay missing because mean_row for them is also NaN
if_val(dfs, NA ~ mean_row(dfs)) 

# some of the above examples with from/to notation

set.seed(123)
v1  = sample(c(0:3,9,10), 20, replace = TRUE)
# RECODE V1 TO V3 (0=1) (1=0) (2,3=-1) (9=9) (ELSE=SYSMIS)
fr = list(0, 1, 2:3, 9, ".")
to = list(1, 0, -1, 9, NA)
if_val(v1, from = fr) = to
v1

# RECODE QVAR(1 THRU 5=1)(6 THRU 10=2)(11 THRU HI=3)(ELSE=0).
fr = list(1 %thru% 5, 6 %thru% 10, gte(11), ".")
to = list(1, 2, 3, 0)
if_val(qvar, from = fr, to = to)

# RECODE STRNGVAR ('A','B','C'='A')('D','E','F'='B')(ELSE=' ').
fr = list(c('A','B','C'), c('D','E','F') , ".")
to = list("A", "B", " ")
if_val(strngvar, from = fr, to = to)

# RECODE AGE (MISSING=9) (18 THRU HI=1) (0 THRU 18=0) INTO VOTER.
fr = list(NA, 18 %thru% hi, 0 %thru% 18)
to = list(9, 1, 0)
voter = if_val(age, from = fr, to = to)
voter
