if_val changes, rearranges or consolidates the values of an existing variable based on conditions. The design of this function is inspired by RECODE from SPSS. The sequence of recodings is provided in the form of formulas. For example, 1:2 ~ 1 means that all 1s and 2s will be replaced with 1. Each value is recoded only once. Values that do not meet any condition remain unchanged. As a condition one can use plain values, or more sophisticated logical values and functions. There are several special functions for use as criteria; for details see criteria. Simple common usage looks like: if_val(x, 1:2 ~ -1, 3 ~ 0, 4:5 ~ 1, 99 ~ NA). For more information, see details and examples.
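The "each value is recoded only once" rule can be pictured with a small base R sketch. The helper recode_once below is hypothetical and written only for illustration; it is not part of any package and does not reproduce the formula interface, only the matching order described above.

```r
# Hypothetical helper (illustration only): applies (from, to) rules in order
# and recodes each element at most once, always matching on the ORIGINAL values.
recode_once <- function(x, rules) {
  out <- x
  done <- rep(FALSE, length(x))        # marks elements that were already recoded
  for (rule in rules) {
    hit <- !done & (x %in% rule$from)  # only elements not yet recoded can match
    out[hit] <- rule$to
    done <- done | hit
  }
  out
}

x <- c(1, 2, 3, 4, 5, 99)
result <- recode_once(x, list(
  list(from = 1:2, to = -1),
  list(from = 3,   to = 0),
  list(from = 4:5, to = 1),
  list(from = 99,  to = NA)
))
print(result)
# -1 -1  0  1  1 NA
```

Because matching is done against the original values and recoded elements are skipped, a value that becomes -1 in the first rule cannot be picked up again by a later rule.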
The ifs function checks whether one or more conditions are met and returns a value that corresponds to the first TRUE condition. ifs can take the place of multiple nested ifelse statements, and is much easier to read with multiple conditions. ifs works in the same manner as if_val, e.g. with formula or from/to notation, but its conditions must be logical and it does not operate on multicolumn objects.
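For comparison, here is the nested ifelse form that ifs is meant to replace, written in base R. It corresponds to the call ifs(b>3 ~ 1, a>4 ~ 7, default = 3) from the examples; the variable name nested is chosen here for illustration.

```r
a <- 1:5
b <- 5:1

# The first TRUE condition wins; 3 plays the role of the default value.
nested <- ifelse(b > 3, 1,
                 ifelse(a > 4, 7, 3))
print(nested)
# 1 1 3 3 7
```

Each additional condition adds one more level of nesting in this form, which is where the readability argument comes from.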
if_val(x, ..., from = NULL, to = NULL)
if_val(x, from = NULL) <- value
ifs(..., from = NULL, to = NULL, default = NA)
lo
hi
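The recoding formulas are used only when the from/to arguments are not provided.
value: the recoding formulas in the assignment form, or the to list if from/to notation is used.
if_val returns x with recoded values.
lo and hi are numeric of length 1.
Input conditions: possible values for left hand side (LHS) of formula or element of from list:
Values of x which are equal to elements of the vector in LHS will be replaced with RHS.
If LHS is a matrix/data.frame, then each column of this matrix/data.frame will be used for the corresponding column/element of x.
A dot in LHS or from means all other unrecoded values (ELSE in SPSS RECODE). So all other unrecoded values will be changed to the RHS of the formula or the appropriate element of to.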
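The ELSE behaviour can be sketched in base R as follows. This is only an illustration of the semantics of the SPSS command RECODE (0=1) (1=0) (2, 3=-1) (9=9) (ELSE=SYSMIS), not the package implementation; the input vector is made up for the example.

```r
v1 <- c(0, 1, 2, 3, 9, 10)

# Start from the ELSE value, then fill in the explicit rules.
# The rules do not overlap here, so the order of assignment does not matter.
out <- rep(NA_real_, length(v1))
out[v1 == 0]     <- 1
out[v1 == 1]     <- 0
out[v1 %in% 2:3] <- -1
out[v1 == 9]     <- 9
print(out)
#  1  0 -1 -1  9 NA
```

The value 10 matches none of the explicit rules, so it receives the ELSE value NA.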
Output values: possible values for right hand side (RHS) of formula or element of to list:
A single value will be recycled across rows and columns of x.
A vector will be recycled across columns of x.
A function will be applied to the values of x which satisfy the recoding condition.
A dot in RHS or to means copy the old value (COPY in SPSS RECODE). In most cases there is no need for this option because by default if_val doesn't modify values which don't satisfy any of the conditions.
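When the right hand side is a function, as in the abs example further down, the effect on a plain vector can be pictured with base R subsetting. This is a minimal sketch with made-up data, not the package code.

```r
a <- c(-1.2, -0.3, 0.4, -0.8)

hit <- a < -0.5          # elements that satisfy the recoding condition
a[hit] <- abs(a[hit])    # the function is applied only to those elements
print(a)
#  1.2 -0.3  0.4  0.8
```

Elements that do not satisfy the condition are left untouched, which matches the default behaviour described above.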
lo and hi are shortcuts for -Inf and Inf. They can be useful in expressions with %thru%, e.g. 1 %thru% hi.
# `ifs` examples
a = 1:5
b = 5:1
ifs(b>3 ~ 1) # c(1, 1, NA, NA, NA)
ifs(b>3 ~ 1, default = 3) # c(1, 1, 3, 3, 3)
ifs(b>3 ~ 1, a>4 ~ 7, default = 3) # c(1, 1, 3, 3, 7)
ifs(b>3 ~ a, default = 42) # c(1, 2, 42, 42, 42)
# some examples from SPSS manual
# RECODE V1 TO V3 (0=1) (1=0) (2, 3=-1) (9=9) (ELSE=SYSMIS)
set.seed(123)
v1 = sample(c(0:3, 9, 10), 20, replace = TRUE)
if_val(v1) = c(0 ~ 1, 1 ~ 0, 2:3 ~ -1, 9 ~ 9, . ~ NA)
v1
# RECODE QVAR(1 THRU 5=1)(6 THRU 10=2)(11 THRU HI=3)(ELSE=0).
set.seed(123)
qvar = sample((-5):20, 50, replace = TRUE)
if_val(qvar, 1 %thru% 5 ~ 1, 6 %thru% 10 ~ 2, 11 %thru% hi ~ 3, . ~ 0)
# the same result
if_val(qvar, 1 %thru% 5 ~ 1, 6 %thru% 10 ~ 2, gte(11) ~ 3, . ~ 0)
# RECODE STRNGVAR ('A', 'B', 'C'='A')('D', 'E', 'F'='B')(ELSE=' ').
strngvar = LETTERS
if_val(strngvar, c('A', 'B', 'C') ~ 'A', c('D', 'E', 'F') ~ 'B', . ~ ' ')
# RECODE AGE (MISSING=9) (18 THRU HI=1) (0 THRU 18=0) INTO VOTER.
set.seed(123)
age = sample(c(sample(5:30, 40, replace = TRUE), rep(9, 10)))
voter = if_val(age, NA ~ 9, 18 %thru% hi ~ 1, 0 %thru% 18 ~ 0)
voter
# example with function in RHS
set.seed(123)
a = rnorm(20)
# if a<(-0.5) we change it to absolute value of a (abs function)
if_val(a, lt(-0.5) ~ abs)
# the same example with logical criteria
if_val(a, a<(-.5) ~ abs)
# replace with specific value for each column
# we replace values greater than 0.75 with column max and values less than 0.25 with column min
# and NA with column means
# make data.frame
set.seed(123)
x1 = runif(30)
x2 = runif(30)
x3 = runif(30)
x1[sample(30, 10)] = NA # place 10 NA's
x2[sample(30, 10)] = NA # place 10 NA's
x3[sample(30, 10)] = NA # place 10 NA's
dfs = data.frame(x1, x2, x3)
# replacement. Note the necessary transpose operation
if_val(dfs, lt(0.25) ~ t(min_col(dfs)), gt(0.75) ~ t(max_col(dfs)), NA ~ t(mean_col(dfs)))
# replace NA with row means
# rows that contain only NA remain unchanged because mean_row is NaN for them as well
if_val(dfs, NA ~ mean_row(dfs))
# some of the above examples with from/to notation
set.seed(123)
v1 = sample(c(0:3,9,10), 20, replace = TRUE)
# RECODE V1 TO V3 (0=1) (1=0) (2,3=-1) (9=9) (ELSE=SYSMIS)
fr = list(0, 1, 2:3, 9, ".")
to = list(1, 0, -1, 9, NA)
if_val(v1, from = fr) = to
v1
# RECODE QVAR(1 THRU 5=1)(6 THRU 10=2)(11 THRU HI=3)(ELSE=0).
fr = list(1 %thru% 5, 6 %thru% 10, gte(11), ".")
to = list(1, 2, 3, 0)
if_val(qvar, from = fr, to = to)
# RECODE STRNGVAR ('A','B','C'='A')('D','E','F'='B')(ELSE=' ').
fr = list(c('A','B','C'), c('D','E','F') , ".")
to = list("A", "B", " ")
if_val(strngvar, from = fr, to = to)
# RECODE AGE (MISSING=9) (18 THRU HI=1) (0 THRU 18=0) INTO VOTER.
fr = list(NA, 18 %thru% hi, 0 %thru% 18)
to = list(9, 1, 0)
voter = if_val(age, from = fr, to = to)
voter