
recode
Change, rearrange or consolidate the values of an existing variable based on conditions. The design of this function is inspired by RECODE from SPSS. The sequence of recodings is provided in the form of formulas. For example, 1:2 ~ 1 means that all 1's and 2's will be replaced with 1. Each value will be recoded only once. In the assignment form recode(...) = ... of this function, values which don't meet any condition remain unchanged. In the usual form ... = recode(...), values which don't meet any condition will be replaced with NA. One can use values or more sophisticated logical conditions and functions as a condition. There are several special functions for use as criteria; for details see criteria. Simple common usage looks like: recode(x, 1:2 ~ -1, 3 ~ 0, 1:2 ~ 1, 99 ~ NA).
For more information, see details and examples.
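The difference between the two forms can be seen in a minimal sketch. It assumes the functions on this page come from the expss package (loaded below); the vectors x and y are illustrative only, and the expected results in the comments follow from the description above.

```r
library(expss)  # assumption: recode as documented here is provided by expss

x = c(1, 2, 3, 99)

# Usual form: 99 matches no condition, so it becomes NA.
usual = recode(x, 1:2 ~ 1, 3 ~ 0)
usual  # expected: 1 1 0 NA

# Assignment form: 99 matches no condition, so it is left unchanged.
y = x
recode(y) = c(1:2 ~ 1, 3 ~ 0)
y      # expected: 1 1 0 99
```

To keep unmatched values in the usual form as well, a final other ~ copy rule can be added, as described in the details.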
The ifs function checks whether one or more conditions are met and returns the value that corresponds to the first TRUE condition. ifs can take the place of multiple nested ifelse statements and is much easier to read with multiple conditions. ifs works in the same manner as recode, e.g. with formulas or with from/to notation. However, its conditions must be logical, and it doesn't operate on multicolumn objects.
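For comparison, here is a base R sketch of the nested ifelse chain that a single ifs call would replace. It reuses the a and b vectors from the examples on this page; the ifs call appears only in a comment, so the block runs without any package.

```r
a = 1:5
b = 5:1

# Nested ifelse: every additional condition is buried one level deeper.
nested = ifelse(b > 3, 1,
                ifelse(a > 4, 7, 3))
nested  # 1 1 3 3 7

# The flat equivalent with ifs, taken from the examples on this page:
# ifs(b > 3 ~ 1, a > 4 ~ 7, TRUE ~ 3)
```

The trailing TRUE ~ 3 rule plays the role of the innermost "else" value in the nested version.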
if_val(x, ..., from = NULL, to = NULL)
if_val(x, from = NULL) <- value
recode(x, from = NULL) <- value
recode(x, ..., from = NULL, to = NULL)
ifs(..., from = NULL, to = NULL)
lo
hi
copy(x)
values %into% names
x: vector/matrix/data.frame/list
...: sequence of formulas which describe recodings. They are used when the from/to arguments are not provided.
from: list of conditions for values which should be recoded (in the same format as the LHS of formulas).
to: list of values into which old values should be recoded (in the same format as the RHS of formulas).
value: list with formulas which describe recodings in the assignment form of the function, or the to list if from/to notation is used.
values: object(s) which will be assigned to names for the %into% operation. %into% supports multivalue assignments. See examples.
names: name(s) which will be given to the values expression. For %into%.
object of same form as x with recoded values
An object of class numeric of length 1.
Input conditions - possible values for the left-hand side (LHS) of a formula or an element of the from list:
vector/single value: All values in x which are equal to elements of the vector in the LHS will be replaced with the RHS.
function: Values for which the function gives TRUE will be replaced with the RHS. There are some special functions for convenience - see criteria. One of the special functions is other. It means all other unrecoded values (ELSE in SPSS RECODE). All other unrecoded values will be changed to the RHS of the formula or the appropriate element of to.
logical vector/matrix/data.frame: Values for which the LHS equals TRUE will be recoded. A logical vector will be recycled across all columns of x. If the LHS is a matrix/data.frame, then each column of this matrix/data.frame will be used for the corresponding column/element of x.
Output values - possible values for the right-hand side (RHS) of a formula or an element of the to list:
value: Replaces elements of x. This value will be recycled across rows and columns of x.
vector: Values of this vector will replace values in the corresponding positions in rows of x. The vector will be recycled across columns of x.
list/matrix/data.frame: Each element of the list/column of the matrix/data.frame will be used as a replacement value for the corresponding column/element of x.
function: This function will be applied to values of x which satisfy the recoding condition. There is a special auxiliary function copy which just returns its argument. So in recode it just copies the old value (COPY in SPSS RECODE). See examples. copy is useful in the usual form of recode and doesn't do anything in the assignment form recode() = ..., because this form doesn't modify values which don't satisfy any of the conditions.
%into% tries to mimic SPSS 'INTO'. Values from the left-hand side will be assigned to the right-hand side. You can use a %to% expression in the RHS of %into%. See examples.
lo and hi are shortcuts for -Inf and Inf. They can be useful in expressions with %thru%, e.g. 1 %thru% hi.
if_val is an alias for recode.
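As a small illustration of lo and hi as open-ended range bounds, here is a sketch that again assumes the expss package provides these functions. The vector x and the name sign_code are illustrative only.

```r
library(expss)  # assumption: lo, hi, %thru% and recode are provided by expss

x = c(-3, 0, 2, 50)

# lo %thru% 0 covers everything up to and including 0;
# 1 %thru% hi covers everything from 1 upwards.
sign_code = recode(x, lo %thru% 0 ~ 0, 1 %thru% hi ~ 1)
sign_code  # expected: 0 0 1 1
```

Using lo and hi keeps the ranges readable and close to the SPSS LO/HI keywords they mirror.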
# NOT RUN {
library(expss) # the functions used below are provided by the expss package
# `ifs` examples
a = 1:5
b = 5:1
ifs(b>3 ~ 1) # c(1, 1, NA, NA, NA)
ifs(b>3 ~ 1, TRUE ~ 3) # c(1, 1, 3, 3, 3)
ifs(b>3 ~ 1, a>4 ~ 7, TRUE ~ 3) # c(1, 1, 3, 3, 7)
ifs(b>3 ~ a, TRUE ~ 42) # c(1, 2, 42, 42, 42)
# some examples from SPSS manual
# RECODE V1 TO V3 (0=1) (1=0) (2, 3=-1) (9=9) (ELSE=SYSMIS)
set.seed(123)
v1 = sample(c(0:3, 9, 10), 20, replace = TRUE)
recode(v1) = c(0 ~ 1, 1 ~ 0, 2:3 ~ -1, 9 ~ 9, other ~ NA)
v1
# RECODE QVAR(1 THRU 5=1)(6 THRU 10=2)(11 THRU HI=3)(ELSE=0).
set.seed(123)
qvar = sample((-5):20, 50, replace = TRUE)
recode(qvar, 1 %thru% 5 ~ 1, 6 %thru% 10 ~ 2, 11 %thru% hi ~ 3, other ~ 0)
# the same result
recode(qvar, 1 %thru% 5 ~ 1, 6 %thru% 10 ~ 2, ge(11) ~ 3, other ~ 0)
# RECODE STRNGVAR ('A', 'B', 'C'='A')('D', 'E', 'F'='B')(ELSE=' ').
strngvar = LETTERS
recode(strngvar, c('A', 'B', 'C') ~ 'A', c('D', 'E', 'F') ~ 'B', other ~ ' ')
# RECODE AGE (MISSING=9) (18 THRU HI=1) (0 THRU 18=0) INTO VOTER.
set.seed(123)
age = sample(c(sample(5:30, 40, replace = TRUE), rep(9, 10)))
voter = recode(age, NA ~ 9, 18 %thru% hi ~ 1, 0 %thru% 18 ~ 0)
voter
# the same result with '%into%'
recode(age, NA ~ 9, 18 %thru% hi ~ 1, 0 %thru% 18 ~ 0) %into% voter2
voter2
# multiple assignment with '%into%'
set.seed(123)
x1 = runif(30)
x2 = runif(30)
x3 = runif(30)
# note the necessary brackets around the RHS of '%into%'
recode(x1 %to% x3, gt(0.5) ~ 1, other ~ 0) %into% (x_rec_1 %to% x_rec_3)
fre(x_rec_1)
# the same operation with characters expansion
i = 1:3
recode(x1 %to% x3, gt(0.5) ~ 1, other ~ 0) %into% subst('x_rec2_`i`')
fre(x_rec2_1)
# example with function in RHS
set.seed(123)
a = rnorm(20)
# if a<(-0.5) we change it to absolute value of a (abs function)
recode(a, lt(-0.5) ~ abs, other ~ copy)
# the same example with logical criteria
recode(a, a<(-.5) ~ abs, other ~ copy)
# replace with specific value for each column
# we replace values greater than 0.75 with column max and values less than 0.25 with column min
# and NA with column means
# make data.frame
set.seed(123)
x1 = runif(30)
x2 = runif(30)
x3 = runif(30)
x1[sample(30, 10)] = NA # place 10 NA's
x2[sample(30, 10)] = NA # place 10 NA's
x3[sample(30, 10)] = NA # place 10 NA's
dfs = data.frame(x1, x2, x3)
# replacement. Note the necessary transpose operation
recode(dfs,
       lt(0.25) ~ t(min_col(dfs)),
       gt(0.75) ~ t(max_col(dfs)),
       NA ~ t(mean_col(dfs)),
       other ~ copy
)
# replace NA with row means
# some rows which contain only NaN's remain unchanged because mean_row for them also is NaN
recode(dfs, NA ~ mean_row(dfs), other ~ copy)
# some of the above examples with from/to notation
set.seed(123)
v1 = sample(c(0:3,9,10), 20, replace = TRUE)
# RECODE V1 TO V3 (0=1) (1=0) (2,3=-1) (9=9) (ELSE=SYSMIS)
fr = list(0, 1, 2:3, 9, other)
to = list(1, 0, -1, 9, NA)
recode(v1, from = fr) = to
v1
# RECODE QVAR(1 THRU 5=1)(6 THRU 10=2)(11 THRU HI=3)(ELSE=0).
fr = list(1 %thru% 5, 6 %thru% 10, ge(11), other)
to = list(1, 2, 3, 0)
recode(qvar, from = fr, to = to)
# RECODE STRNGVAR ('A','B','C'='A')('D','E','F'='B')(ELSE=' ').
fr = list(c('A','B','C'), c('D','E','F') , other)
to = list("A", "B", " ")
recode(strngvar, from = fr, to = to)
# RECODE AGE (MISSING=9) (18 THRU HI=1) (0 THRU 18=0) INTO VOTER.
fr = list(NA, 18 %thru% hi, 0 %thru% 18)
to = list(9, 1, 0)
voter = recode(age, from = fr, to = to)
voter
# }