Replaces specific values in a vector or data frame with NA.
set_na(x, ..., value, drop.levels = TRUE, as.tag = FALSE)
...: Used when x is a data frame (and not a vector) and only selected variables from x should be processed. You may also use functions like : or dplyr's select_helpers. The latter must be stated as a formula (i.e. beginning with ~). See 'Examples' or the package vignette.

drop.levels: If TRUE, factor levels of values that have been replaced with NA are dropped. See 'Examples'.

as.tag: If TRUE, values in x will be replaced by tagged_na, else by usual NA values. Use a named vector to assign the value label to the tagged NA value (see 'Examples').

Returns x, with all elements of value replaced by NA.
If x is a data frame, the complete data frame x is returned, with NAs set for the variables specified in ...; if ... is not specified, the replacement applies to all variables in the data frame.
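The replacement behaviour described above can be sketched in base R. This is only an illustration of the documented semantics, not the package's implementation: `to_na` is a hypothetical helper name, and the selection of columns by name stands in for the variables passed via `...`.

```r
# Hypothetical helper (not part of any package): replace every element of
# `x` that matches `value` with NA, optionally dropping unused factor levels.
to_na <- function(x, value, drop.levels = TRUE) {
  x[x %in% value] <- NA
  if (is.factor(x) && drop.levels) x <- droplevels(x)
  x
}

# Vector input: every 1 and 8 becomes NA
v <- c(1, 3, 8, 5, 1)
to_na(v, value = c(1, 8))

# Factor input: the level "b" is dropped unless drop.levels = FALSE
f <- factor(c("a", "b", "c"))
levels(to_na(f, value = "b"))
levels(to_na(f, value = "b", drop.levels = FALSE))

# Data frame input: only the selected variables are processed,
# but the complete data frame is kept
df <- data.frame(var1 = c(1, 2, 3), var2 = c(3, 4, 5), var3 = c(3, 3, 6))
selected <- c("var1", "var3")
df[selected] <- lapply(df[selected], to_na, value = 3)
df
```

In the data frame case, `var2` still contains the value 3, because it was not among the selected variables.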
set_na() converts all values defined in value to NA or to tagged NA values (see tagged_na).
Tagged NAs work exactly like regular R missing values, except that they store one additional byte of information: a tag, which is usually a letter ("a" to "z") or a digit character ("0" to "9"). Furthermore, see 'Details' in get_na.
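To make the idea of a tag concrete, here is a small sketch using the haven package directly, independent of set_na(). It assumes haven is installed; the vector `x` is made up for illustration.

```r
library(haven)  # assumption: the haven package is installed

# Two tagged missing values ("a" and "z") next to a regular NA
x <- c(1, 2, tagged_na("a"), tagged_na("z"), NA)

is.na(x)            # tagged NAs count as missing, like a regular NA
is_tagged_na(x)     # TRUE only for the two tagged values
na_tag(x)           # the stored tags; NA for all untagged elements
print_tagged_na(x)  # prints the vector with the tags made visible
```

Because `is.na()` treats tagged and regular missing values alike, code that only checks for missingness does not need to know about the tags.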
See also replace_na to replace NAs with specific values, rec for general recoding of variables, and recode_to for re-shifting value ranges. See get_na to get the values of missing values in labelled vectors.
# create random variable
dummy <- sample(1:8, 100, replace = TRUE)
# show value distribution
table(dummy)
# set values 1 and 8 to missing
dummy <- set_na(dummy, value = c(1, 8))
# show value distribution, including missings
table(dummy, useNA = "always")
# add named vector as further missing value
set_na(dummy, value = c("Refused" = 5), as.tag = TRUE)
# see different missing types
library(haven)
print_tagged_na(set_na(dummy, value = c("Refused" = 5), as.tag = TRUE))
# create sample data frame
dummy <- data.frame(
  var1 = sample(1:8, 100, replace = TRUE),
  var2 = sample(1:10, 100, replace = TRUE),
  var3 = sample(1:6, 100, replace = TRUE)
)
# set values 2 and 4 to missing
library(dplyr)
dummy %>% set_na(value = c(2, 4)) %>% head()
dummy %>% set_na(value = c(2, 4), as.tag = TRUE) %>% get_na()
dummy %>% set_na(value = c(2, 4), as.tag = TRUE) %>% get_values()
data(efc)
dummy <- data.frame(
  var1 = efc$c82cop1,
  var2 = efc$c83cop2,
  var3 = efc$c84cop3
)
# check original distribution of categories
lapply(dummy, table, useNA = "always")
# set 3 to NA for two variables
lapply(set_na(dummy, var1, var3, value = 3), table, useNA = "always")
# drop unused factor levels when values are set to NA
x <- factor(c("a", "b", "c"))
x
set_na(x, value = "b", as.tag = TRUE)
set_na(x, value = "b", drop.levels = FALSE, as.tag = TRUE)
# set_na() can also set values to NA by specifying the value label
# of the value that should be replaced with NA. This is particularly
# helpful if a certain category should be set to NA, but this category
# is assigned different values across variables
x1 <- sample(1:4, 20, replace = TRUE)
x2 <- sample(1:7, 20, replace = TRUE)
x1 <- set_labels(x1, labels = c("Refused" = 3, "No answer" = 4))
x2 <- set_labels(x2, labels = c("Refused" = 6, "No answer" = 7))
tmp <- data.frame(x1, x2)
get_labels(tmp)
get_labels(set_na(tmp, value = "No answer"))
get_labels(set_na(tmp, value = c("Refused", "No answer")))
# show values
tmp
set_na(tmp, value = c("Refused", "No answer"))