This is a workhorse function used by impute_ndd,
impute_qty and others.
impute(
data,
variable,
method = c("ignore", "mean", "median", "mode", "replace", "min", "max", "sum"),
where = is.na,
group,
...,
replace_with = NA_real_
)
data: A data frame containing columns prodcode, pracid, patid
variable: Unquoted name of the column in data to be imputed
method: Method for imputing the values. See details.
where: Logical vector, or function applied to variable returning such a vector, indicating which elements to impute. Defaults to is.na
group: Level of structure for imputation. Defaults to whole study population.
...: Extra arguments, currently ignored
replace_with: If the method 'replace' is selected, the value to insert. Defaults to NA_real_
ignore. Do nothing, leaving input unchanged.
mean. Replace values with the mean by group.
median. Replace values with the median by group.
mode. Replace values with the most common value by group.
replace. Replace values with replace_with, which defaults to NA (i.e. mark as missing)
min. Replace with minimum value.
max. Replace with maximum value.
sum. Replace with sum of values.
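As a rough illustration of what group-wise replacement means for methods such as mean and median, the following base R sketch fills flagged values with a per-group summary. It is not the package's implementation: the helper impute_by_group, the toy data and the qty column are hypothetical, and it assumes the summary is computed from the unflagged values only.

```r
# Hypothetical toy data; qty is an illustrative column, not a required one.
toy <- data.frame(
  patid = c(1, 1, 1, 2, 2),
  qty   = c(10, NA, 20, 5, NA)
)

# Illustrative helper (not part of the package): replace flagged values
# with a per-group summary of the values that are not flagged.
impute_by_group <- function(x, group, flag = is.na(x), fun = mean) {
  # Blank out flagged entries so they do not contribute to the summary.
  fill <- ave(replace(x, flag, NA), group,
              FUN = function(v) fun(v, na.rm = TRUE))
  x[flag] <- fill[flag]
  x
}

by_mean   <- impute_by_group(toy$qty, toy$patid, fun = mean)
by_median <- impute_by_group(toy$qty, toy$patid, fun = median)
print(by_mean)
```

Passing a constant vector as group corresponds to treating the whole study population as a single group.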
A data frame of the same structure as data, with values imputed
The argument where indicates which values are to be imputed.
It can be specified either as a logical vector or as a function. For example,
you can pass is.na to impute all missing values, or
you can pass a logical vector if the choice depends on something other than
the current values of the variable to be imputed.
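The two forms of where can be reduced to a single logical vector before any replacement happens. The sketch below shows one way this might be done; resolve_where and the qty vector are hypothetical names used for illustration and are not taken from the package.

```r
# Illustrative helper (not part of the package): turn `where` into a
# logical vector with one entry per element of x.
resolve_where <- function(where, x) {
  if (is.function(where)) where <- where(x)
  stopifnot(is.logical(where), length(where) == length(x))
  where
}

qty <- c(10, NA, 2000, 5)

# Function form: flag every missing value.
missing_flag <- resolve_where(is.na, qty)

# Vector form: flag values judged implausible by some external rule
# (the threshold of 1000 is an arbitrary example).
implausible_flag <- resolve_where(qty > 1000 & !is.na(qty), qty)

print(missing_flag)
print(implausible_flag)
```

Computing both flags from the original values, before anything is replaced, keeps the two imputation steps from influencing each other.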
This design may change in future. In particular, if we want to impute
implausible values and impute missing values separately, it's important that
these steps are independent.