The scoped helper verbs documented here apply a targeted transformation to specified columns. They also rename the resulting output to conform to my preferred style. There are several of them, each explained in the details section below.
center_at(data, x, prefix = "c", na = TRUE, .by = NULL)
diff_at(data, x, o = 1, prefix = "d", .by = NULL)
group_mean_center_at(
  data,
  x,
  mean_prefix = "mean",
  prefix = "b",
  na = TRUE,
  .by
)
lag_at(data, x, prefix = "l", o = 1, .by = NULL)
log_at(data, x, prefix = "ln", plus_1 = FALSE)
mean_at(data, x, prefix = "mean", na = TRUE, .by = NULL)
r1sd_at(data, x, prefix = "s", na = TRUE, .by = NULL)
r2sd_at(data, x, prefix = "z", na = TRUE, .by = NULL)
Each function returns a set of new vectors in a data frame after performing the relevant transformation. The new vectors have distinct prefixes corresponding to the action performed on them.
data: a data frame
x: a vector, likely of columns in your data frame (the examples pass a character vector of column names)
prefix: the prefix to give the new variables. Each function has its own default (see details section).
na: a logical about whether missing values should be ignored in the creation of means and re-scaled variables. Defaults to TRUE (i.e. pass over/remove missing observations). Not applicable to diff_at, lag_at, and log_at.
.by: a selection of columns by which to group the operation. Defaults to NULL. This will eventually become a standard feature of the functions as this operator moves beyond the experimental in dplyr. The argument is not applicable to log_at (why would it be?) and is optional for all functions except group_mean_center_at, which must have something specified for grouped mean-centering.
o: the order of lags for calculating differences or lags in diff_at or lag_at. Applicable only to these functions.
mean_prefix: applicable only to group_mean_center_at. Specifies the prefix of the (assumed) total population mean variables. Default is "mean", though the user can change this as they see fit.
plus_1: applicable only to log_at. If TRUE, adds 1 to the variables prior to log transformation. If FALSE, performs logarithmic transformation on variables no matter whether 0 occurs (i.e. 0s will come back as -Inf). Defaults to FALSE.
center_at is a wrapper for mutate_at and rename_at from dplyr. It takes supplied vectors and centers them on their mean. It then renames these new variables to have a prefix of c_. The default prefix ("c") can be changed by way of an argument in the function.
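As a rough illustration of what this centering amounts to, here is a minimal base R sketch. It is not the package's implementation (which, per the description above, goes through dplyr); the toy data frame and the names df and vars are hypothetical.

```r
# Hypothetical toy data; the column names are illustrative only.
df <- data.frame(x = c(1, 2, 3, NA), y = c(10, 20, 30, 40))
vars <- c("x", "y")

# Subtract each column's mean, ignoring missing values (mirrors na = TRUE).
centered <- lapply(df[vars], function(v) v - mean(v, na.rm = TRUE))

# Apply the "c_" prefix described above and attach the new columns.
names(centered) <- paste0("c_", vars)
df[names(centered)] <- centered
df
```

Missing values stay missing in the centered column; they are only skipped when the mean itself is computed.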
diff_at is a wrapper for mutate and across from dplyr. It takes supplied vectors and creates differences from the previous value recorded above it. It then renames these new variables to have a prefix of d_ (in the case of a first difference), or something like d2_ in the case of second differences, or d3_ in the case of third differences (and so on). The exact prefix depends on the o argument, which communicates the order of lags you want. It defaults to 1. The default prefix ("d") can be changed by way of an argument in the function, though the naming convention will omit a numerical prefix for first differences.
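The differencing and naming convention can be sketched in base R as follows. This is an illustration under a stated assumption, not the function's actual code: it assumes that o is the distance of the lag being subtracted (x[t] - x[t - o]) rather than a repeated difference of differences. The vector x and the name new_name are hypothetical.

```r
# Hypothetical toy series, assumed to be already sorted in time order.
x <- c(5, 7, 10, 14, 19)
o <- 2  # order of the lag to difference against

# Assumption: the difference is x[t] - x[t - o], not a difference of differences.
lagged <- c(rep(NA, o), head(x, -o))
d <- x - lagged

# Naming convention described above: "d_" for o = 1, otherwise "d2_", "d3_", ...
new_name <- paste0("d", if (o == 1) "" else o, "_x")

d
new_name
```

The first o entries are NA because there is no earlier value to subtract.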
group_mean_center_at is a wrapper for mutate and across in dplyr. It takes supplied vectors and centers an (assumed) group mean of the variables from an (assumed) total population mean of the variables provided to it. It then returns the new variables with a prefix, whose default is b_. This prefix communicates, if you will, a kind of "between" variable in the panel model context, in juxtaposition to "within" variables.
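A small base R sketch may help make the "between" idea concrete. It assumes the new variable is the group mean minus the overall mean of the column; the function's own handling of the intermediate mean variables may differ. The data frame, the grouping column g, and the w_x column are hypothetical, and w_x is shown only for contrast (the function is not described as producing it).

```r
# Hypothetical balanced panel: two groups, three observations each.
df <- data.frame(g = rep(c("A", "B"), each = 3),
                 x = c(1, 2, 3, 7, 8, 9))

# Group mean of x, repeated for every row of the group.
group_mean <- ave(df$x, df$g, FUN = function(v) mean(v, na.rm = TRUE))

# Assumed "between" variable: group mean centered on the overall mean.
df$b_x <- group_mean - mean(df$x, na.rm = TRUE)

# Hypothetical "within" counterpart, for contrast only.
df$w_x <- df$x - group_mean
df
```

In this toy example b_x is constant within each group, while w_x varies only within groups.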
lag_at is a wrapper for mutate and across from dplyr. It takes supplied vector(s) and creates lag variables from them. These new variables have a prefix of l[o]_ where o corresponds to the order of the lag (specified by an argument in the function, which defaults to 1). This default prefix ("l") can be changed by way of another argument in the function.
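To show how grouping interacts with lagging, here is a hedged base R sketch of a within-group lag. It stands in for what a grouped call might compute and does not reproduce the dplyr-based implementation; the data frame, the grouping column g, and the helper lag_vec are hypothetical.

```r
# Hypothetical toy panel with a grouping column g.
df <- data.frame(g = c("A", "A", "A", "B", "B", "B"),
                 x = c(1, 2, 3, 10, 20, 30))
o <- 1  # order of the lag

# Hypothetical helper: shift a vector down by k positions, padding with NA.
lag_vec <- function(v, k) c(rep(NA, k), head(v, -k))[seq_along(v)]

# Lag within each group so values never leak across group boundaries.
df[[paste0("l", o, "_x")]] <- ave(df$x, df$g, FUN = function(v) lag_vec(v, o))
df
```

Without the grouping step, the first row of group B would inherit the last value of group A.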
log_at is a wrapper for mutate and across from dplyr. It takes supplied vectors and creates a variable that takes a natural logarithmic transformation of them. It then renames these new variables to have a prefix of ln_. This default prefix ("ln") can be changed by way of an argument in the function. Users can optionally specify that they want to add 1 to the vector before taking its natural logarithm, which is a popular thing to do when positive reals have naturally occurring zeroes.
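The effect of adding 1 before logging is easy to see on a vector that contains a zero. The base R snippet below is only an illustration of the arithmetic; the vector x and the result names are hypothetical.

```r
# Hypothetical vector with a naturally occurring zero.
x <- c(0, 1, 9)

# Plain natural log: the zero comes back as -Inf (the plus_1 = FALSE case).
ln_x <- log(x)

# Add 1 first: the zero maps to 0 instead (the plus_1 = TRUE case).
ln_x_plus_1 <- log(x + 1)

ln_x
ln_x_plus_1
```

Base R's log1p(x) computes the same quantity as log(x + 1).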
mean_at is a wrapper for mutate and across from dplyr. It takes supplied vectors and creates a variable communicating the mean of the variable. It then renames these new variables to have a prefix of mean_. This default prefix ("mean") can be changed by way of an argument in the function.
r1sd_at is a wrapper for mutate and across from dplyr. It both rescales the supplied vectors to new vectors and renames the vectors to each have a prefix of s_. Note the rescaling here is just by one standard deviation and not two. The default prefix ("s") can be changed by way of an argument in the function.
r2sd_at is a wrapper for mutate and across from dplyr. It both rescales the supplied vectors to new vectors and renames the vectors to each have a prefix of z_. Note the rescaling here is by two standard deviations and not one. The default prefix ("z") can be changed by way of an argument in the function.
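The difference between the two rescalings can be sketched in base R. The sketch assumes that "rescaling" means subtracting the mean and then dividing by one or by two standard deviations; that reading is an assumption, not something the text above spells out. The vector x and the result names are hypothetical.

```r
# Hypothetical vector with one missing value.
x <- c(2, 4, 6, 8, NA)

# Mean and standard deviation, ignoring missing values (mirrors na = TRUE).
m <- mean(x, na.rm = TRUE)
s <- sd(x, na.rm = TRUE)

# Assumed one-standard-deviation rescaling (the "s_" variables).
s_x <- (x - m) / s

# Assumed two-standard-deviation rescaling (the "z_" variables).
z_x <- (x - m) / (2 * s)

sd(s_x, na.rm = TRUE)
sd(z_x, na.rm = TRUE)
```

Under this assumption the first result has a standard deviation of 1 and the second a standard deviation of 0.5.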
All functions, except for lag_at, will fail in the absence of a character vector of a length of one. They are intended to work across multiple columns instead of just one. If you are wanting to create one new variable, you should think about using some other dplyr verb on its own.
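library(stevemisc)  # assumed to be the package that provides these helpers

set.seed(8675309)

# Toy data: one grouping column and three numeric columns.
Example <- data.frame(category = c(rep("A", 5),
                                   rep("B", 5),
                                   rep("C", 5)),
                      x = runif(15), y = runif(15),
                      z = sample(1:20, 15, replace = TRUE))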
my_vars <- c("x", "y", "z")
center_at(Example, my_vars)
diff_at(Example, my_vars)
diff_at(Example, my_vars, o=3)
lag_at(Example, my_vars)
lag_at(Example, my_vars, o=3)
log_at(Example, my_vars)
log_at(Example, my_vars, plus_1 = TRUE)
mean_at(Example, my_vars)
r1sd_at(Example, my_vars)
r2sd_at(Example, my_vars)