This function computes lagged values of variables by a specified number of observations. By default, the function returns lag-1 values of the vector or data frame specified in the first argument.
lagged(data, ..., id = NULL, obs = NULL, day = NULL, lag = 1, time = NULL,
units = c("secs", "mins", "hours", "days", "weeks"), append = TRUE,
name = ".lag", name.td = ".td", as.na = NULL, check = TRUE)
Returns a numeric vector or data frame with the same length or same number of rows as data containing the lagged variable(s).
data
a numeric vector for computing lagged values of a single variable, or a data frame for computing lagged values of more than one variable. Note that the subject ID variable (id), observation number variable (obs), day number variable (day), and the date and time variable (time) are excluded from data when specifying these arguments.
...
an expression indicating the variable names in data. Note that the operators ., +, -, ~, :, ::, and ! can also be used to select variables, see 'Details' in the df.subset function.
id
either a character string indicating the variable name of the subject ID variable in data or a vector representing the subject IDs, see 'Details'.
obs
either a character string indicating the variable name of the observation number variable in data or a vector representing the observations. Note that duplicated values within the same subject ID are not allowed, see 'Details'.
day
either a character string indicating the variable name of the day number variable in data or a vector representing the days, see 'Details'.
lag
a numeric value specifying the lag, e.g., lag = 1 (default) returns lag-1 values.
time
a variable of class POSIXct or POSIXlt representing the date and time of the observation, used to compute time differences between observations.
units
a character string indicating the units in which the time difference is represented, i.e., "secs" for seconds, "mins" (default) for minutes, "hours" for hours, "days" for days, and "weeks" for weeks.
append
logical: if TRUE (default), lagged variable(s) are appended to the data frame specified in the argument data.
name
a character string or character vector indicating the names of the lagged variables. By default, lagged variables are named with the ending ".lag", resulting in e.g. "x1.lag" and "x2.lag" when specifying two variables. Variable names can also be specified using a character vector matching the number of variables (e.g., name = c("lag.x1", "lag.x2")).
name.td
a character string or character vector indicating the names of the time difference variables when specifying a date and time variable for the argument time. By default, time difference variables are named with the ending ".td", resulting in e.g. "x1.td" and "x2.td" when specifying two variables. Variable names can also be specified using a character vector matching the number of variables specified (e.g., name.td = c("td.x1", "td.x2")).
as.na
a numeric vector indicating user-defined missing values, i.e., these values are converted to NA before conducting the analysis. Note that the as.na() function is only applied to the argument data, but not to cluster.
check
logical: if TRUE (default), argument specification is checked.
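The time difference described for the time and units arguments can be sketched in base R with difftime(). This is a minimal illustration under stated assumptions, not the package's implementation: the helper time_diff() is a hypothetical name, and the time stamps are taken from the first three rows of the example data.

```r
# Hypothetical helper (not part of the package): time elapsed since the
# observation 'lag' positions earlier, expressed in the requested units.
time_diff <- function(time, lag = 1, units = "mins") {
  n <- length(time)
  out <- rep(NA_real_, n)
  if (n > lag) {
    idx <- seq_len(n - lag)
    # difftime() accepts "secs", "mins", "hours", "days", and "weeks"
    out[idx + lag] <- as.numeric(difftime(time[idx + lag], time[idx],
                                          units = units))
  }
  out
}

time <- as.POSIXct(c("2024-01-01 09:01:00", "2024-01-01 12:05:00",
                     "2024-01-01 15:14:00"), tz = "UTC")

td_mins  <- time_diff(time, units = "mins")   # NA, 184, 189
td_hours <- time_diff(time, units = "hours")  # same differences in hours
```

The first element is NA because no earlier observation exists to compare against.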
Takuya Yanagida takuya.yanagida@univie.ac.at
id
If the id argument is not specified (i.e., id = NULL), all observations are assumed to come from the same subject. If the dataset includes multiple subjects, this variable needs to be specified so that observations are not lagged across subjects.
day
If the day argument is not specified (i.e., day = NULL), values of the variable to be lagged are allowed to be lagged across days in case there are multiple observation days.
obs
If the obs argument is not specified (i.e., obs = NULL), consecutive observations from the same subject are assumed to be one lag apart.
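The effect of the three grouping arguments can be illustrated with a base R sketch. It is an assumption-laden illustration of the behavior described above, not the package's own code: shift() is a hypothetical helper, the vectors mirror the example data, and the observation-number step ignores days for simplicity.

```r
# Hypothetical helper (not the package's implementation): shift a vector
# down by 'lag' positions and pad the start with NA.
shift <- function(x, lag = 1) {
  n <- length(x)
  c(rep(NA, min(lag, n)), x[seq_len(max(n - lag, 0))])
}

subject <- rep(1:2, each = 6)
day     <- rep(rep(1:2, each = 3), times = 2)
obs     <- rep(1:6, times = 2)
pos     <- c(6, 7, 5, 8, NA, 7, 4, NA, 5, 4, 5, 3)

# No grouping: the last value of subject 1 becomes the lag of subject 2
lag_all <- shift(pos)

# Grouping by subject: values are not lagged across subjects
lag_id <- ave(pos, subject, FUN = shift)

# Grouping by subject and day: values are not lagged across days either
lag_day <- ave(pos, subject, day, FUN = shift)

# Rows with missing 'pos' removed: consecutive rows are no longer
# guaranteed to be one observation apart
keep <- !is.na(pos)
s <- subject[keep]; o <- obs[keep]; p <- pos[keep]

# Without observation numbers, the previous row is used regardless of gaps
lag_rows <- ave(p, s, FUN = shift)

# With observation numbers, the lagged value is taken only from the row of
# the same subject whose observation number is exactly one lower
lag_obs <- p[match(paste(s, o - 1), paste(s, o))]
```

Comparing lag_rows with lag_obs shows the difference: for subject 1, observation 6 receives the value of observation 4 in lag_rows, but NA in lag_obs because observation 5 was removed.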
Viechtbauer W, Constantin M (2023). esmpack: Functions that facilitate preparation and management of ESM/EMA data. R package version 0.1-20.
center, rec, coding, item.reverse
dat <- data.frame(subject = rep(1:2, each = 6),
day = rep(1:2, each = 3),
obs = rep(1:6, times = 2),
time = as.POSIXct(c("2024-01-01 09:01:00", "2024-01-01 12:05:00",
"2024-01-01 15:14:00", "2024-01-02 09:03:00",
"2024-01-02 12:21:00", "2024-01-02 15:03:00",
"2024-01-01 09:02:00", "2024-01-01 12:09:00",
"2024-01-01 15:06:00", "2024-01-02 09:02:00",
"2024-01-02 12:15:00", "2024-01-02 15:06:00")),
pos = c(6, 7, 5, 8, NA, 7, 4, NA, 5, 4, 5, 3),
neg = c(2, 3, 2, 5, 3, 4, 6, 4, 6, 4, NA, 8))
# Example 1a: Lagged variable for 'pos'
lagged(dat$pos, id = dat$subject, day = dat$day)
# Example 1b: Alternative specification without using the '...' argument
lagged(dat[, c("pos", "subject", "day")], id = "subject", day = "day")
# Example 1c: Alternative specification using the 'data' argument
lagged(pos, data = dat, id = "subject", day = "day")
# Example 2a: Lagged variable for 'pos' and 'neg'
lagged(dat[, c("pos", "neg")], id = dat$subject, day = dat$day)
# Example 2b: Alternative specification using the 'data' argument
lagged(pos, neg, data = dat, id = "subject", day = "day")
# Example 3: Lag-2 variables for 'pos' and 'neg'
lagged(pos, neg, data = dat, id = "subject", day = "day", lag = 2)
# Example 4: Lagged variable and time difference variable
lagged(pos, neg, data = dat, id = "subject", day = "day", time = "time")
# Example 5: Lagged variables and time difference variables,
# name variables
lagged(pos, neg, data = dat, id = "subject", day = "day", time = "time",
name = c("p.lag1", "n.lag1"), name.td = c("p.diff", "n.diff"))
# Example 6: NA observations excluded from the data frame
dat.excl <- dat[!is.na(dat$pos), ]
# Observation numbers not taken into account, i.e.,
# - observation 4 used as lagged value for observation 6 for subject 1
# - observation 1 used as lagged value for observation 3 for subject 2
lagged(pos, data = dat.excl, id = "subject", day = "day")
# Observation numbers taken into account by specifying the 'obs' argument
lagged(pos, data = dat.excl, id = "subject", day = "day", obs = "obs")