Extracts a state from a dataset and provides the start and end times of each state instance, as well as its duration and epoch. The state does not have to exist in the dataset; it can be created dynamically. Group-dropping can be disabled for extracted states, so that summaries based on them also show empty groups.
extract_states(
  data,
  State.colname,
  State.expression = NULL,
  Datetime.colname = Datetime,
  handle.gaps = FALSE,
  epoch = "dominant.epoch",
  drop.empty.groups = TRUE,
  group.by.state = TRUE
)
A dataframe with one row per state instance. Each row consists of the original dataset grouping, the state column, a state.count column, the start and end Datetimes, and the duration of the state.
A light logger dataset. Expects a dataframe.
The variable or condition to be evaluated for state extraction. Expects a symbol. If it is not part of the data, a State.expression is required.
If State.colname is not part of the data, this expression will be evaluated to generate the state. The result of this expression will be used for grouping, so it is recommended to be factor-like. If State.colname is part of the data, this argument will be ignored.
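A minimal sketch of creating a state dynamically, for illustration only. It assumes that the bundled sample data contain an illuminance column named MEDI; the state name Daylight and the threshold of 1000 are hypothetical choices, not defaults of the function.

```r
library(LightLogR)

# Assumption: `MEDI` is a column of sample.data.environment.
# `Daylight` is not part of the data, so the expression `MEDI > 1000`
# is evaluated to create it (hypothetical name and threshold).
states <-
  sample.data.environment |>
  extract_states(Daylight, MEDI > 1000)

states |> head(2)
```

Because the expression returns a logical, the resulting state has only two values, which keeps the grouping small.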
Column name that contains the datetime. Defaults to "Datetime" which is automatically correct for data imported with LightLogR. Expects a symbol.
Logical. Should the data be treated with gap_handler()? Set to FALSE by default, due to computational costs.
The epoch to use for the gapless sequence. Can be either a lubridate::duration() or a string. If it is a string, it needs to be either "dominant.epoch" (the default) for a guess based on the data, or a valid lubridate::duration() string, e.g., "1 day" or "10 sec".
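As a small illustration of the two accepted forms, the following sketch uses only lubridate to check that a duration string and a duration object describe the same epoch. The variable names are hypothetical.

```r
library(lubridate)

# The same 10-second epoch, once as a Duration object and once as a string
epoch_object <- duration(10, units = "seconds")
epoch_string <- "10 sec"

# Both resolve to the same number of seconds
same_epoch <- as.numeric(duration(epoch_string)) == as.numeric(epoch_object)
same_epoch
```

Either form could then be supplied as the epoch argument; the string form is shorter, while the object form can be computed from other values.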
Logical. Should empty groups be dropped? Only works if .drop = FALSE has not been used with the current grouping prior to calling the function. Defaults to TRUE. If set to FALSE, this can lead to an error if factors are present in the grouping that have more levels than actual data. It can, however, be useful and necessary when summarizing the groups further, e.g., through summarize_numeric(), since having an empty group present is important when averaging numbers.
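To show why empty groups matter when averaging, here is a toy sketch that uses dplyr directly rather than extract_states(). The data and names are hypothetical; the factor has a level "c" without any observations.

```r
library(dplyr)

# Hypothetical toy data: level "c" exists in the factor but has no rows
toy <- tibble(
  group = factor(c("a", "a", "b"), levels = c("a", "b", "c")),
  value = c(1, 2, 3)
)

# Default grouping drops the empty level "c"
dropped <- toy |> group_by(group) |> summarize(n = n())

# .drop = FALSE keeps "c" as a group with a count of zero
kept <- toy |> group_by(group, .drop = FALSE) |> summarize(n = n())

# The average count per group differs depending on whether "c" is present
mean(dropped$n)
mean(kept$n)
```

With the empty group retained, the zero count enters the average; without it, the average is taken over the non-empty groups only.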
Logical. Should the output automatically be grouped by the new state?
library(LightLogR)

# summarizing states "photoperiod"
states <-
  sample.data.environment |>
  add_photoperiod(c(48.52, 9.06)) |>
  extract_states(photoperiod.state)
states |> head(2)
states |> tail(2)
states |> summarize_numeric(c("state.count", "epoch"))