padr (version 0.1.0)

thicken: Add a variable of a higher interval to a data frame.

Description

thicken takes the datetime variable in a data frame and maps it to a variable of a higher interval, for example from hours to days. The mapping is added to the data frame as a new variable. After applying thicken, the user can aggregate the other variables in the data frame to the higher interval, for instance using dplyr.

Usage

thicken(x, interval = c("level_up", "year", "quarter", "month", "week", "day", "hour", "min"), colname = NULL, rounding = c("down", "up"), by = NULL, start_val = NULL)

Arguments

x
A data frame containing at least one datetime variable of class Date, class POSIXct or class POSIXlt.
interval
The interval of the added datetime variable, which should be higher than the interval of the input datetime variable. With the default, "level_up", it will be one level higher than the interval of the input datetime variable.
colname
The column name of the added variable. If NULL it will be the name of the original datetime variable with the interval name added to it, separated by an underscore.
rounding
Whether a value in the input datetime variable is mapped to the closest value that is lower ("down") or the closest value that is higher ("up") than itself.
by
Only needs to be specified when x contains multiple variables of class Date, class POSIXct or class POSIXlt. by indicates which to use.
start_val
The value at which the range of the added variable starts. By default this is the first instance of interval that is lower than the lowest value of the input datetime variable, with all time units at their default value. Specify start_val as an offset if you want the range to be nonstandard.
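
The difference between rounding down and rounding up can be illustrated with a small base-R sketch. This is not padr's implementation; it only mimics, for a day interval, the mapping the rounding argument describes. The variable names day_down and day_up are made up for this illustration, and the timestamps deliberately avoid values that fall exactly on a day boundary, since how thicken treats those is not covered here.

```r
# Three timestamps at sub-day resolution (UTC to sidestep DST effects).
x <- as.POSIXct(
  c("2016-03-02 01:00:00", "2016-03-02 13:30:00", "2016-03-03 08:15:00"),
  tz = "UTC"
)

# "down": the closest day value that is lower, i.e. the start of the same day.
day_down <- as.POSIXct(trunc(x, units = "days"))

# "up": the closest day value that is higher, i.e. the start of the next day.
# Adding 86400 seconds is safe here because the timestamps are in UTC.
day_up <- day_down + 24 * 60 * 60

data.frame(x = x, day_down = day_down, day_up = day_up)
```

With rounding down the first two timestamps map to 2 March, with rounding up they map to 3 March.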

Value

The data frame x with the variable added to it.

Details

See vignette("padr") for more information on thicken. See vignette("padr_implementation") for detailed information on daylight saving time, different time zones, and the implementation of thicken.

Examples

x_hour <- seq(lubridate::ymd_hms('20160302 000000'), by = 'hour',
              length.out = 200)
some_df <- data.frame(x_hour = x_hour)
thicken(some_df)
thicken(some_df, 'month')
thicken(some_df, start_val = lubridate::ymd_hms('20160301 120000'))

library(dplyr)
x_df <- data.frame(
  x = seq(lubridate::ymd(20130101), by = 'day', length.out = 1000) %>%
    sample(500),
  y = runif(500, 10, 50) %>% round) %>%
  arrange(x)

# get the max per month
x_df %>% thicken('month') %>% group_by(x_month) %>%
  summarise(y_max = max(y))

# get the average per week, with weeks starting on Monday instead of Sunday;
# check the weekday of the first observation before choosing the offset
min_x <- x_df$x %>% min
weekdays(min_x)
x_df %>% thicken(start_val = min_x - 1) %>%
  group_by(x_week) %>% summarise(y_avg = mean(y))
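
The thicken-then-aggregate pattern does not depend on dplyr. As a rough sketch that needs no packages, the following replaces thicken with a hand-written month mapping and aggregates with base R. The mapping to the first day of the month is an assumption about what a month-level variable looks like, not padr's own code, and the column name x_month simply follows the naming rule described under colname.

```r
# Small deterministic data set: one observation per day for 90 days
# (January through March 2013).
df <- data.frame(
  x = seq(as.Date("2013-01-01"), by = "day", length.out = 90),
  y = seq_len(90)
)

# Hand-rolled stand-in for a month-level variable: map each date to the
# first day of its month.
df$x_month <- as.Date(format(df$x, "%Y-%m-01"))

# Aggregate y to the month level with base R instead of dplyr.
monthly_max <- aggregate(y ~ x_month, data = df, FUN = max)
monthly_max
```

Because y counts the days from 1 to 90, the monthly maxima are the cumulative day counts at the end of January, February and March.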