padr (version 0.1.0)

pad: Pad the datetime column of a data frame.

Description

pad fills the gaps in an incomplete datetime variable by determining the interval of the data and which instances are missing. It inserts a record for each missing time point. All other variables in the data frame get a missing value at the padded rows.

Usage

pad(x, interval = NULL, start_val = NULL, end_val = NULL, by = NULL)

Arguments

x
A data frame containing at least one variable of class Date, class POSIXct or class POSIXlt.
interval
The interval of the returned datetime variable. When NULL, the interval will be equal to the interval of the input datetime variable. When specified, it can only be lower than the interval of the input data. See Details.
start_val
An object of class Date, class POSIXct or class POSIXlt that specifies the start of the returned datetime variable. If NULL it will use the lowest value of the input variable.
end_val
An object of class Date, class POSIXct or class POSIXlt that specifies the end of the returned datetime variable. If NULL it will use the highest value of the input variable.
by
Only needs to be specified when x contains multiple variables of class Date, class POSIXct or class POSIXlt. by indicates which variable to use for padding.

Value

The data frame x with the datetime variable padded. All other variables in the data frame will have missing values at the rows that are padded.
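
The shape of the returned value can be approximated in base R. The following is a minimal sketch and not padr's implementation: pad_sketch and its col argument are hypothetical names, the interval must be given explicitly, and the padding is done by merging the data onto a complete sequence of time points.

```r
# Hypothetical helper, for illustration only; this is not how padr is implemented.
pad_sketch <- function(x, col, interval = "day", start_val = NULL, end_val = NULL) {
  # Mirror the documented defaults: fall back to the range of the input variable.
  if (is.null(start_val)) start_val <- min(x[[col]])
  if (is.null(end_val))   end_val   <- max(x[[col]])

  # Every time point that is expected at the given interval.
  span <- data.frame(seq(start_val, end_val, by = interval))
  names(span) <- col

  # Keep all expected time points; other variables become NA where no record existed.
  merge(span, x, by = col, all.x = TRUE)
}

simple_df <- data.frame(day = as.Date(c("2016-04-01", "2016-04-03")),
                        some_value = c(3, 4))

padded   <- pad_sketch(simple_df, "day")
extended <- pad_sketch(simple_df, "day", end_val = as.Date("2016-04-05"))
```

Here padded has three rows, with some_value missing on 2016-04-02, and extended runs through 2016-04-05 with missing values on every inserted day.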

Details

The interval of a datetime variable is the time unit at which the observations occur. The eight intervals in padr are, from high to low: year, quarter, month, week, day, hour, min, and sec. pad determines the interval of the input variable and fills the gaps for the instances that would be expected from that interval but are missing in the input data. See vignette("padr") for more information on pad. See vignette("padr_implementation") for detailed information on daylight saving time, different time zones, and the implementation of thicken.
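
To make the notion of an interval concrete, the sketch below guesses the interval of a Date vector by checking, from high to low, which unit is the highest one that still explains every observation. It is an illustration under simplifying assumptions, not padr's algorithm: guess_date_interval is a hypothetical name, only the Date class is handled (so nothing below day), and a single observation is reported as year.

```r
# Hypothetical helper, for illustration only; padr's own detection may differ.
guess_date_interval <- function(d) {
  stopifnot(inherits(d, "Date"))
  lt <- as.POSIXlt(d)

  same_day   <- length(unique(lt$mday)) == 1  # same day of the month everywhere
  same_month <- length(unique(lt$mon)) == 1   # same calendar month everywhere
  month_idx  <- lt$year * 12 + lt$mon         # running month counter

  if (same_day && same_month) return("year")
  if (same_day && all(diff(month_idx) %% 3 == 0)) return("quarter")
  if (same_day) return("month")
  if (all(as.integer(diff(d)) %% 7 == 0)) return("week")
  "day"
}

guess_date_interval(as.Date(c("2016-04-01", "2016-04-03")))  # "day"
guess_date_interval(as.Date(c("2016-04-01", "2016-07-01", "2016-08-01")))  # "month"
guess_date_interval(as.Date(c("2016-01-04", "2016-01-18")))  # "week"
```

Gaps do not change the result: the second call has no observations for May and June, yet the data are still at the month level, which is why the missing months can be identified as gaps.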

Examples

library(padr)

# a daily variable with one missing day (2016-04-02)
simple_df <- data.frame(day = as.Date(c('2016-04-01', '2016-04-03')),
                        some_value = c(3, 4))
pad(simple_df)

library(dplyr) # for the pipe operator
month <- seq(as.Date('2016-04-01'), as.Date('2017-04-01'),
              by = 'month')[c(1, 4, 5, 7, 9, 10, 13)]
month_df <- data.frame(month = month,
                       y = runif(length(month), 10, 20) %>% round)
# forward fill the padded values with tidyr's fill
month_df %>% pad %>% tidyr::fill(y)

# or fill the missing values of y with 0, the default value of fill_by_value
month_df %>% pad %>% fill_by_value(y)

# padding a data.frame on group level
day_var <- seq(as.Date('2016-01-01'), length.out = 12, by = 'month')
x_df_grp <- data.frame(grp  = rep(LETTERS[1:3], each = 4),
                       y    = runif(12, 10, 20) %>% round(0),
                       date = sample(day_var, 12, TRUE)) %>%
  arrange(grp, date)

x_df_grp %>% group_by(grp) %>% do(pad(.)) %>% ungroup %>%
  tidyr::fill(grp)
