
dateutils (version 0.1.5)

process_wide: Process Wide Format Data

Description

Process data in wide format for time series modeling

Usage

process_wide(
  dt_wide,
  lib,
  detrend = TRUE,
  center = TRUE,
  scale = TRUE,
  date_name = "ref_date",
  ignore_numeric_names = TRUE,
  silent = FALSE
)

Arguments

dt_wide

Data in wide format.

lib

Library with instructions regarding how to process data; see details.

detrend

Logical. Should the data be detrended (see Details)?

center

Logical. Should the data be centered (i.e. de-meaned)?

scale

Logical. Should the data be scaled (i.e. to variance 1)?

date_name

Name of the date column in the data.

ignore_numeric_names

Logical. Ignore numeric values when matching series names in `dt_wide` to series names in `lib`? This is required for data aggregated using `process_MF()`, which tags lags of LHS and RHS data with 0 for contemporaneous data, 1 for one lag, 2 for two lags, etc. Ignoring these tags ensures that the processing instructions in `lib` are correctly identified.

silent

Logical. Suppress warnings?

Value

data.table of processed data

Details

`process_wide()` transforms wide-format data to ensure stationarity. Censoring by `pub_date` requires long format. Directions for processing each series come from the data.table `lib`. This table must include the columns `series_name`, `take_logs`, and `take_diffs`. Unique series may also be identified by a combination of `country` and `series_name`. Optional columns include `needs_SA` for series that need seasonal adjustment, `detrend` for removing low-frequency trends (nowcasting only; `detrend` should not be used for long-horizon forecasts), `center` to de-mean the data, and `scale` to scale the data.

If the `detrend`, `center`, or `scale` argument to `process_wide()` is `FALSE`, that operation is not performed. If it is `TRUE`, the function checks `lib` for a column of the same name. If the column exists, its T/F entries determine which series are transformed. If the column does not exist, all series are transformed.
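
The rule for `detrend`, `center`, and `scale` can be sketched as follows. This is only an illustration of the logic described above, not the package's implementation: the helper `use_transform()` is hypothetical, and the series names and flag values in this `lib` table are made up for the example.

```r
library(data.table)

# Hypothetical instruction table; series names and flag values are illustrative only.
lib <- data.table(
  series_name = c("gdp constant prices", "unemployment rate", "industrial production"),
  take_logs   = c(TRUE, FALSE, TRUE),   # required column
  take_diffs  = c(TRUE, TRUE, TRUE),    # required column
  detrend     = c(FALSE, TRUE, TRUE)    # optional column; center and scale left out on purpose
)

# Hypothetical helper mirroring the rule in Details; not part of dateutils.
# Returns one logical per row of lib: should that series be transformed?
use_transform <- function(arg, lib, col) {
  if (!isTRUE(arg)) {
    return(rep(FALSE, nrow(lib)))    # argument is FALSE: operation is skipped entirely
  }
  if (col %in% names(lib)) {
    return(as.logical(lib[[col]]))   # column exists: follow the per-series flags
  }
  rep(TRUE, nrow(lib))               # column missing: transform every series
}

use_transform(TRUE, lib, "detrend")   # per-series flags taken from lib$detrend
use_transform(TRUE, lib, "center")    # no center column, so every series qualifies
use_transform(FALSE, lib, "detrend")  # argument FALSE overrides the column
```

Under these assumptions, leaving a column out of `lib` is the simplest way to apply an operation to every series, while adding the column allows per-series control.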

Examples

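# NOT RUN {
library(dateutils)  # provides process_MF(), process_wide(), fred, and fredlib

# Split the example data into the target series (LHS) and the predictors (RHS)
LHS <- fred[series_name == "gdp constant prices"]
RHS <- fred[series_name != "gdp constant prices"]
# Aggregate the mixed-frequency data
dtQ <- process_MF(LHS, RHS)
# Reshape from long to wide: one row per ref_date, one column per series
dt_wide <- data.table::dcast(dtQ, ref_date ~ series_name, value.var = "value")
# Apply the transformations specified in fredlib
dt_processed <- process_wide(dt_wide, fredlib)
# }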
