rdf_aggregate: Aggregate RiverWare output for one or more scenarios

Description

Process the user-specified rwd_agg object for one or more scenarios to aggregate and summarize RiverWare output data.

Usage

rdf_aggregate(agg, rdf_dir = ".", scenario = NULL, keep_cols = FALSE,
  nans_are = "0", find_all_slots = TRUE, cpp = TRUE, verbose = TRUE)
rw_scen_aggregate(scenarios, agg, scen_dir = ".", nans_are = "0",
  keep_cols = FALSE, file = NULL, scen_names = NULL,
  find_all_slots = TRUE, cpp = TRUE, verbose = TRUE)

Arguments

agg

A rwd_agg object specifying the rdfs, slots, and aggregation methods to use.

rdf_dir

The top level directory that contains the rdf files. See Directory Structure.

scenario

An optional scenario name. If it is not NULL (default) or NA, it is added to the tibble as another variable. It is coerced to a character if it is not already one.

keep_cols

Either boolean, or a character vector of column names to keep in the returned tibble. The values of keep_cols work as follows:

FALSE (default) includes only the default columns: TraceNumber, ObjectSlot, and Value. Scenario is also returned if scenario is specified.
TRUE returns all columns.
A character vector, e.g., c("ObjectName", "Units"), includes other columns that are not always required, in addition to the default set of columns. If any of the values in keep_cols are not found, a warning is issued, but all other requested columns are still returned.

nans_are

Either "0" or "error". If "0", then NaNs in the rwtbl are treated as 0s. If "error", then any NaNs will cause an error in this function.

find_all_slots

Boolean; if TRUE (default), then the function will abort if it cannot find a particular slot. If FALSE, then the function will continue even if a slot cannot be found. For a slot that is not found, the function returns -99 for the Trace and NaN for Year and Value.

cpp

Boolean; if TRUE (default), then use rdf_to_rwtbl2, which relies on C++; otherwise, use the original rdf_to_rwtbl function.

verbose

Boolean; if TRUE (default), then print out status of processing the scenario(s) and the slots in each scenario.

scenarios

A character vector of scenario folders. This is usually a vector of folder names, where each folder contains one scenario's worth of data. scenarios can be named or unnamed. The names are used as the scenario names in the returned tbl_df. Scenario names can also be specified through the scen_names argument. If scen_names is specified, scenarios should not already have names. If scen_names is not specified and scenarios is not already named, then the scenario folders are also used as the scenario names. See Directory Structure.

scen_dir

File path to the directory that contains the scenario folders. See Directory Structure.

file

Optionally save the tbl_df of aggregated scenario data as a .txt, .csv, or .feather file. If file is specified, then the data are saved in the specified output format.

scen_names

An alternative way to specify scenario names.
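
The naming rules described for scenarios and scen_names can be sketched in base R. This is only an illustration of the documented behavior, not the package's implementation: resolve_scen_names() is a hypothetical helper, and both the error raised when names are supplied twice and the length check are assumptions.

```r
# Hypothetical helper (not part of RWDataPlyr) that mirrors the documented
# precedence: scen_names, then names(scenarios), then the folder names.
resolve_scen_names <- function(scenarios, scen_names = NULL) {
  if (!is.null(scen_names)) {
    # Assumption: supplying names both ways is treated as an error here.
    if (!is.null(names(scenarios))) {
      stop("specify names on `scenarios` or via `scen_names`, not both")
    }
    # Assumption: one name is expected per scenario folder.
    if (length(scen_names) != length(scenarios)) {
      stop("`scen_names` must have one entry per scenario")
    }
    return(as.character(scen_names))
  }
  if (!is.null(names(scenarios))) {
    return(names(scenarios))
  }
  # Fall back to the folder names themselves.
  as.character(unname(scenarios))
}

folders <- c("ISM1988_2014,2007Dems,IG,2002", "ISM1988_2014,2007Dems,IG,Most")
named_folders <- stats::setNames(folders, c("2002", "Most"))

from_arg <- resolve_scen_names(folders, scen_names = c("2002", "Most"))
from_names <- resolve_scen_names(named_folders)
from_folders <- resolve_scen_names(folders)
```

In this sketch, from_arg and from_names give the same scenario names, while from_folders falls back to the folder names.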

Value

A tbl_df containing all aggregated and summarized data for all of the specified scenarios.

Directory Structure

RiverWare and RiverSMART typically write data into an expected directory structure. The following shows an example directory structure and the corresponding variable names for rw_scen_aggregate() and rdf_aggregate(). (In the example below, C:/user/crss/CRSS.Jan2017/Scenario is the more complete directory setup for the data included in system.file("extdata/Scenario/").)

C:/user/crss
|
|- CRSS.Jan2017
|    - model
|    - ruleset
|    - Scenario
|         - ISM1988_2014,2007Dems,IG,Most
|         - ISM1988_2014,2007Dems,IG,2002 
|    - ...
|- CRSS.Jan2018
|    - model
|    - ... (same general setup as CRSS.Jan2017)

To get one scenario's data, rdf_aggregate() can be called with rdf_dir set to "C:/user/crss/CRSS.Jan2017/Scenario/ISM1988_2014,2007Dems,IG,Most". (scenario can optionally be specified to get a scenario name.)

To aggregate multiple scenarios of data together, rw_scen_aggregate() should be called with scen_dir set to "C:/user/crss/CRSS.Jan2017/Scenario" and scenarios set to c("ISM1988_2014,2007Dems,IG,Most", "ISM1988_2014,2007Dems,IG,2002"). (Optionally, scenarios can be named, or scen_names specified, to use scenario names that are different from the above scenario folders.)

Finally, to aggregate scenario data from both CRSS.Jan2017 and CRSS.Jan2018, rw_scen_aggregate() should be called with scen_dir set to "C:/user/crss/". scenarios can then be set to c("CRSS.Jan2017/Scenario/ISM1988_2014,2007Dems,IG,Most", "CRSS.Jan2018/Scenario/ISM1988_2014,2007Dems,IG,Most"), assuming the same scenario exists in both folders. In this case it is advisable to also specify scen_names or name scenarios.
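
As a rough illustration of how scen_dir and scenarios relate to the folders that are read, the base R sketch below builds the full path for each scenario in the two-model-version case. It assumes the two pieces are simply joined with file.path(), which is an assumption about the package internals, and the scenario names Jan2017_Most and Jan2018_Most are made up for the example.

```r
# Top-level directory that contains both model versions.
scen_dir <- "C:/user/crss"

# Scenario folders, given relative to scen_dir.
scenarios <- c(
  "CRSS.Jan2017/Scenario/ISM1988_2014,2007Dems,IG,Most",
  "CRSS.Jan2018/Scenario/ISM1988_2014,2007Dems,IG,Most"
)

# Hypothetical scenario names; without them the long folder paths would be
# used as the scenario names.
scen_names <- c("Jan2017_Most", "Jan2018_Most")

# Assumption: each scenario folder is resolved as scen_dir/scenario.
rdf_dirs <- file.path(scen_dir, scenarios)
names(rdf_dirs) <- scen_names

print(rdf_dirs)
```

Each element of rdf_dirs is a path that could equally be passed to rdf_aggregate() as rdf_dir to process that scenario on its own.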

Details

rdf_aggregate() aggregates a single scenario of data by processing a rwd_agg object.

rw_scen_aggregate() aggregates multiple scenarios of data. It processes the rwd_agg object (agg) for each scenario and then binds the individual scenario data together into a single tbl_df.

In both cases, the user specifies the rwd_agg, which determines which slots are aggregated and how they are aggregated. See rwd_agg for more details on how it should be specified.

See the Directory Structure section for how to specify scenarios, scen_dir, and rdf_dir.
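
The process-each-scenario-then-bind pattern described for rw_scen_aggregate() can be sketched in base R. Everything here is a stand-in: aggregate_one() is a hypothetical placeholder for rdf_aggregate() that reads no rdf files, and the columns it returns are illustrative only.

```r
# Hypothetical stand-in for rdf_aggregate(): returns one placeholder row per
# scenario instead of reading rdf files from rdf_dir.
aggregate_one <- function(rdf_dir, scenario) {
  data.frame(
    Scenario = scenario,
    ObjectSlot = "example_slot",  # placeholder, not a real slot
    Value = NA_real_,             # placeholder value
    stringsAsFactors = FALSE
  )
}

scen_dir <- "C:/user/crss/CRSS.Jan2017/Scenario"
scenarios <- c(
  Most = "ISM1988_2014,2007Dems,IG,Most",
  "2002" = "ISM1988_2014,2007Dems,IG,2002"
)

# Process each scenario folder separately...
per_scenario <- lapply(names(scenarios), function(nm) {
  aggregate_one(file.path(scen_dir, scenarios[[nm]]), scenario = nm)
})

# ...then bind the per-scenario results into a single table.
combined <- do.call(rbind, per_scenario)
print(combined)
```

The Scenario column is what distinguishes the rows of each scenario once the per-scenario tables are stacked.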

Examples

# NOT RUN {
# rdf_aggregate() ----------

rdfPath <- system.file(
  "extdata/Scenario/ISM1988_2014,2007Dems,IG,Most", 
  package = "RWDataPlyr"
)

rwa <- rwd_agg(read.csv(
  system.file(
    "extdata/rwd_agg_files/passing_aggs.csv", 
    package = "RWDataPlyr"
  ), 
  stringsAsFactors = FALSE
))

x <- rdf_aggregate(rwa[1,], rdf_dir = rdfPath, scenario = "Most")

# rw_scen_aggregate() ----------

scens <- c("ISM1988_2014,2007Dems,IG,2002", "ISM1988_2014,2007Dems,IG,Most")
scenNames <- c("2002", "Most")
namedScens <- scens
names(namedScens) <- scenNames

scenPath <- system.file("extdata/Scenario", package = "RWDataPlyr")

rwa <- rwd_agg(read.csv(
  system.file(
    "extdata/rwd_agg_files/passing_aggs.csv", 
    package = "RWDataPlyr"
  ), 
  stringsAsFactors = FALSE
))

x <- rw_scen_aggregate(namedScens, agg = rwa[1,], scen_dir = scenPath)

# y will be identical to x

y <- rw_scen_aggregate(
  scens, 
  agg = rwa[1,], 
  scen_dir = scenPath, 
  scen_names = scenNames
)

identical(x, y) # is TRUE

# }
