tidyhydat

Project Status

This package is maintained by the Knowledge Management Branch of the British Columbia Ministry of Environment and Climate Change Strategy.

What does it do?

Here is a summary of what tidyhydat does:

  • Provides functions (hy_*) that access hydrometric data from the HYDAT database, a national archive of Canadian hydrometric data, and return tidy data.
  • Provides functions (realtime_*) that access Environment and Climate Change Canada's real-time hydrometric data source.
  • Provides functions (search_*) that can search through the approximately 7000 stations in the database and aid in generating station vectors.
  • Keeps functions as simple as possible. For example, for daily flows, the hy_daily_flows() function queries the database, tidies the data and returns a tibble of daily flows.

Installation

You can install tidyhydat from CRAN:

install.packages("tidyhydat")

To install the development version of the tidyhydat package, first install the remotes package and then install tidyhydat from GitHub:

install.packages("remotes")
remotes::install_github("ropensci/tidyhydat")

Usage

A more thorough vignette can be found on the tidyhydat CRAN page.

To load the package you need to use the library() function. When you install tidyhydat, several other packages will be installed as well. One of those packages, dplyr, is useful for data manipulations and is used regularly here. Even though dplyr is installed alongside tidyhydat, it is helpful to load it by itself as there are many useful functions contained within dplyr. A helpful dplyr tutorial can be found here.

library(tidyhydat)
library(dplyr)

HYDAT download

To use many of the functions in the tidyhydat package you will need to download a version of the HYDAT database, Environment and Climate Change Canada's database of historical hydrometric data, and then tell R where to find the database. Conveniently, tidyhydat does all this for you via:

download_hydat()

This downloads the most recent version of HYDAT and then saves it in a location on your computer where tidyhydat's functions will look for it. Be patient, though, as this takes a long time! To see where HYDAT was saved you can run hy_dir(). Now that you have HYDAT downloaded and ready to go, you are all set to begin some hydrologic analysis.

Most functions in tidyhydat follow a common argument structure. We will use the hy_daily_flows() function for the following examples, though the same approach applies to most functions in the package (see ls("package:tidyhydat") for a list of exported objects). Much of the functionality of tidyhydat starts with the choice of hydrometric stations that you are interested in, so a user will often find themselves creating vectors of station numbers. There are several ways to do this.

The simplest case is if you would like to extract data for only one station. You can supply the station number directly to the station_number argument:

hy_daily_flows(station_number = "08LA001")
#> No start and end dates specified. All dates available will be returned.
#> All station successfully retrieved
#> # A tibble: 29,159 x 5
#>    STATION_NUMBER Date       Parameter Value Symbol
#>    <chr>          <date>     <chr>     <dbl> <chr> 
#>  1 08LA001        1914-01-01 Flow        144 <NA>  
#>  2 08LA001        1914-01-02 Flow        144 <NA>  
#>  3 08LA001        1914-01-03 Flow        144 <NA>  
#>  4 08LA001        1914-01-04 Flow        140 <NA>  
#>  5 08LA001        1914-01-05 Flow        140 <NA>  
#>  6 08LA001        1914-01-06 Flow        136 <NA>  
#>  7 08LA001        1914-01-07 Flow        136 <NA>  
#>  8 08LA001        1914-01-08 Flow        140 <NA>  
#>  9 08LA001        1914-01-09 Flow        140 <NA>  
#> 10 08LA001        1914-01-10 Flow        140 <NA>  
#> # ... with 29,149 more rows
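The message printed above indicates that date limits can be supplied. The following is a minimal sketch, assuming hy_daily_flows() accepts start_date and end_date arguments given as "YYYY-MM-DD" strings. flows_for_year() is a hypothetical helper and not part of tidyhydat.

```r
# Hypothetical helper (not part of tidyhydat): daily flows for one calendar year.
# Assumes hy_daily_flows() takes start_date and end_date as "YYYY-MM-DD" strings
# and that HYDAT has already been downloaded with download_hydat().
flows_for_year <- function(stations, year) {
  tidyhydat::hy_daily_flows(
    station_number = stations,
    start_date = paste0(year, "-01-01"),
    end_date = paste0(year, "-12-31")
  )
}
```

With HYDAT downloaded, a call such as flows_for_year(c("08LA001", "08MF005"), 1990) should return only the 1990 records for the two stations. The second station number is used purely for illustration.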

Another method is to use hy_stations() to generate your vector, which is then passed to the station_number argument. For example, we could take a subset of only the active stations within Prince Edward Island (province code: PE) and then create a vector which is passed to the multi-parameter function hy_daily(). This function queries the flow, level, sediment load and suspended sediment concentration tables and combines them (if present) into one data frame:

PEI_stns <- hy_stations() %>%
  filter(HYD_STATUS == "ACTIVE") %>%
  filter(PROV_TERR_STATE_LOC == "PE") %>%
  pull(STATION_NUMBER)
#> All station successfully retrieved

PEI_stns
#> [1] "01CA003" "01CB002" "01CB004" "01CC002" "01CC005" "01CC010" "01CD005"

hy_daily(station_number = PEI_stns)
#> # A tibble: 123,225 x 5
#>    STATION_NUMBER Date       Parameter Value Symbol
#>    <chr>          <date>     <chr>     <dbl> <chr> 
#>  1 01CA003        1961-08-01 Flow         NA <NA>  
#>  2 01CA003        1961-08-02 Flow         NA <NA>  
#>  3 01CA003        1961-08-03 Flow         NA <NA>  
#>  4 01CA003        1961-08-04 Flow         NA <NA>  
#>  5 01CA003        1961-08-05 Flow         NA <NA>  
#>  6 01CA003        1961-08-06 Flow         NA <NA>  
#>  7 01CA003        1961-08-07 Flow         NA <NA>  
#>  8 01CA003        1961-08-08 Flow         NA <NA>  
#>  9 01CA003        1961-08-09 Flow         NA <NA>  
#> 10 01CA003        1961-08-10 Flow         NA <NA>  
#> # ... with 123,215 more rows

We can also merge our station choice and data extraction into one unified pipe which accomplishes a single goal. For example, if for some reason we wanted all the stations in Canada that had the name "Canada" in them we could unify those selection and data extraction processes into a single pipe:

search_stn_name("canada") %>%
  pull(STATION_NUMBER) %>%
  hy_daily_flows()
#> No start and end dates specified. All dates available will be returned.
#> All station successfully retrieved
#> # A tibble: 76,679 x 5
#>    STATION_NUMBER Date       Parameter Value Symbol
#>    <chr>          <date>     <chr>     <dbl> <chr> 
#>  1 01AK001        1918-08-01 Flow      NA    <NA>  
#>  2 01AK001        1918-08-02 Flow      NA    <NA>  
#>  3 01AK001        1918-08-03 Flow      NA    <NA>  
#>  4 01AK001        1918-08-04 Flow      NA    <NA>  
#>  5 01AK001        1918-08-05 Flow      NA    <NA>  
#>  6 01AK001        1918-08-06 Flow      NA    <NA>  
#>  7 01AK001        1918-08-07 Flow       1.78 <NA>  
#>  8 01AK001        1918-08-08 Flow       1.78 <NA>  
#>  9 01AK001        1918-08-09 Flow       1.50 <NA>  
#> 10 01AK001        1918-08-10 Flow       1.78 <NA>  
#> # ... with 76,669 more rows

These examples illustrate a few ways that a vector of station numbers can be generated and supplied to functions within tidyhydat.
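Because every function returns the same tidy columns, the results can go straight into dplyr for summarising. The sketch below does not query HYDAT. It assumes a hand-built data frame that mimics the returned structure, using a few of the rows printed above, and counts missing values and averages flow per station.

```r
library(dplyr)

# Stand-in for the output of hy_daily_flows(): same columns, a handful of the
# rows shown in the printed examples above (not a real query result).
flows <- data.frame(
  STATION_NUMBER = c("08LA001", "08LA001", "08LA001",
                     "01AK001", "01AK001", "01AK001"),
  Date = as.Date(c("1914-01-01", "1914-01-02", "1914-01-03",
                   "1918-08-06", "1918-08-07", "1918-08-08")),
  Parameter = "Flow",
  Value = c(144, 144, 144, NA, 1.78, 1.78),
  stringsAsFactors = FALSE
)

# One row per station: record length, missing days and mean flow.
flow_summary <- flows %>%
  group_by(STATION_NUMBER) %>%
  summarise(
    n_days = n(),
    n_missing = sum(is.na(Value)),
    mean_flow = mean(Value, na.rm = TRUE)
  )

flow_summary
```

The same pipeline should work unchanged on the tibble returned by a real hy_daily_flows() call, since it relies only on the STATION_NUMBER and Value columns.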

Real-time

To download real-time data using the datamart we can use approximately the same conventions discussed above. Using realtime_dd() we can easily select specific stations by supplying a station of interest:

realtime_dd(station_number = "08LG006")

Another option is to provide simply the province as an argument and download all stations from that province:

realtime_dd(prov_terr_state_loc = "PE")

A simple plotting tool is also provided to quickly visualize realtime data:

realtime_plot("08LG006")
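Real-time data arrive at a finer time step than the daily HYDAT records, so a common first step is to reduce them to daily means. The package lists realtime_daily_mean() for this purpose. The sketch below only illustrates the idea with dplyr, and it assumes the tibble returned by realtime_dd() has STATION_NUMBER, Date (a date-time), Parameter and Value columns. daily_mean_flow() is a hypothetical helper, not part of tidyhydat.

```r
library(dplyr)

# Hypothetical helper (not part of tidyhydat): collapse real-time flow
# readings to one mean value per station and calendar day.
# Assumes columns STATION_NUMBER, Date (date-time), Parameter and Value.
daily_mean_flow <- function(rt) {
  rt %>%
    filter(Parameter == "Flow") %>%          # drop water level readings
    mutate(Day = as.Date(Date)) %>%          # truncate date-time to a day
    group_by(STATION_NUMBER, Day) %>%
    summarise(Value = mean(Value, na.rm = TRUE)) %>%
    ungroup()
}
```

It could then be applied as realtime_dd(station_number = "08LG006") %>% daily_mean_flow(), assuming the column names above match the installed version.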

Getting Help or Reporting an Issue

To report bugs/issues/feature requests, please file an issue.

These are very welcome!

How to Contribute

If you would like to contribute to the package, please see our CONTRIBUTING guidelines.

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

Citation

Get citation information for tidyhydat in R by running:

citation("tidyhydat")

License

Copyright 2017 Province of British Columbia

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Monthly Downloads

545

Version

0.3.2

License

Apache License (== 2.0) | file LICENSE

Maintainer

Sam Albers

Last Published

January 11th, 2018

Functions in tidyhydat (0.3.2)

  • hy_data_symbols: DATA SYMBOLS look-up table
  • hy_data_types: DATA TYPES look-up table
  • hy_datum_list: Extract datum list from HYDAT database
  • hy_dir: Wrapper for rappdirs::user_data_dir("tidyhydat")
  • allstations: All Canadian stations
  • download_hydat: Download and set the path to HYDAT
  • hy_monthly_flows: Extract monthly flows information from the HYDAT database
  • hy_monthly_levels: Extract monthly levels information from the HYDAT database
  • hy_daily_flows: Extract daily flows information from the HYDAT database
  • hy_daily_levels: Extract daily levels information from the HYDAT database
  • hy_annual_stats: Extract annual statistics information from the HYDAT database
  • hy_daily: Extract all daily water level and flow measurements
  • hy_sed_daily_loads: Extract daily sediment load information from the HYDAT database
  • hy_sed_daily_suscon: Extract daily suspended sediment concentration information from the HYDAT database
  • hy_agency_list: Extract agency list from HYDAT database
  • hy_annual_instant_peaks: Extract annual max/min instantaneous flows and water levels from HYDAT database
  • hy_sed_monthly_loads: Extract monthly sediment load information from the HYDAT database
  • hy_sed_monthly_suscon: Extract monthly suspended sediment concentration information from the HYDAT database
  • hy_stn_regulation: Extract station regulation from the HYDAT database
  • hy_stations: Extract station information from the HYDAT database
  • hy_stn_data_coll: Extract station data collection from HYDAT database
  • hy_version: Extract version number from HYDAT database
  • hy_stn_remarks: Extract station remarks from HYDAT database
  • realtime_daily_mean: Calculate daily means from higher resolution realtime data
  • hy_plot: Convenience function to plot historical data
  • hy_reg_office_list: Extract regional office list from HYDAT database
  • hy_stn_data_range: Extract station data range from HYDAT database
  • hy_stn_datum_conv: Extract station datum conversions from HYDAT database
  • realtime_stations: Download a tibble of active realtime stations
  • search_stn_name: A search function for hydrometric station name or number
  • hy_stn_datum_unrelated: Extract station datum unrelated from HYDAT database
  • hy_stn_op_schedule: Extract station operation schedule from HYDAT database
  • station_choice: Function to choose a station based on consistent arguments for hydat functions
  • tidyhydat: Extract tidy water data
  • hy_sed_samples: Extract instantaneous sediment sample information from the HYDAT database
  • hy_sed_samples_psd: Extract instantaneous sediment sample particle size distribution information from the HYDAT database
  • realtime_dd: Download a tibble of realtime river data from the last 30 days from the Meteorological Service of Canada datamart
  • realtime_plot: Convenience function to plot realtime data