A dedicated Slack channel has been created for announcements, support and to help build a community of practice around this open source package. You may request an invitation to join from jonathan.callahan@dri.com.

MazamaTimeSeries

Utility functions for working with environmental time series data from known 
locations. The compact data model is structured as a list with two dataframes. A 
'meta' dataframe contains spatial and measuring device metadata associated with 
deployments at known locations. A 'data' dataframe contains a 'datetime' column 
followed by columns of measurements associated with each "device-deployment".

Background

This package supports data management activities associated with environmental time series collected at fixed locations in space. The motivating fields include both air and water quality monitoring where fixed sensors report at regular time intervals.

Data Model

The most compact format for time series data collected at fixed locations is a list containing two tables. MazamaTimeSeries stores time series measurements in a data table where each row is a synoptic record containing all measurements associated with a particular UTC time stamp and each column contains data measured by a single sensor (aka "device"). Any time-invariant metadata associated with a sensor at a known location (aka a "device-deployment") is stored in a separate meta table. A unique deviceDeploymentID connects the two tables. In the language of relational databases, this "normalizes" the database and can greatly reduce the disk space and memory needed to store and work with the data.
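As a minimal base R sketch of this two-table layout, the following builds the list by hand rather than with any package function. The device-deployment IDs, coordinates and measurement values are made up for illustration.

```r
# Hypothetical metadata: one row per device-deployment
meta <- data.frame(
  deviceDeploymentID = c("loc1_devA", "loc2_devB"),
  longitude = c(-122.3, -120.5),
  latitude = c(47.6, 46.6),
  stringsAsFactors = FALSE
)

# Hypothetical measurements: one row per UTC time stamp,
# one column per device-deployment
data <- data.frame(
  datetime = as.POSIXct(
    c("2024-01-01 00:00:00", "2024-01-01 01:00:00"),
    tz = "UTC"
  ),
  loc1_devA = c(5.1, 6.3),
  loc2_devB = c(12.0, NA)
)

ts_list <- list(meta = meta, data = data)

# The shared ID links a data column to its metadata row
id <- names(ts_list$data)[2]
matched <- ts_list$meta[ts_list$meta$deviceDeploymentID == id, ]
print(matched)
```

Location metadata is stored once per device-deployment instead of being repeated on every measurement record, which is where the space savings come from.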

Single Time Series

Time series data from a single environmental sensor typically consists of multiple parameters measured at successive times. This data is stored in an R list containing two dataframes. The package refers to this structure as an sts object for SingleTimeSeries:

sts$meta -- 1 row = unique device-deployment; cols = device/location metadata

sts$data -- rows = UTC times; cols = measured parameters (plus an additional datetime column)

sts objects can support the following types of time series data:

  • stationary device-deployments only (no "mobile" sensors)
  • single sensor only
  • regular or irregular time axes
  • multiple parameters

Raw, "engineering data" containing uncalibrated measurements, instrument voltages and QC flags may be stored in this format. This format is also appropriate for processed and QC'ed data whenever multiple parameters are measured by a single device.

Note: The sts object time axis specified in data$datetime reflects device measurement times and is not required to have uniform spacing. (It may be regular but it need not be.)
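To illustrate the shape described above, here is a hand-built list that mimics an sts object. It is a sketch only: the parameter names (temperature, humidity), the ID and the values are assumptions, and a real sts object may carry additional required metadata columns.

```r
# Hypothetical single-sensor example; not created with package functions
sts_like <- list(
  meta = data.frame(
    deviceDeploymentID = "loc1_devA",
    longitude = -122.3,
    latitude = 47.6,
    stringsAsFactors = FALSE
  ),
  data = data.frame(
    datetime = as.POSIXct(
      c("2024-01-01 00:00:00", "2024-01-01 00:07:00", "2024-01-01 00:31:00"),
      tz = "UTC"
    ),
    temperature = c(11.2, 11.4, 11.9),
    humidity = c(71, 70, 66)
  )
)

# One metadata row, several measured parameters
n_meta_rows <- nrow(sts_like$meta)
parameters <- setdiff(names(sts_like$data), "datetime")

# Spacing between successive records in minutes; it need not be constant
spacing_mins <- as.numeric(diff(sts_like$data$datetime), units = "mins")
print(spacing_mins)
```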

Multiple Time Series

Working with time series data from multiple sensors at once is often challenging because of the amount of memory required to store all the data from each sensor. However, time series often share a common time axis, e.g. hourly measurements. In this case, it is possible to create single-parameter 'data' dataframes, each containing the data from all sensors for one parameter of interest. In air quality applications, common parameters of interest include PM2.5 and Ozone.
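One way to picture the single-parameter layout is as a reshaping of "long" records (one row per sensor per time) into a "wide" table on a shared time axis. The base R sketch below uses invented records and a hypothetical pm25 column. It illustrates the idea only and is not how the package itself assembles data.

```r
# Hypothetical long-format records: one row per sensor per time stamp
long <- data.frame(
  datetime = as.POSIXct(
    c("2024-01-01 00:00:00", "2024-01-01 01:00:00", "2024-01-01 00:00:00"),
    tz = "UTC"
  ),
  deviceDeploymentID = c("loc1_devA", "loc1_devA", "loc2_devB"),
  pm25 = c(5.1, 6.3, 12.0),
  stringsAsFactors = FALSE
)

# Shared time axis: every distinct time stamp, in order
wide <- data.frame(datetime = sort(unique(long$datetime)))

# Add one column per device-deployment, aligned to the shared axis;
# times with no record for a sensor become NA
for (id in unique(long$deviceDeploymentID)) {
  records <- long[long$deviceDeploymentID == id, ]
  wide[[id]] <- records$pm25[match(wide$datetime, records$datetime)]
}

print(wide)
```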

Multi-sensor, single-parameter time series data is stored in an R list with two dataframes. The package refers to this structure as an mts object for MultipleTimeSeries:

mts$meta -- N rows = unique device-deployments; cols = device/location metadata

mts$data -- rows = UTC times; N cols = device-deployments (plus an additional datetime column)

A key feature of mts objects is the use of the deviceDeploymentID as a "foreign key" that allows sensor data columns to be mapped onto the associated spatial and sensor metadata in a meta row. The following will always be true:

identical(names(mts$data), c('datetime', mts$meta$deviceDeploymentID))
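The check above can be tried on a hand-built list that mimics an mts object. This is a sketch with invented IDs and values, and it also shows why row order in meta and column order in data have to be kept in step.

```r
# Hypothetical mts-like list built by hand for illustration
mts_like <- list(
  meta = data.frame(
    deviceDeploymentID = c("loc1_devA", "loc2_devB", "loc3_devC"),
    longitude = c(-122.3, -120.5, -117.4),
    latitude = c(47.6, 46.6, 47.7),
    stringsAsFactors = FALSE
  ),
  data = data.frame(
    datetime = seq(
      as.POSIXct("2024-01-01 00:00:00", tz = "UTC"),
      by = "hour",
      length.out = 3
    ),
    loc1_devA = c(5.1, 6.3, 7.0),
    loc2_devB = c(12.0, NA, 11.5),
    loc3_devC = c(3.3, 3.1, 2.9)
  )
)

# Small helper (hypothetical name) wrapping the invariant from the text
ids_match <- function(x) {
  identical(names(x$data), c("datetime", x$meta$deviceDeploymentID))
}

ok <- ids_match(mts_like)

# Reordering meta rows without reordering data columns breaks the mapping
shuffled <- mts_like
shuffled$meta <- shuffled$meta[c(2, 1, 3), ]
broken <- ids_match(shuffled)

# Reordering the data columns to follow meta restores it
shuffled$data <- shuffled$data[, c("datetime", shuffled$meta$deviceDeploymentID)]
restored <- ids_match(shuffled)

print(c(ok = ok, broken = broken, restored = restored))
```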

mts objects can support the following types of time series data:

  • stationary device-deployments only (no "mobile" sensors)
  • multiple sensors
  • regular (shared) hourly time axes only
  • single parameter only

Each column of mts$data represents a time series associated with a particular device-deployment while each row represents a synoptic snapshot of all measurements made at a particular time.

In this manner, software can create both time series plots and maps from a single mts object in memory.
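The two views can be sketched in base R on a hand-built, mts-like list (invented IDs and values). A column gives the input for a time series plot, while a row joined to the meta table gives the input for a map.

```r
# Hypothetical mts-like list built by hand for illustration
mts_like <- list(
  meta = data.frame(
    deviceDeploymentID = c("loc1_devA", "loc2_devB"),
    longitude = c(-122.3, -120.5),
    latitude = c(47.6, 46.6),
    stringsAsFactors = FALSE
  ),
  data = data.frame(
    datetime = seq(
      as.POSIXct("2024-01-01 00:00:00", tz = "UTC"),
      by = "hour",
      length.out = 3
    ),
    loc1_devA = c(5.1, 6.3, 7.0),
    loc2_devB = c(12.0, 11.5, NA)
  )
)

ids <- mts_like$meta$deviceDeploymentID

# Time series view: one column against the shared time axis
series <- mts_like$data[, c("datetime", "loc1_devA")]

# Map view: one row (the second time stamp) attached to sensor locations
snapshot_values <- unlist(mts_like$data[2, ids], use.names = FALSE)
map_df <- data.frame(
  deviceDeploymentID = ids,
  longitude = mts_like$meta$longitude,
  latitude = mts_like$meta$latitude,
  value = snapshot_values,
  stringsAsFactors = FALSE
)

print(series)
print(map_df)
```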

Note: The mts object time axis specified in data$datetime is guaranteed to be a regular hourly axis with no gaps.
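To make "regular hourly axis with no gaps" concrete, the base R sketch below places gappy records onto a complete hourly axis and then verifies the spacing. The records are invented, and this is only one possible way to build such an axis, not necessarily the package's own approach.

```r
# Hypothetical hourly records with the 02:00 record missing
raw <- data.frame(
  datetime = as.POSIXct(
    c("2024-01-01 00:00:00", "2024-01-01 01:00:00", "2024-01-01 03:00:00"),
    tz = "UTC"
  ),
  loc1_devA = c(5.1, 6.3, 7.0)
)

# Complete hourly axis spanning the first to the last record
regular <- data.frame(
  datetime = seq(min(raw$datetime), max(raw$datetime), by = "hour")
)

# Align measurements to the regular axis; missing hours become NA
regular$loc1_devA <- raw$loc1_devA[match(regular$datetime, raw$datetime)]

# Every step between successive rows should be exactly one hour
step_hours <- as.numeric(diff(regular$datetime), units = "hours")
is_regular <- all(step_hours == 1)

print(regular)
print(is_regular)
```

Filling gaps with NA keeps row positions aligned across all sensors, so a given row always refers to the same hour.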


This R package was created with funding from the USFS AirFire Research Team.

Install

install.packages('MazamaTimeSeries')

Monthly Downloads

660

Version

0.3.0

License

GPL-3

Maintainer

Jonathan Callahan

Last Published

March 8th, 2024

Functions in MazamaTimeSeries (0.3.0)

mts_filterData -- General purpose data filtering for mts time series objects
mts_filterDatetime -- Datetime filtering for mts time series objects
mts_filterMeta -- General purpose metadata filtering for mts time series objects
mts_summarize -- Create summary time series for an mts time series object
mts_slice_head -- Subset time series based on their position
mts_filterDate -- Date filtering for mts time series objects
mts_isValid -- Test mts object for correct structure
sts_filterDate -- Date filtering for sts time series objects
mts_pull -- Extract a column of metadata or data
sts_filter -- General purpose data filtering for sts time series objects
mts_distinct -- Retain only distinct data records in mts$data
example_mts -- Example mts dataset
mts_extractDataFrame -- Extract dataframes from mts objects
mts_combine -- Combine multiple mts time series objects
mts_collapse -- Collapse an mts time series object into a single time series
mts_trim -- Trim mts time series by removing missing values
mts_sample -- Sample time series for an mts time series object
mts_isEmpty -- Test for an empty mts object
mts_selectWhere -- Data-based subsetting of time series within an mts object
sts_combine -- Combine multiple sts time series objects
sts_check -- Check sts object for validity
mts_setTimeAxis -- Extend/contract mts time series to new start and end times
mts_getDistance -- Calculate distances from mts time series locations to a location of interest
requiredMetaNames -- Required columns for the 'meta' dataframe
%>% -- Pipe operator
mts_select -- Reorder and subset time series within an mts time series object
sts_distinct -- Retain only distinct data records in sts$data
sts_filterDatetime -- Datetime filtering for sts time series objects
sts_isEmpty -- Test for empty sts object
sts_summarize -- Create summary time series for an sts time series object
sts_isValid -- Test sts object for correct structure
sts_extractDataFrame -- Extract dataframes from sts objects
mts_trimDate -- Trim mts time series object to full days
timeInfo -- Get time related information
sts_trimDate -- Trim sts time series object to full days
.sample -- General table row sampling
mts_arrange -- Order mts time series by metadata values
mts_check -- Check mts object for validity
MazamaTimeSeries -- Core functionality for environmental time series
Carmel_Valley -- Carmel Valley example dataset
example_raws -- Example RAWS dataset
Camp_Fire -- Camp Fire example dataset
example_sts -- Example sts dataset
.flagOutliers -- Flag outliers in vectorized data