# gravitas v0.1.2

Monthly downloads

## Explore Probability Distributions for Bivariate Temporal Granularities

Provides tools for systematically exploring large quantities of
temporal data across nonlinear temporal granularities
(deconstructions of time) by visualizing probability distributions.
Nonlinear time granularities can be circular, quasi-circular or
aperiodic. 'gravitas' computes nonlinear
single-order-up or multiple-order-up granularities, check the
feasibility of creating plots for any two nonlinear granularities
and recommend probability distributions plots for exploring
periodicity in the data.

## Readme

# gravitas

## Overview

Package `gravitas`

provides a tool to examine the probability
distribution of univariate time series across bivariate temporal
granularities using a range of graphics in `ggplot2`

through the
following:

create multiple-order-up circular or aperiodic temporal granularities.

categorize pairs of granularities as either

*harmony*or*clash*, where harmonies are pairs of granularities that aid exploratory data analysis, and clashes are pairs that are incompatible with each other for exploratory analysis.recommending appropriate probability distribution plots of the time series variable across the bivariate granularities based on the levels of the bivariate granularities and their interaction.

`gravitas`

is not restricted to temporal data. It can be utilized in
non-temporal cases for which a hierarchical structure can be construed
similar to time. The hierarchical structure of time creates a natural
nested ordering. For example, hours are nested within days, days within
weeks, weeks within months, and so on. Similarly, if a periodic nesting
exists for a non-temporal application, `gravitas`

can be used to explore
the probability distribution of a continuous random variable.

## Installation

You can install gravitas from CRAN:

```
install.packages("gravitas")
```

You can install the development version from GitHub with:

```
# install.packages("devtools")
devtools::install_github("Sayani07/gravitas")
```

## Quick look

`gravitas`

comes with an interactive webpage, which lets you go through
the different functionalities of this package. To try it, simply use
gravitas::run_app().

## Features

Search for a set of all possible temporal granularities with

`search_gran()`

Build any temporal granularity with

`create_gran()`

Check if two temporal granularities are harmonies with

`is_harmony()`

Get all possible harmonies with

`harmony()`

Get recommendations on choosing more appropriate distribution plots and advice on the interaction between granularities, number of observations available for drawing probability distributions for chosen granularities with

`gran_advice()`

Validate if the created granularity matches your already existing column with

`validate_gran()`

Explore probability distribution across bivariate temporal granularities with

`prob_plot()`

## Example: temporal case

The probability distribution of energy consumption for ten households from customer trials can be explored as follows:

### Search for granularities

```
library(gravitas)
library(dplyr)
#> Warning: package 'dplyr' was built under R version 3.5.2
library(ggplot2)
#> Warning: package 'ggplot2' was built under R version 3.5.2
library(lvplot)
smart_meter10 %>%
search_gran(highest_unit = "week")
#> [1] "hhour_hour" "hhour_day" "hhour_week" "hour_day" "hour_week"
#> [6] "day_week"
```

### Screen harmonies from the search list

```
smart_meter10 %>%
harmony(
ugran = "day",
filter_in = "wknd_wday"
)
#> # A tibble: 7 x 4
#> facet_variable x_variable facet_levels x_levels
#> <chr> <chr> <int> <int>
#> 1 hour_day hhour_hour 24 2
#> 2 wknd_wday hhour_hour 2 2
#> 3 wknd_wday hhour_day 2 48
#> 4 hhour_hour hour_day 2 24
#> 5 wknd_wday hour_day 2 24
#> 6 hhour_hour wknd_wday 2 2
#> 7 hour_day wknd_wday 24 2
```

### Visualize probability distribution of the harmony pair (wknd_wday, hour_day)

Box plots are drawn across hours of the day faceted by weekend/weekday
for a household in the data `smart_meter10`

in the package. It is
interesting to see the difference in energy consumption in the early
morning hours (1 am - 7 am) for weekdays and weekends. The interquartile
range is the same in these hours, however, median energy consumption for
these hours is higher for weekends compared to weekdays. The
distribution for the rest of the day is not significantly different for
weekdays and weekends, implying their energy behavior after 7 am remains
more or less similar.

```
smart_meter10 %>%
filter(customer_id == 10017936) %>%
prob_plot(
gran1 = "wknd_wday",
gran2 = "hour_day",
response = "general_supply_kwh",
plot_type = "boxplot") +
scale_y_sqrt() + scale_x_discrete(breaks = seq(0, 23, 3))
```

## Example: non-temporal case

`cricket`

data set in the package can be explored by explicitly defining
a hierarchy table as follows:

```
library(tsibble)
#> Warning: package 'tsibble' was built under R version 3.5.2
#>
#> Attaching package: 'tsibble'
#> The following object is masked from 'package:dplyr':
#>
#> id
cricket_tsibble <- cricket %>%
mutate(data_index = row_number()) %>%
as_tsibble(index = data_index)
hierarchy_model <- tibble::tibble(
units = c("index", "ball", "over", "inning", "match"),
convert_fct = c(1, 6, 20, 2, 1)
)
cricket_tsibble %>%
create_gran(
"over_inning",
hierarchy_model
)
#> # A tsibble: 8,560 x 12 [1]
#> season match_id batting_team bowling_team inning over wicket dot_balls
#> <dbl> <dbl> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 2008 2 Chennai Sup… Kings XI Pu… 1 1 0 4
#> 2 2008 2 Chennai Sup… Kings XI Pu… 1 2 0 2
#> 3 2008 2 Chennai Sup… Kings XI Pu… 1 3 1 4
#> 4 2008 2 Chennai Sup… Kings XI Pu… 1 4 0 3
#> 5 2008 2 Chennai Sup… Kings XI Pu… 1 5 0 3
#> 6 2008 2 Chennai Sup… Kings XI Pu… 1 6 0 3
#> 7 2008 2 Chennai Sup… Kings XI Pu… 1 7 1 1
#> 8 2008 2 Chennai Sup… Kings XI Pu… 1 8 1 3
#> 9 2008 2 Chennai Sup… Kings XI Pu… 1 9 0 1
#> 10 2008 2 Chennai Sup… Kings XI Pu… 1 10 0 2
#> # … with 8,550 more rows, and 4 more variables: runs_per_over <dbl>,
#> # run_rate <dbl>, data_index <int>, over_inning <fct>
```

#### Visualize granularities for non-temporal data

Letter value plot of total runs per over is shown overs of the innings (x-axis) and innings of the match (facet). It can be observed that there is no clear upward shift in runs in the second innings as compared to the first innings. The variability in runs increases as the teams approach towards the end of the innings, as observed through the longer and more distinct letter values.

```
cricket_tsibble %>%
filter(batting_team %in% c("Mumbai Indians",
"Chennai Super Kings"))%>%
prob_plot("inning", "over",
hierarchy_model,
response = "runs_per_over",
plot_type = "lv")
```

## More information

View the vignette to get started!

This package takes tsibble as the data input. Tsibble provides a data class of tbl_ts to represent tidy temporal data. It consists of a time index, key and other measured variables in a data-centric format, which makes it easier to work with temporal data. To learn more about it, please visit https://tsibble.tidyverts.org/

## Acknowledgements

Thanks to PhD supervisors Prof. Rob J Hyndman, Prof. Dianne Cook and Google Summer of Code 2019 mentor Prof. Antony Unwin for their support and always leading by example. The fine balance of encouraging me to work on my ideas and stepping in to help when I need has made the development of this package a great learning experience for me.

Moreover, I want to thank my cohort at NUMBATS, Monash University, especially Mitchell O’Hara-Wild and Nicholas Spyrison for always lending an ear and sharing their wisdom and experience of developing R packages, with such kindness.

## Reporting and issues

Please submit all bug reports, errors, and feature requests to https://github.com/Sayani07/gravitas/issues

## Functions in gravitas

Name | Description | |

print.gran_advice | Advice summaries for granularities | |

prob_plot | Plotting probability distributions across granularities | |

search_gran | Search for granularities | |

create_gran | Build dynamic temporal granularities | |

smart_meter10 | Smart meter data for ten households | |

cricket | Cricket data set for different seasons of Indian Premier League | |

is_harmony | Check if two temporal granularities are harmonies | |

harmony | Get possible set of harmonies for a given tsibble | |

run_app | Runs a shiny app demonstrating functionalities of gravitas | |

reexports | Objects exported from other packages | |

validate_gran | Validate created granularities with existing columns | |

gran_advice | Advice summaries for granularities | |

gran_obs | Cross tabulation of granularities useful for validating if number of observations are sufficient to compute probability distributions | |

No Results! |

## Vignettes of gravitas

Name | ||

cricket.Rmd | ||

gravitas_vignette.Rmd | ||

No Results! |

## Last month downloads

## Details

Type | Package |

VignetteBuilder | knitr |

BugReports | https://github.com/Sayani07/gravitas/issues |

License | GPL-3 |

URL | https://github.com/Sayani07/gravitas/ |

Encoding | UTF-8 |

LazyData | true |

ByteCompile | true |

RoxygenNote | 7.0.2 |

NeedsCompilation | no |

Packaged | 2020-02-16 13:12:48 UTC; sgup0008 |

Repository | CRAN |

Date/Publication | 2020-02-17 09:20:02 UTC |

suggests | covr , knitr , purrr (>= 0.3.2) , readr , rmarkdown , shinyalert , shinythemes , testthat (>= 2.1.0) , tsibbledata (>= 0.1.0) |

imports | dplyr (>= 0.8.0) , ggplot2 (>= 3.1.1) , ggridges , ineq , lubridate (>= 1.7.4) , lvplot , magrittr , RColorBrewer , rlang (>= 0.3.4) , shiny , stats , stringr (>= 1.4.0) , tibble (>= 2.1.1) , tidyr (>= 0.8.3) , tsibble (>= 0.8.0) , utils |

depends | R (>= 3.1.3) |

Contributors | Di Cook, Rob Hyndman, Antony Unwin |

#### Include our badge in your README

```
[![Rdoc](http://www.rdocumentation.org/badges/version/gravitas)](http://www.rdocumentation.org/packages/gravitas)
```