cluster_plot: Plot estimated functions for experimental units faceted by cluster versus data to assess fit.

Description

Uses as input the output object from the gpdpgrow() and gmrfdpgrow() functions.

Usage

cluster_plot(object, N_clusters = NULL, time_points = NULL, units_name = "unit", units_label = NULL, date_field = NULL, x.axis.label = NULL, y.axis.label = NULL, smoother = TRUE, sample_rate = 1, single_unit = FALSE, credible = FALSE, num_plot = NULL)

Arguments

object

A gpdpgrow or gmrfdpgrow object.

N_clusters

Denotes the number of largest sized (in terms of membership) clusters to plot. Defaults to all clusters.

time_points

Inputs a vector of common time points at which the collections of functions were observed (with the possibility of intermittent missingness). The length of time_points should be equal to the number of columns in the data matrix, y. Defaults to time_points = 1:ncol(y).

units_name

The plot label for observation units. Defaults to units_name = "unit".

units_label

A vector of labels to apply to the observation units with length equal to the number of unique units. Defaults to sequential numeric values as input with data, y.

date_field

A vector of Date values for labeling the x-axis tick marks. Defaults to 1:T, where T is the number of time points (columns of y).

x.axis.label

Text label for x-axis. Defaults to "time".

y.axis.label

Text label for y-axis. Defaults to "function values".

smoother

A scalar boolean indicating whether to co-plot a smoother line through the functions in each cluster. Defaults to smoother = TRUE.

sample_rate

A numeric value in (0,1] indicating the proportion of functions to randomly sample within each cluster to address over-plotting. Defaults to 1.

single_unit

A scalar boolean indicating whether to plot the fitted vs. data curve for only a single experimental unit (versus a random sample of 6). Defaults to single_unit = FALSE.

credible

A scalar boolean indicating whether to plot 95 percent credible intervals for estimated functions, bb, when plotting fitted functions versus data. Defaults to credible = FALSE.

num_plot

A scalar integer indicating how many randomly-selected functions to plot (each in its own plot panel) in the plot of functions versus the observed time series in the case that single_unit == TRUE. Defaults to num_plot = 6.
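
To make the effect of sample_rate concrete, the following is a minimal base R sketch of within-cluster random subsampling. It is not the package's internal code: the cluster assignments, the rounding rule, and the choice to keep at least one unit per cluster are all assumptions made for illustration.

```r
## Illustrative only: base R, not the growfunctions internals.
set.seed(1)

## Hypothetical cluster assignments: 20 units in 3 clusters of size 10, 6 and 4
cluster     <- rep(c(1, 2, 3), times = c(10, 6, 4))
units       <- seq_along(cluster)
sample_rate <- 0.5   # must lie in (0, 1]

## Within each cluster, keep a random subset of roughly sample_rate of its units.
## Assumption: at least one unit is kept so that no cluster panel is empty.
keep <- unlist(lapply(split(units, cluster), function(idx) {
  n_keep <- max(1, round(sample_rate * length(idx)))
  idx[sample.int(length(idx), n_keep)]
}), use.names = FALSE)

## Number of retained units per cluster
table(cluster[keep])
```

With sample_rate = 1 every unit is retained, which corresponds to the documented default of plotting all functions in each cluster.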

Value

p.cluster: A ggplot2 plot object
dat.cluster: A data.frame object used to generate p.cluster.

Examples

{
library(growfunctions)

## load the monthly employment count data for a collection of 
## U.S. states from the Current 
## Population Survey (cps)
data(cps)
## subselect the columns of N x T, y, associated with 
## the years 2008 - 2013
## to examine the state level employment levels 
## during the "great recession"
y_short             <- cps$y[,(cps$yr_label %in% c(2008:2013))]

## Run the DP mixture of iGMRF's to estimate posterior 
## distributions for model parameters
## Under default RW2(kappa) = order 2 trend 
## precision term
res_gmrf            <- gmrfdpgrow(y = y_short, 
                                     n.iter = 40, 
                                     n.burn = 20, 
                                     n.thin = 1) 
                                     
## Two plots of estimated functions: 1. faceted by cluster;
## 2. fit vs. data for a group of randomly-selected
## experimental units.
fit_plots_gmrf      <- cluster_plot( object = res_gmrf, 
                                     units_name = "state", 
                                     units_label = cps$st, 
                                     single_unit = FALSE, 
                                     credible = TRUE )   
}
