growfunctions (version 0.12)

cluster_plot: Plot estimated functions for experimental units faceted by cluster versus data to assess fit.

Description

Takes as input the object returned by gpdpgrow() or gmrfdpgrow().

Usage

cluster_plot(object, N_clusters = NULL, time_points = NULL, units_name = "unit", units_label = NULL, date_field = NULL, x.axis.label = NULL, y.axis.label = NULL, smoother = TRUE, sample_rate = 1, single_unit = FALSE, credible = FALSE, num_plot = NULL)

Arguments

object
A gpdpgrow or gmrfdpgrow object.
N_clusters
The number of largest clusters (ranked by membership size) to plot. Defaults to all clusters.
time_points
A vector of common time points at which the collection of functions was observed (possibly with intermittent missingness). Its length should equal the number of columns in the data matrix, y. Defaults to time_points = 1:ncol(y).
units_name
The plot label for observation units. Defaults to units_name = "unit".
units_label
A vector of labels to apply to the observation units, with length equal to the number of unique units. Defaults to sequential numeric values in the order the units are input with the data, y.
date_field
A vector of Date values for labeling the x-axis tick marks. Defaults to 1:T, where T is the number of time points.
x.axis.label
Text label for x-axis. Defaults to "time".
y.axis.label
Text label for y-axis. Defaults to "function values".
smoother
A scalar boolean indicating whether to co-plot a smoother line through the functions in each cluster. Defaults to smoother = TRUE.
sample_rate
A numeric value in (0,1] giving the proportion of functions to randomly sample within each cluster to reduce over-plotting. Defaults to 1.
single_unit
A scalar boolean indicating whether to plot the fitted vs data curve for only a single experimental unit (versus a random sample of 6). Defaults to single_unit = FALSE.
credible
A scalar boolean indicating whether to plot 95 percent credible intervals for estimated functions, bb, when plotting fitted functions versus data. Defaults to credible = FALSE.
num_plot
A scalar integer indicating how many randomly-selected functions to plot (each in its own plot panel) in the plot of functions versus the observed time series in the case that single_unit == TRUE. Defaults to num_plot = 6.
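
To make the N_clusters and sample_rate descriptions concrete, the base R sketch below mimics the kind of selection they describe: keep the largest clusters by membership, then randomly retain a proportion of units within each. It is an illustration on made-up data, not the growfunctions implementation; the toy cluster labels and the rounding rule (ceiling, with at least one unit kept) are assumptions.

```r
## Illustrative sketch in base R; NOT the growfunctions implementation.
## All data and variable names below are made up for illustration.
set.seed(123)

## toy cluster labels for 20 units in 3 clusters of sizes 10, 6 and 4
cluster <- rep(c("a", "b", "c"), times = c(10, 6, 4))
units   <- seq_along(cluster)

## N_clusters: keep the 2 clusters with the most members
N_clusters <- 2
sizes      <- sort(table(cluster), decreasing = TRUE)
top        <- names(sizes)[seq_len(N_clusters)]

## sample_rate: randomly keep half of the units within each retained cluster
## (assumed rounding rule: ceiling, and never fewer than one unit)
sample_rate <- 0.5
in_top      <- cluster %in% top
by_cluster  <- split(units[in_top], cluster[in_top])
keep        <- lapply(by_cluster, function(u) {
  n_keep <- max(1, ceiling(sample_rate * length(u)))
  u[sample.int(length(u), n_keep)]
})

## number of units retained per cluster: a = 5, b = 3
sapply(keep, length)
```

With sample_rate = 1 every unit in the retained clusters is kept, which matches the documented default of plotting all functions.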

Value

A list object containing the plot of estimated functions, faceted by cluster, and the associated data.frame object.
p.cluster
A ggplot2 plot object
dat.cluster
A data.frame object used to generate p.cluster.

See Also

gpdpgrow, gmrfdpgrow

Examples

{
library(growfunctions)

## load the monthly employment count data for a collection of 
## U.S. states from the Current 
## Population Survey (cps)
data(cps)
## subselect the columns of N x T, y, associated with 
## the years 2008 - 2013
## to examine the state level employment levels 
## during the "great recession"
y_short             <- cps$y[,(cps$yr_label %in% c(2008:2013))]

## Run the DP mixture of iGMRFs to estimate posterior 
## distributions for model parameters
## Under default RW2(kappa) = order 2 trend 
## precision term
res_gmrf            <- gmrfdpgrow(y = y_short, 
                                     n.iter = 40, 
                                     n.burn = 20, 
                                     n.thin = 1) 
                                     
## Produce 2 plots: 1. estimated functions faceted by cluster;
## 2. fitted functions versus data for a group of
## randomly-selected experimental units.
fit_plots_gmrf      <- cluster_plot( object = res_gmrf, 
                                     units_name = "state", 
                                     units_label = cps$st, 
                                     single_unit = FALSE, 
                                     credible = TRUE )   
}
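
As a possible follow-up to the example, the returned list can be inspected through the two elements named in the Value section, p.cluster and dat.cluster. The sketch below assumes fit_plots_gmrf was created by the code above and does nothing otherwise; the has_fit flag is a hypothetical helper introduced only for this illustration.

```r
## Continues from the example above; assumes fit_plots_gmrf exists.
## has_fit is a hypothetical guard so the block is safe to run on its own.
has_fit <- exists("fit_plots_gmrf")

if (has_fit) {
  ## render the ggplot2 object of estimated functions faceted by cluster
  print(fit_plots_gmrf$p.cluster)

  ## inspect the data.frame that was used to build the plot
  str(fit_plots_gmrf$dat.cluster)
}
```

Because p.cluster is a ggplot2 object, it can typically be customized further with the usual ggplot2 layers or saved with ggplot2::ggsave().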