fect (version 2.0.5)

esplot: Event Study Visualization

Description

Visualize dynamic treatment effects as an event study plot. The function offers flexible display of estimates, confidence intervals, and annotations. It accepts a data.frame or a `did_wrapper` object, computes confidence intervals from standard errors when the interval columns are missing, and draws either a connected (line/ribbon) plot or a point-range plot.

Usage

esplot(data, Period = NULL, Estimate = "ATT", SE = NULL,
       CI.lower = "CI.lower", CI.upper = "CI.upper", Count = NULL,
       proportion = 0.3, est.lwidth = NULL, est.pointsize = NULL,
       show.points = FALSE, fill.gap = TRUE, start0 = FALSE,
       only.pre = FALSE, only.post = FALSE, show.count = NULL,
       stats = NULL, stats.labs = NULL, highlight.periods = NULL,
       highlight.colors = NULL, lcolor = NULL, lwidth = NULL,
       ltype = c("solid", "solid"), connected = FALSE, ci.outline = FALSE,
       main = NULL, xlim = NULL, ylim = NULL, xlab = NULL, ylab = NULL,
       gridOff = FALSE, stats.pos = NULL, theme.bw = TRUE,
       cex.main = NULL, cex.axis = NULL, cex.lab = NULL,
       cex.text = NULL, axis.adjust = FALSE, color = "#000000",
       count.color = "gray70", count.alpha = 0.4,
       count.outline.color = "grey69")

Value

p

A ggplot object representing the event study plot.

Arguments

data

The input data for the event study plot. Can be a data.frame or an object of class did_wrapper (in which case, the est.att component will be used).

Period

The name of the column in data representing the relative time period. If NULL (the default), the function will attempt to automatically identify the period column from common names (e.g., 'time', 'Time', 'period', 'Period', 'event.time', 'event_time', 'rel_time') or use numeric rownames if available.

Estimate

The name of the column in data containing the point estimates (e.g., Average Treatment Effect on the Treated). Default is "ATT".

SE

The name of the column in data containing the standard errors. If columns for CI.lower and CI.upper (as specified by their respective arguments, defaulting to "CI.lower" and "CI.upper") are not found in data, and an SE column is provided and found, 95% confidence intervals will be calculated using Estimate +/- 1.96 * SE. Default is NULL.

CI.lower

The name of the column in data for the lower bound of the confidence interval. Default is "CI.lower". If this column is not found, it may be calculated from Estimate and SE (if SE is provided and found in data).

CI.upper

The name of the column in data for the upper bound of the confidence interval. Default is "CI.upper". If this column is not found, it may be calculated from Estimate and SE (if SE is provided and found in data).
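
The fallback described for SE, CI.lower, and CI.upper can be reproduced by hand. The following is a minimal base R sketch of the documented formula, not the package's internal code; the data frame `est` and its values are hypothetical.

```r
# Hypothetical estimates without precomputed interval columns
est <- data.frame(
  time = -2:2,
  ATT  = c(0.05, -0.02, 0.00, 0.60, 1.10),
  SE   = c(0.10, 0.12, 0.11, 0.15, 0.20)
)

# Add 95% bounds only when the CI columns are missing and SE is available
has_ci <- all(c("CI.lower", "CI.upper") %in% names(est))
if (!has_ci && "SE" %in% names(est)) {
  est$CI.lower <- est$ATT - 1.96 * est$SE
  est$CI.upper <- est$ATT + 1.96 * est$SE
}

est
```

According to the descriptions above, calling esplot on such a data frame with SE = "SE" should produce the same intervals.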

Count

Optional. The name of the column in data indicating a count measure (e.g., number of observations) for each time period. Used if show.count = TRUE or for xlim determination based on proportion. Default is NULL.

proportion

Numeric, between 0 and 1. If Count is specified and xlim is not, this proportion is used to determine the default x-axis limits. Periods where the count is below proportion * max(Count) might be excluded from the default view. Default is 0.3.
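
To make the role of proportion concrete, here is a rough base R sketch of one way such a threshold could translate into x-axis limits. The exact rule esplot applies is not spelled out above, so the filtering step is an assumption and the data frame `counts` is hypothetical.

```r
# Hypothetical counts with sparse support at the extreme periods
counts <- data.frame(
  time  = -5:5,
  count = c(20, 80, 120, 150, 140, 130, 125, 110, 90, 60, 20)
)
proportion <- 0.3

# Threshold relative to the best-covered period
threshold <- proportion * max(counts$count)

# Assumed rule: keep periods whose count reaches the threshold
kept <- counts$time[counts$count >= threshold]

# Candidate default x-axis limits under this assumption
default_xlim <- range(kept)
default_xlim
```

Under this assumed rule, periods -5 and 5 fall below the threshold, so the view narrows to the better-supported periods. Passing xlim explicitly bypasses any such filtering.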

est.lwidth

Numeric. The line width for the estimate line (if connected = TRUE) or the main vertical line of the point-range (if connected = FALSE). Default is NULL, which resolves to 0.6 if connected = FALSE (the default for connected), 1.2 if connected = TRUE and show.points = FALSE (the default for show.points), and 0.7 if connected = TRUE and show.points = TRUE.

est.pointsize

Numeric. The size of the points. If connected = TRUE and show.points = TRUE, this is the size of points at integer time periods. If connected = FALSE, this controls the size of the central point in the point-range (via fatten aesthetic). Default is NULL, which resolves to 2 if connected = FALSE (the default for connected), 3 if connected = TRUE and show.points = FALSE (the default for show.points), and 1.2 if connected = TRUE and show.points = TRUE.
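
The NULL defaults for est.lwidth and est.pointsize depend on connected and show.points. The helper below, `resolve_est_defaults`, is a hypothetical illustration that simply encodes the values stated above; it is not a function exported by the package.

```r
# Hypothetical helper mirroring the documented NULL defaults
resolve_est_defaults <- function(connected = FALSE, show.points = FALSE,
                                 est.lwidth = NULL, est.pointsize = NULL) {
  if (is.null(est.lwidth)) {
    est.lwidth <- if (!connected) 0.6 else if (show.points) 0.7 else 1.2
  }
  if (is.null(est.pointsize)) {
    est.pointsize <- if (!connected) 2 else if (show.points) 1.2 else 3
  }
  list(est.lwidth = est.lwidth, est.pointsize = est.pointsize)
}

resolve_est_defaults()                                      # point-range
resolve_est_defaults(connected = TRUE)                      # line and ribbon
resolve_est_defaults(connected = TRUE, show.points = TRUE)  # line with points
```

User-supplied values are returned unchanged, which matches the idea that NULL only acts as a placeholder for the defaults.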

show.points

Logical. If connected = TRUE, whether to display points at integer time periods on top of the line and ribbon. Default is FALSE.

fill.gap

Logical. If connected = FALSE, whether to fill gaps in the sequence of time periods with an estimate and confidence interval of 0. This is useful when some integer time periods are missing from the input data. Default is TRUE.
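
The gap filling described for fill.gap can be pictured as follows. This is a base R sketch of the idea under the assumption that missing integer periods receive rows of zeros; the data frame `est` is hypothetical and the package may implement the step differently.

```r
# Hypothetical estimates where period 0 is missing
est <- data.frame(
  time     = c(-2, -1, 1, 2),
  ATT      = c(0.10, -0.05, 0.70, 1.00),
  CI.lower = c(-0.10, -0.25, 0.45, 0.70),
  CI.upper = c(0.30, 0.15, 0.95, 1.30)
)

# Full integer sequence of periods spanned by the data
full <- data.frame(time = seq(min(est$time), max(est$time), by = 1))

# Left join, then set the estimate and bounds of missing periods to 0
filled <- merge(full, est, by = "time", all.x = TRUE)
for (col in c("ATT", "CI.lower", "CI.upper")) {
  filled[[col]][is.na(filled[[col]])] <- 0
}

filled
```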

start0

Logical. If TRUE, the vertical line separating pre- and post-treatment periods is drawn at x = -0.5, implying period 0 is the first post-treatment period. If FALSE (default), the line is at x = 0.5, implying period 0 is the last pre-treatment period.
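
The effect of start0 on which periods count as post-treatment can be checked with a small sketch. The helpers `separator_x` and `is_post` are hypothetical names used only for illustration; they encode the separator positions stated above.

```r
# Hypothetical helpers encoding the documented separator positions
separator_x <- function(start0 = FALSE) if (start0) -0.5 else 0.5
is_post <- function(period, start0 = FALSE) period > separator_x(start0)

periods <- -3:3

# start0 = FALSE: period 0 is the last pre-treatment period
post_default <- periods[is_post(periods)]

# start0 = TRUE: period 0 is the first post-treatment period
post_start0 <- periods[is_post(periods, start0 = TRUE)]

post_default
post_start0
```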

only.pre

Logical. If TRUE, the plot will only display pre-treatment periods. The vertical separator line will be omitted. Default is FALSE.

only.post

Logical. If TRUE, the plot will only display post-treatment periods. The vertical separator line will be omitted. Default is FALSE.

show.count

Logical or NULL. Whether to display a bar plot of the values from the Count column at the bottom of the main plot. If NULL (default), it's treated as FALSE.

stats

Optional. A numeric vector of statistics (e.g., p-values) to be printed on the plot.

stats.labs

Optional. A character vector of labels corresponding to the stats values. Must be the same length as stats.

highlight.periods

Optional. A numeric vector of time periods to highlight with different colors. For connected = TRUE, these define intervals from period - 0.5 to period + 0.5. For connected = FALSE, individual points at these periods are highlighted.
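
For connected plots, each highlighted period corresponds to a half-unit band on either side of it. A minimal base R sketch of that mapping, with the data frame `bands` as a hypothetical intermediate that the package does not necessarily build:

```r
highlight.periods <- c(-1, 2)
highlight.colors  <- c("orange", "green")

# The two vectors are documented as needing the same length
stopifnot(length(highlight.periods) == length(highlight.colors))

# One band per highlighted period, from period - 0.5 to period + 0.5
bands <- data.frame(
  period = highlight.periods,
  xmin   = highlight.periods - 0.5,
  xmax   = highlight.periods + 0.5,
  color  = highlight.colors
)

bands
```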

highlight.colors

Optional. A character vector of colors corresponding to highlight.periods. If NULL and highlight.periods is provided, default rainbow colors are used. Must be the same length as highlight.periods.

lcolor

Optional. Color(s) for the reference lines. Can be a single color (applied to both horizontal y=0 line and vertical pre/post separator line) or a vector of two colors (first for horizontal, second for vertical). If NULL, defaults to "#aaaaaa" if theme.bw = TRUE, otherwise "white". Default is NULL.

lwidth

Optional. Line width(s) for the reference lines. Can be a single width or a vector of two widths (similar to lcolor). If NULL, defaults to 1.5 if theme.bw = TRUE, otherwise 2. Default is NULL.

ltype

Optional. Linetype(s) for the reference lines. Can be a single linetype (applied to both horizontal y=0 line and vertical pre/post separator line) or a vector of two linetypes (first for horizontal, second for vertical). Default is c("solid", "solid").

connected

Logical. If TRUE, estimates and confidence intervals are plotted as a connected line with a ribbon. This involves interpolating values between observed time points (at 0.5 steps by default) to create a smoother appearance. If FALSE (default), geom_pointrange is used, showing individual estimates and their CIs as points with ranges for each observed time period.
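
The interpolation used for connected plots can be approximated with stats::approx. The sketch below assumes linear interpolation on a 0.5-step grid, which the description above does not state explicitly; the data frame `est` is hypothetical.

```r
# Hypothetical estimates at integer periods
est <- data.frame(
  time     = -2:2,
  ATT      = c(0.0, 0.1, 0.0, 0.6, 1.0),
  CI.lower = c(-0.2, -0.1, -0.2, 0.3, 0.6),
  CI.upper = c(0.2, 0.3, 0.2, 0.9, 1.4)
)

# Half-step grid spanning the observed periods
grid <- seq(min(est$time), max(est$time), by = 0.5)

# Linearly interpolate the estimate and both bounds onto the grid
interp <- data.frame(
  time     = grid,
  ATT      = approx(est$time, est$ATT, xout = grid)$y,
  CI.lower = approx(est$time, est$CI.lower, xout = grid)$y,
  CI.upper = approx(est$time, est$CI.upper, xout = grid)$y
)

interp
```

Values at integer periods are unchanged, and each half-step value is the midpoint of its two neighbors, which is what gives the line and ribbon a continuous appearance.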

ci.outline

Logical. If connected = TRUE, whether to draw an outline around the confidence interval ribbon. The outline color is a slightly darker version of the fill color. Default is FALSE.

main

Optional. The main title for the plot. If NULL (default), a default title "Estimated Dynamic Treatment Effects" is used. If an empty string "" is provided, no title is displayed.

xlim

Optional. A numeric vector of length 2 specifying the x-axis limits (c(min, max)). If NULL (default), limits are determined automatically based on the data range, potentially filtered by proportion if Count is used.

ylim

Optional. A numeric vector of length 2 specifying the y-axis limits (c(min, max)). If NULL (default), limits are determined automatically to encompass all estimates and confidence intervals, with potential expansion if show.count = TRUE.

xlab

Optional. The label for the x-axis. If NULL (default), "Time Relative to Treatment" is used. If an empty string "" is provided, no label is displayed.

ylab

Optional. The label for the y-axis. If NULL (default), "Effect on Y" is used. If an empty string "" is provided, no label is displayed.

gridOff

Logical. Whether to turn off major and minor grid lines. Default is FALSE.

stats.pos

Optional. A numeric vector of length 2 (c(x, y)) specifying the coordinates for the top-left position of the stats text block. If NULL (default), the position is automatically determined.

theme.bw

Logical. Whether to use ggplot2::theme_bw(). Default is TRUE.

cex.main

Optional. Numeric scaling factor for the plot title font size. The base size used by ggplot is 16. Default is NULL (uses base size 16).

cex.axis

Optional. Numeric scaling factor for the axis tick mark labels font size. The base size used by ggplot is 15. Default is NULL (uses base size 15).

cex.lab

Optional. Numeric scaling factor for the axis title (x and y labels) font size. The base size used by ggplot is 15. Default is NULL (uses base size 15).

cex.text

Optional. Numeric scaling factor for annotated text elements (e.g., stats text, count label). The base size used by ggplot for annotation is 5. Default is NULL (uses base size 5).

axis.adjust

Logical. If TRUE, x-axis tick labels are rotated 45 degrees for better readability with many labels. Default is FALSE.

color

Character. The primary color for plotting estimates, points, lines, and confidence interval fills/lines (unless overridden by highlight.colors for specific periods). Default is "#000000" (black).

count.color

Character. The fill color for the bars if show.count = TRUE. Default is "gray70".

count.alpha

Numeric. Alpha transparency for the count bars if show.count = TRUE. Default is 0.4.

count.outline.color

Character. The color for the outline of count bars if show.count = TRUE. Default is "grey69".

Author

Licheng Liu, Yiqing Xu, Ziyi Liu, Zhongyu Yin, Rivka Lipkovitz

Examples

# Basic example with simulated data
set.seed(123)
event_data <- data.frame(
  time = -5:5,
  ATT = cumsum(rnorm(11, 0, 0.2)) + c(rep(0,5), 0, 0.5, 1, 1.2, 1.5, 1.3),
  SE = runif(11, 0.1, 0.3)
)
event_data$CI.lower <- event_data$ATT - 1.96 * event_data$SE
event_data$CI.upper <- event_data$ATT + 1.96 * event_data$SE
event_data$count <- sample(50:150, 11, replace = TRUE)
event_data$count[event_data$time == -5 | event_data$time == 5] <- 20 # for proportion demo

# Default plot (point-range)
esplot(event_data, Period = "time", Estimate = "ATT",
       CI.lower = "CI.lower", CI.upper = "CI.upper")

# # Connected plot with ribbon
# esplot(event_data, Period = "time", Estimate = "ATT",
#        CI.lower = "CI.lower", CI.upper = "CI.upper",
#        connected = TRUE, show.points = TRUE)

# # Connected plot using SE for CI calculation
# event_data_no_ci <- event_data[, c("time", "ATT", "SE", "count")]
# esplot(event_data_no_ci, Period = "time", Estimate = "ATT", SE = "SE",
#        connected = TRUE, ci.outline = TRUE, color = "blue")

# # Show count bars and stats
# esplot(event_data, Period = "time", Estimate = "ATT",
#        CI.lower = "CI.lower", CI.upper = "CI.upper", Count = "count",
#        show.count = TRUE, stats = c(0.03, 0.12), stats.labs = c("P-val Pre", "P-val Post"),
#        main = "Event Study with Counts and Stats", proportion = 0.2)

# # Highlight specific periods (connected)
# esplot(event_data, Period = "time", Estimate = "ATT", SE = "SE",
#        connected = TRUE, highlight.periods = c(-1, 2),
#        highlight.colors = c("orange", "green"),
#        main = "Highlighted Periods (Connected)")

# # Highlight specific periods (point-range)
# esplot(event_data, Period = "time", Estimate = "ATT", SE = "SE",
#        connected = FALSE, highlight.periods = c(-1, 2),
#        highlight.colors = c("orange", "green"),
#        main = "Highlighted Periods (Point-Range)")

# # Only post-treatment period, custom labels
# esplot(event_data, Period = "time", Estimate = "ATT", SE = "SE",
#        only.post = TRUE, xlab = "Years Post-Intervention", ylab = "Impact Metric",
#        start0 = TRUE, color = "darkred", est.lwidth = 1.5)

# Using did_wrapper object (conceptual example, requires `did` package and setup)
# if (requireNamespace("did", quietly = TRUE)) {
#   # Assume `did_out` is an output from `did::att_gt` or similar
#   # and `did_wrapper_obj` is created, e.g.,
#   # did_wrapper_obj <- list(est.att = event_data) # Simplified for example
#   # class(did_wrapper_obj) <- "did_wrapper"
#   # esplot(did_wrapper_obj) # Period is auto-detected as "time"; Estimate defaults to "ATT"
# }