SimMultiCorrData (version 0.2.0)

plot_cdf: Plot Theoretical Power Method Cumulative Distribution Function for Continuous Variables

Description

This plots the theoretical power method cumulative distribution function: $$F_{p(Z)}(p(z)) = F_{p(Z)}(p(z), F_Z(z)),$$ as given in Headrick & Kowalchuk (2007, 10.1080/10629360600605065). It is a parametric plot with \(sigma * y + mu\), where \(y = p(z)\), on the x-axis and \(F_Z(z)\) on the y-axis, where \(z\) is a vector of \(n\) random standard normal numbers (generated with a seed set by the user). Given a vector of polynomial transformation constants, the function generates \(sigma * y + mu\) and calculates the theoretical cumulative probabilities using \(F_{p(Z)}(p(z), F_Z(z))\). If calc_cprob = TRUE, the cumulative probability up to \(delta = sigma * y + mu\) is calculated (see cdf_prob), the corresponding region on the plot is filled, and a dashed horizontal line is drawn at \(F_{p(Z)}(delta)\). The cumulative probability is printed above the line. The function returns a ggplot2 object so the user can modify it as necessary. The graph parameters (i.e. title, color, fill, hline) are ggplot2 parameters. It works for valid or invalid power method pdfs.
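
As a rough illustration of the parametric construction described above, the following base R sketch computes the plotted points directly. The helper name power_cdf_points is hypothetical (it is not part of SimMultiCorrData), and the constants are the trivial ones c = (0, 1, 0, 0), for which p(z) = z and the curve reduces to the standard normal cdf. In practice the constants would come from find_constants.

```r
# Hypothetical helper (not part of SimMultiCorrData): compute the points of
# the parametric cdf plot, (sigma * p(z) + mu, F_Z(z)).
power_cdf_points <- function(c, mu = 0, sigma = 1, n = 10000, seed = 1234) {
  set.seed(seed)
  z <- sort(rnorm(n))                        # standard normal draws, sorted for plotting
  # p(z) = c0 + c1 * z + c2 * z^2 + ... (4 constants for Fleishman, 6 for Polynomial)
  powers <- outer(z, seq_along(c) - 1, `^`)  # columns: z^0, z^1, z^2, ...
  y <- as.vector(powers %*% c)
  data.frame(x = sigma * y + mu,             # x-axis: sigma * y + mu
             cprob = pnorm(z))               # y-axis: F_Z(z)
}

# Trivial constants (an assumption for illustration): p(z) = z
pts <- power_cdf_points(c = c(0, 1, 0, 0), n = 1000)
plot(pts$x, pts$cprob, type = "l", xlab = "y", ylab = "Probability")
```

Because the same z drives both coordinates, no inversion of p is needed to draw the curve. This is also why the plot can be produced even when p is not monotone (an invalid power method pdf).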

Usage

plot_cdf(c = NULL, method = c("Fleishman", "Polynomial"), mu = 0,
  sigma = 1, title = "Cumulative Distribution Function", ylower = NULL,
  yupper = NULL, calc_cprob = FALSE, delta = 5, color = "dark blue",
  fill = "blue", hline = "dark green", n = 10000, seed = 1234,
  text.size = 11, title.text.size = 15, axis.text.size = 10,
  axis.title.size = 13)

Arguments

c

a vector of constants c0, c1, c2, c3 (if method = "Fleishman") or c0, c1, c2, c3, c4, c5 (if method = "Polynomial"), like that returned by find_constants

method

the method used to generate the continuous variable \(y = p(z)\). "Fleishman" uses Fleishman's third-order polynomial transformation and "Polynomial" uses Headrick's fifth-order transformation.

mu

mean for the continuous variable (default = 0)

sigma

standard deviation for the continuous variable (default = 1)

title

the title for the graph (default = "Cumulative Distribution Function")

ylower

the lower y value to use in the plot (default = NULL, uses minimum simulated y value)

yupper

the upper y value (default = NULL, uses maximum simulated y value)

calc_cprob

if TRUE (default = FALSE), cdf_prob is used to find the cumulative probability up to \(delta = sigma * y + mu\); the corresponding region on the plot is filled and a dashed horizontal line is drawn at \(F_{p(Z)}(delta)\)

delta

the value \(sigma * y + mu\), where \(y = p(z)\), at which to evaluate the cumulative probability (default = 5)

color

the line color for the cdf (default = "dark blue")

fill

the fill color if calc_cprob = TRUE (default = "blue")

hline

the color of the dashed horizontal line drawn at the cumulative probability for delta if calc_cprob = TRUE (default = "dark green")

n

the number of random standard normal numbers to use in generating \(y = p(z)\) (default = 10000)

seed

the seed value for random number generation (default = 1234)

text.size

the size of the text displaying the cumulative probability up to delta if calc_cprob = TRUE

title.text.size

the size of the plot title

axis.text.size

the size of the axes text (tick labels)

axis.title.size

the size of the axes titles
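
To illustrate what the calc_cprob and delta arguments compute, here is a minimal base R sketch: it solves sigma * p(z) + mu = delta for z and returns F_Z(z) = pnorm(z). The helper cprob_up_to is hypothetical, and uniroot is a stand-in for whatever solver cdf_prob uses internally. The sketch assumes p is strictly increasing on the search interval (as for a valid power method pdf), so the root is unique.

```r
# Hypothetical helper (not part of SimMultiCorrData): cumulative probability
# up to delta, assuming p(z) is strictly increasing on `interval`.
cprob_up_to <- function(c, delta, mu = 0, sigma = 1, interval = c(-10, 10)) {
  # p(z) = c0 + c1 * z + c2 * z^2 + ... evaluated at a single z
  p <- function(z) sum(c * z^(seq_along(c) - 1))
  # Find z such that sigma * p(z) + mu = delta
  root <- uniroot(function(z) sigma * p(z) + mu - delta,
                  interval = interval, tol = 1e-10)$root
  pnorm(root)                                # F_Z(z) at the solution
}

# Trivial constants (p(z) = z): the result should match pnorm(1)
cprob_up_to(c = c(0, 1, 0, 0), delta = 1)
```

uniroot signals an error when delta cannot be reached inside the search interval, so the interval may need widening for constants with a small slope.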

Value

A ggplot2 object.

References

Fleishman AI (1978). A Method for Simulating Non-normal Distributions. Psychometrika, 43, 521-532. 10.1007/BF02293811.

Headrick TC (2002). Fast Fifth-order Polynomial Transforms for Generating Univariate and Multivariate Non-normal Distributions. Computational Statistics & Data Analysis, 40(4), 685-711. 10.1016/S0167-9473(02)00072-5.

Headrick TC (2004). On Polynomial Transformations for Simulating Multivariate Nonnormal Distributions. Journal of Modern Applied Statistical Methods, 3(1), 65-71. 10.22237/jmasm/1083370080.

Headrick TC, Kowalchuk RK (2007). The Power Method Transformation: Its Probability Density Function, Distribution Function, and Its Further Use for Fitting Data. Journal of Statistical Computation and Simulation, 77, 229-249. 10.1080/10629360600605065.

Headrick TC, Sawilowsky SS (1999). Simulating Correlated Non-normal Distributions: Extending the Fleishman Power Method. Psychometrika, 64, 25-35. 10.1007/BF02294317.

Headrick TC, Sheng Y, Hodis FA (2007). Numerical Computing and Graphics for the Power Method Transformation Using Mathematica. Journal of Statistical Software, 19(3), 1-17. 10.18637/jss.v019.i03.

Wickham H. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2009.

See Also

find_constants, cdf_prob, ggplot, geom_line, geom_hline, geom_area

Examples

# NOT RUN {
# Logistic Distribution: mean = 0, sigma = 1

# Find standardized cumulants
stcum <- calc_theory(Dist = "Logistic", params = c(0, 1))

# Find constants without the sixth cumulant correction
# (invalid power method pdf)
con1 <- find_constants(method = "Polynomial", skews = stcum[3],
                      skurts = stcum[4], fifths = stcum[5],
                      sixths = stcum[6], n = 25, seed = 1234)

# Plot cdf with cumulative probability calculated up to delta = 5
plot_cdf(c = con1$constants, method = "Polynomial",
         title = "Invalid Logistic CDF", calc_cprob = TRUE, delta = 5)

# Find constants with the sixth cumulant correction
# (valid power method pdf)
con2 <- find_constants(method = "Polynomial", skews = stcum[3],
                      skurts = stcum[4], fifths = stcum[5],
                      sixths = stcum[6], Six = seq(1.5, 2, 0.05))

# Plot cdf with cumulative probability calculated up to delta = 5
plot_cdf(c = con2$constants, method = "Polynomial",
         title = "Valid Logistic CDF", calc_cprob = TRUE, delta = 5)
# }