ppp_srsc: Calculates the PPP for models of a single reader and a single modality

Description

Calculates the posterior predictive p value (PPP) for the chi-square goodness-of-fit statistic.

Usage

ppp_srsc(
  StanS4class,
  Colour = TRUE,
  dark_theme = TRUE,
  plot = TRUE,
  summary = TRUE,
  replicate.number.from.model.for.each.MCMC.sample = 100
)

Arguments

StanS4class

An S4 object of class stanfitExtended which is an inherited class from the S4 class stanfit. This R object is a fitted model object as a return value of the function fit_Bayesian_FROC().

It can be passed to DrawCurves(), ppp(), and other functions.

Colour

Logical: TRUE or FALSE. Whether the colour of the curves follows the dark theme.

dark_theme

Logical: TRUE or FALSE. Whether the plot uses the dark theme.

plot

Logical. Whether the replicated data are drawn. In the following notation, replicated data are denoted by $y_1,y_2,...,y_N$.

summary

Logical: TRUE or FALSE. Whether to print the verbose summary. If TRUE, the verbose summary is printed in the R console. If FALSE, the output is minimal. (A better name for this argument would have been verbose.)

replicate.number.from.model.for.each.MCMC.sample

A positive integer, representing $J$ in the following notation.

Suppose that $$\theta_1, \theta_2, \theta_3,...,\theta_n$$ are drawn from the posterior $\pi(\theta|D)$ given data $D$.

Let $y_1,y_2,...,y_n$ be samples drawn from

$$y_1 \sim likelihood ( . |\theta_1), $$ $$y_2 \sim likelihood ( . |\theta_2),$$ $$y_3 \sim likelihood ( .|\theta_3),$$ $$...,$$ $$y_n \sim likelihood ( . |\theta_n).$$

Then the returned list contains the following components:

chisq_at_observed_data: $$\chi (D|\theta_1), \chi (D|\theta_2), \chi (D|\theta_3),...,\chi (D|\theta_n),$$
chisq_not_at_observed_data: $$\chi (y_1|\theta_1), \chi (y_2|\theta_2), \chi (y_3|\theta_3),...,\chi (y_n|\theta_n), $$
Logical: A logical vector whose i-th component indicates whether $$\chi (y_i|\theta_i) > \chi (D|\theta_i)$$ is satisfied or not. If TRUE, then the inequality holds.
p.value: From the component Logical, we calculate the so-called posterior predictive p value. Note that the author has reservations about this notion.
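The relation between these components can be sketched in base R as follows. The numbers are made up for illustration, and taking the p value to be the proportion of TRUE components is an assumption based on the description above, not a copy of the package source.

# Hypothetical chi-square values, one per MCMC sample (not real output).
chisq_at_observed_data     <- c(3.1, 4.7, 2.2, 5.0, 3.8)
chisq_not_at_observed_data <- c(4.0, 4.1, 2.9, 3.3, 6.2)

# i-th component: does the replicated statistic exceed the observed one?
Logical <- chisq_not_at_observed_data > chisq_at_observed_data

# Assumed definition: the p value is the proportion of TRUE components.
p.value <- mean(Logical)
p.value  # 0.6 (three of the five comparisons are TRUE)
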

Value

A list containing the p value and the quantities used to calculate it.

Details

In addition, this function plots replicated datasets drawn from the model at each MCMC sample generated by Hamiltonian Monte Carlo (HMC). Using HMC, we can draw MCMC samples of size $n$, say $$\theta_1, \theta_2, \theta_3,...,\theta_n $$, namely, $$\theta_1 \sim \pi(.|D), $$ $$\theta_2 \sim \pi(.|D), $$ $$\theta_3 \sim \pi(.|D),$$ $$...,$$ $$\theta_n \sim \pi(.|D),$$ where $\pi(\theta|D)$ is the posterior given data $D$.

Then, for each $\theta_i$, the function plots $J$ replicated datasets $y_{i,1},y_{i,2},...,y_{i,J}$ drawn from $L ( . |\theta_i)$, where $L ( . |\theta_i)$ is the likelihood at parameter $\theta_i$ and $I$ denotes the number of MCMC samples (written $n$ above). The corresponding values of the statistic $\chi$ defined below are

$$ \chi(y_{1,1}|\theta_1), \chi(y_{1,2}|\theta_1), \chi(y_{1,3}|\theta_1),..., \chi(y_{1,j}|\theta_1),...., \chi(y_{1,J}|\theta_1),$$ $$ \chi(y_{2,1}|\theta_2), \chi(y_{2,2}|\theta_2), \chi(y_{2,3}|\theta_2),..., \chi(y_{2,j}|\theta_2),...., \chi(y_{2,J}|\theta_2),$$ $$ \chi(y_{3,1}|\theta_3), \chi(y_{3,2}|\theta_3), \chi(y_{3,3}|\theta_3),..., \chi(y_{3,j}|\theta_3),...., \chi(y_{3,J}|\theta_3),$$ $$...,$$ $$ \chi(y_{i,1}|\theta_i), \chi(y_{i,2}|\theta_i), \chi(y_{i,3}|\theta_i),..., \chi(y_{i,j}|\theta_i),...., \chi(y_{i,J}|\theta_i),$$ $$...,$$ $$ \chi(y_{I,1}|\theta_I), \chi(y_{I,2}|\theta_I), \chi(y_{I,3}|\theta_I),..., \chi(y_{I,j}|\theta_I),...., \chi(y_{I,J}|\theta_I).$$

Let $ \chi(y|\theta) $ be the chi-square goodness-of-fit statistic of our hierarchical Bayesian model

$$\chi(y|\theta) := \sum_{r=1}^R \sum_{m=1}^M \sum_{c=1}^C ( \frac { ( H_{c,m,r}-N_L\times p_{c,m,r})^2}{N_L\times p_{c,m,r}} + \frac{(F_{c,m,r}-(\lambda_{c} -\lambda_{c+1} )\times N_{L})^2}{(\lambda_{c} -\lambda_{c+1} )\times N_{L} }).$$

and the chi-square goodness-of-fit statistic of our non-hierarchical Bayesian model

$$\chi(y|\theta) := \sum_{c=1}^C \biggl( \frac{( H_{c}-N_L\times p_{c})^2}{N_L\times p_{c}} + \frac{(F_{c}-(\lambda_{c} -\lambda_{c+1} )\times N_{L})^2}{(\lambda_{c} -\lambda_{c+1} )\times N_{L} }\biggr).$$

where a dataset $y$ denotes $ (F_{c,m,r}, H_{c,m,r}) $ in the MRMC case and $ (F_{c}, H_{c}) $ in the single reader and single modality case, and $\theta$ denotes the model parameter.
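As a rough illustration of the single reader and single modality statistic, the following base R sketch evaluates the sum over confidence levels. The helper chisq_srsc() and all numbers are hypothetical; this is not the function the package uses internally, and it assumes that lambda is passed as a vector of length C + 1 with positive differences $\lambda_c - \lambda_{c+1}$.

# Hypothetical helper (not part of the package).
#   hits         : H_c, hits per confidence level c = 1, ..., C
#   false_alarms : F_c, false alarms per confidence level
#   p            : hit rates p_c
#   lambda       : lambda_1, ..., lambda_{C+1} (length C + 1)
#   NL           : number of lesions N_L
chisq_srsc <- function(hits, false_alarms, p, lambda, NL) {
  stopifnot(length(false_alarms) == length(hits),
            length(p) == length(hits),
            length(lambda) == length(hits) + 1)
  expected_hits <- NL * p
  expected_fa   <- (lambda[-length(lambda)] - lambda[-1]) * NL
  sum((hits - expected_hits)^2 / expected_hits +
      (false_alarms - expected_fa)^2 / expected_fa)
}

# Made-up data with C = 3 confidence levels and 100 lesions.
chisq_value <- chisq_srsc(hits         = c(55, 30, 10),
                          false_alarms = c(10, 20, 36),
                          p            = c(0.5, 0.3, 0.1),
                          lambda       = c(0.6, 0.5, 0.3, 0),
                          NL           = 100)
chisq_value

Only two cells deviate from their expected counts here (55 hits against 50 expected, 36 false alarms against 30 expected), so the statistic is the sum of those two contributions.
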

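Then we can calculate the posterior predictive p value for a given dataset $y_0$ as follows.

$$ \int \int I( \chi(y|\theta) > \chi(y_0|\theta) ) f(y|\theta) \pi(\theta|y_0) d \theta d y $$ $$ \approx \frac{1}{I} \sum_{i=1}^I \int I( \chi(y|\theta_i) > \chi(y_0|\theta_i) ) f(y|\theta_i) d y $$ $$ \approx \frac{1}{IJ} \sum_{i=1}^I \sum_{j=1}^J I( \chi(y_{i,j}|\theta_i) > \chi(y_0|\theta_i) ) $$

Here $I(\cdot)$ with an argument is the indicator function, while $I$ and $J$ in the normalizing factors are the number of MCMC samples and the number of replicates per sample.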
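The Monte Carlo approximation can be sketched with a toy model in base R. The Poisson likelihood, the Gamma(1, 1) prior, the data and the names n_theta and n_rep (playing the roles of $I$ and $J$) are all assumptions chosen for illustration; they stand in for the FROC model and are not taken from the package. The estimate is the proportion of pairs $(i, j)$ for which the replicated statistic exceeds the observed one.

set.seed(1)

# Toy stand-in for the FROC model: y ~ Poisson(theta).
y0      <- c(3, 5, 4, 6, 2)   # hypothetical observed data
n_theta <- 200                # number of posterior draws (I)
n_rep   <- 20                 # replicates per posterior draw (J)

# Conjugate Gamma posterior under a Gamma(1, 1) prior (toy assumption).
theta <- rgamma(n_theta, shape = 1 + sum(y0), rate = 1 + length(y0))

# Chi-square discrepancy of a dataset y at parameter th.
chisq <- function(y, th) sum((y - th)^2 / th)

# exceed[i, j] is TRUE when chi(y_ij | theta_i) > chi(y0 | theta_i).
exceed <- matrix(NA, nrow = n_theta, ncol = n_rep)
for (i in seq_len(n_theta)) {
  observed <- chisq(y0, theta[i])
  for (j in seq_len(n_rep)) {
    y_ij <- rpois(length(y0), theta[i])
    exceed[i, j] <- chisq(y_ij, theta[i]) > observed
  }
}

# Proportion of exceedances over all (i, j) pairs.
ppp_estimate <- mean(exceed)
ppp_estimate
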

When we plot these synthesized datasets $y_{i,j}$, we use jitter(), which adds a small amount of noise to avoid overlapping points. For example, jitter(c(1,1,1,1)) returns values such as 1.0161940 1.0175678 0.9862400 0.9986126, which are changed from 1,1,1,1 to be not exactly 1 by adding tiny errors. This program was written on 2019 August 19 to fix the calculation of a previous release. It now calculates the correct p value.
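The effect of jitter() can be checked directly in base R. The seed below is arbitrary, so the values will differ from those quoted above.

set.seed(2019)

x <- jitter(c(1, 1, 1, 1))
x                    # four values close to, but not exactly, 1
range(x - 1)         # size of the added noise
any(duplicated(x))   # checks whether any points still coincide

With the default arguments, the noise for this input is uniform on an interval of half-width 0.02, which is why the values stay close to 1.
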

The author also calculates the PPP for MRMC in the Shiny-based graphical user interface, where quantities such as the number of readers and the number of modalities should be variable so that the corresponding ID vectors are generated automatically.

The reason why the author plots the data drawn from the posterior predictive likelihoods at each MCMC parameter is to check that the program is correct, that is, that the draws are well mixed.

Using this function, users can obtain reliable posterior predictive p values.

We note that the calculation of the posterior predictive p value (PPP) relies on the law of large numbers. Thus, in order to obtain a reliable PPP, we need sufficiently many MCMC samples to approximate the double integral of the PPP. For example, if the number of MCMC samples is small, then R hat is far from 1, and the small sample can also lead to an incorrect p value, which sometimes suggests that the model is correct even though the R hat criterion rejects the MCMC results.
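The dependence on sample size can be illustrated with a small base R simulation. It assumes independent draws and a made-up true value of 0.3, whereas real MCMC draws are correlated, so this is only a sketch of the law of large numbers and not a statement about the package.

set.seed(123)

true_p <- 0.3                       # hypothetical true PPP
sizes  <- c(50, 500, 5000, 50000)   # numbers of simulated indicator draws

# Each estimate is the proportion of TRUE indicators among n independent draws.
estimates <- sapply(sizes, function(n) mean(runif(n) < true_p))

data.frame(n = sizes, estimate = estimates, abs_error = abs(estimates - true_p))

The error of the estimate typically shrinks as n grows, roughly in proportion to one over the square root of n for independent draws.
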

Examples

# NOT RUN {
#========================================================================================
#            1) Create a fitted model object with data named  "d"
#========================================================================================

library(BayesianFROC)

fit <- fit_Bayesian_FROC(
  dataList = d,
  ite      = 222  # kept small to restrict running time; too small for reliable results
)

#========================================================================================
#            2) Calculate p value and meta data
#========================================================================================

ppp <- ppp_srsc(fit)

#========================================================================================
#            3) Extract a p value
#========================================================================================

ppp$p.value

# Revised 2019 August 19
# Revised 2019 Nov 27

# }