rvalues: R-values

Description

Given data on a collection of units, this function computes r-values, which are percentiles constructed to maximize the agreement between the reported percentiles and the percentiles of the effect of interest. Additional details about r-values are provided below and can also be found in the listed references.

Usage

rvalues(data, family = gaussian, hypers = "estimate", prior = "conjugate",
       alpha.grid = NULL, ngrid = NULL, smooth = "none", control = list())

Arguments

data

A data frame or a matrix with the number of rows equal to the number of sampling units. The first column should contain the main estimates, and the second column should contain the nuisance terms.

family

An argument which determines the sampling distribution; this must be one of family = gaussian, family = tdist, family = binomial, or family = poisson.

hypers

values of the hyperparameters; only meaningful when the conjugate prior is used; if set to "estimate", the hyperparameters are found through maximum likelihood; otherwise, the user should supply a vector of length two.

prior

the form of the prior; either prior="conjugate" or prior="nonparametric".

alpha.grid

a numeric vector of points in (0,1); this grid is used in the discrete approximation of r-values

ngrid

number of grid points for alpha.grid; only relevant when alpha.grid=NULL

smooth

either smooth="none" or smooth takes a value between 0 and 10; this determines the level of smoothing applied to the estimate of \(\lambda(\alpha)\) (see below for the definition of \(\lambda(\alpha)\)); if smooth is given a number, the number is used as the bass argument in supsmu.

control

a list of control parameters for estimation of the prior; only used when the prior is nonparametric

Value

An object of class "rvals" which is a list containing at least the following components:

main

a data frame containing the r-values and the r-value rankings, along with the rankings from several other common procedures

aux

a list containing other extraneous information

rvalues

a vector of r-values

Details

The r-value computation assumes the following two-level sampling model: \(X_i \mid \theta_i, \eta_i \sim p(x \mid \theta_i, \eta_i)\) and \(\theta_i \sim F\), for \(i = 1, \ldots, n\), with parameters of interest \(\theta_i\), effect size estimates \(X_i\), and nuisance terms \(\eta_i\). The form of \(p(x \mid \theta_i, \eta_i)\) is determined by the family argument. When family = gaussian, it is assumed that \(X_i \mid \theta_i, \eta_i \sim N(\theta_i, \eta_i^{2})\), so \(\eta_i\) is the standard deviation of the estimate. When family = binomial, the pair \((X_i, \eta_i)\) represents the number of successes and the number of trials, respectively, and it is assumed that \(X_i \mid \theta_i, \eta_i\) is binomial with \(\eta_i\) trials and success probability \(\theta_i\). When family = poisson, the \(X_i\) should be counts, and it is assumed that \(X_i \mid \theta_i, \eta_i \sim \mathrm{Poisson}(\theta_i \eta_i)\).
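As a rough illustration of the two-column layout that these sampling models imply, the following base-R sketch simulates data for the Gaussian, binomial, and Poisson cases. The column names, sample size, and the distributions used to draw \(\theta_i\) and \(\eta_i\) are assumptions made only for this example; they are not taken from the package.

```r
set.seed(42)
n <- 200  # number of sampling units (arbitrary choice for illustration)

# Gaussian: X_i ~ N(theta_i, eta_i^2); the nuisance term is the standard deviation
theta_g <- rnorm(n)
std_err <- runif(n, 0.5, 2)
gauss_dat <- data.frame(estimate = rnorm(n, mean = theta_g, sd = std_err),
                        std_err  = std_err)

# Binomial: X_i successes out of eta_i trials with success probability theta_i
theta_b <- rbeta(n, 2, 5)
trials  <- sample(10:50, n, replace = TRUE)
binom_dat <- data.frame(successes = rbinom(n, size = trials, prob = theta_b),
                        trials    = trials)

# Poisson: X_i ~ Poisson(theta_i * eta_i), with eta_i acting like an exposure
theta_p  <- rgamma(n, shape = 2, rate = 1)
exposure <- runif(n, 1, 10)
pois_dat <- data.frame(count    = rpois(n, lambda = theta_p * exposure),
                       exposure = exposure)
```

In each data frame the first column holds the main estimates and the second column holds the nuisance terms, which matches the layout described for the data argument.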

The distribution of the effect sizes \(F\) may be a parametric distribution that is conjugate to the corresponding family argument, or it may be estimated nonparametrically. When it is desired that \(F\) be parametric (i.e., prior = "conjugate"), the default is to estimate the hyperparameters (i.e., hypers = "estimate"), but these may be supplied by the user as a vector of length two. To estimate \(F\) nonparametrically, one should use prior = "nonparametric" (see npmle for further details about nonparametric estimation of \(F\)).
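To make the conjugate case concrete, here is a minimal base-R sketch for binomial data, assuming \(F\) is a Beta(a, b) distribution with known hyperparameters. The values of a, b, and the toy data are hypothetical, and the sketch only illustrates the standard Beta-binomial posterior update; it is not the package's internal code.

```r
# Hypothetical hyperparameters of the Beta prior (assumed known here)
a <- 2
b <- 5

# Toy data: successes and trials for three units
x       <- c(3, 10, 0)
ntrials <- c(20, 25, 15)

# Threshold theta_alpha: the upper-alpha quantile of the Beta(a, b) prior
alpha       <- 0.1
theta_alpha <- qbeta(1 - alpha, a, b)

# Conjugacy: theta_i | x_i ~ Beta(a + x_i, b + ntrials_i - x_i), so the
# posterior probability of exceeding theta_alpha is an upper Beta tail
V_alpha <- pbeta(theta_alpha, a + x, b + ntrials - x, lower.tail = FALSE)
V_alpha
```

Units with a larger share of successes receive a larger posterior tail probability, which is the quantity that the r-value definition compares against a threshold.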

The r-value, \(r_i\), assigned to the ith case of interest is determined by \(r_i = \inf\{\alpha \in (0,1) : V_\alpha(X_i,\eta_i) \ge \lambda(\alpha)\}\), where \(V_\alpha(X_i,\eta_i) = P(\theta_i \ge \theta_\alpha \mid X_i,\eta_i)\) is the posterior probability that \(\theta_i\) exceeds the threshold \(\theta_\alpha\), and \(\lambda(\alpha)\) is the upper-\(\alpha\)th quantile associated with the marginal distribution of \(V_\alpha(X_i,\eta_i)\) (i.e., \(P(V_\alpha(X_i,\eta_i) \ge \lambda(\alpha)) = \alpha\)). Similarly, the threshold \(\theta_\alpha\) is the upper-\(\alpha\)th quantile of \(F\) (i.e., \(P(\theta_i \ge \theta_\alpha) = \alpha\)).
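The definition above can be approximated on a grid of \(\alpha\) values. The sketch below does this in base R for the Gaussian model with a normal prior whose mean and standard deviation (mu and tau, both hypothetical) are treated as known, and it estimates \(\lambda(\alpha)\) by an empirical quantile across units. It is meant only to illustrate the definition under these assumptions and may differ from how the package carries out the computation.

```r
set.seed(1)
n   <- 500
mu  <- 0   # assumed prior mean
tau <- 1   # assumed prior standard deviation

# Simulate from the two-level Gaussian model
eta   <- runif(n, 0.5, 2)
theta <- rnorm(n, mu, tau)
x     <- rnorm(n, theta, eta)

# Conjugate normal posterior for each unit
post_var  <- 1 / (1 / tau^2 + 1 / eta^2)
post_mean <- post_var * (mu / tau^2 + x / eta^2)

# Grid of alpha values and the matching prior thresholds theta_alpha
alpha_grid  <- (1:99) / 100
theta_alpha <- qnorm(1 - alpha_grid, mean = mu, sd = tau)

# V[i, j] = P(theta_i >= theta_alpha[j] | x_i, eta_i)
V <- sapply(theta_alpha, function(t) {
  pnorm(t, mean = post_mean, sd = sqrt(post_var), lower.tail = FALSE)
})

# lambda[j]: empirical upper-alpha quantile of column j of V
lambda <- sapply(seq_along(alpha_grid), function(j) {
  quantile(V[, j], probs = 1 - alpha_grid[j], names = FALSE)
})

# r_i: smallest grid value alpha with V[i, alpha] >= lambda(alpha);
# units that never cross the threshold on the grid are assigned 1
above <- V >= matrix(lambda, nrow = n, ncol = length(alpha_grid), byrow = TRUE)
first_hit <- apply(above, 1, function(row) {
  idx <- which(row)
  if (length(idx) > 0) idx[1] else NA_integer_
})
r_sketch <- ifelse(is.na(first_hit), 1, alpha_grid[first_hit])

summary(r_sketch)
```

A finer grid gives a closer approximation to the infimum at the cost of more computation, which is the trade-off that a grid of points in (0,1) controls.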

References

Henderson, N.C. and Newton, M.A. (2015) Making the Cut: Improved Ranking and Selection for Large-Scale Inference. http://arxiv.org/abs/1312.5776

Examples

# NOT RUN {
### Binomial example with Beta prior:
data(fluEnrich)
flu.rvals <- rvalues(fluEnrich, family = binomial)
hist(flu.rvals$rvalues)

### look at the r-values for indices 10 and 2484
fig_indices  <- c(10,2484)
fluEnrich[fig_indices,]

flu.rvals$rvalues[fig_indices]

### Gaussian sampling distribution with nonparametric prior
### (no control argument is supplied, so the default control settings are used)
data(hiv)
hiv.rvals <- rvalues(hiv, prior = "nonparametric")
# }
