betabin: Beta-binomial and chance-corrected beta-binomial models for over-dispersed binomial data

Description

Fits the beta-binomial model and the chance-corrected beta-binomial model to (over-dispersed) binomial data.

Usage

betabin(data, start = c(.5,.5), 
        method = c("duotrio", "threeAFC", "twoAFC", "triangle"),
        vcov = TRUE, corrected = TRUE, gradTol = 1e-4, ...)

## S3 method for class 'betabin':
summary(object, level = 0.95, ...)

Arguments

data

matrix or data.frame with two columns; the first column contains the number of successes and the second the total number of cases. The number of rows should correspond to the number of observations.

start

starting values to be used in the optimization

vcov

logical, should the variance-covariance matrix of the parameters be computed?

method

the sensory discrimination protocol for which d-prime and its standard error should be computed

corrected

logical; should the chance-corrected (TRUE) or the standard (FALSE) beta-binomial model be estimated?

gradTol

a warning is issued if max|gradient| > gradTol, where 'gradient' is the gradient at the parameter values at which the optimizer terminates. This is not used as a termination or convergence criterion during model fitting.

object

an object of class "betabin", i.e. the result of betabin().

level

the confidence level of the confidence intervals computed by the summary method

...

betabin: The only recognized (hidden) argument is doFit (boolean) which by default is TRUE. When FALSE betabin returns an environment which facilitates examination of the like

Value

An object of class betabin with elements:

coefficients

named vector of coefficients

vcov

variance-covariance matrix of the parameter estimates if vcov = TRUE

data

the data supplied to the function

call

the matched call

logLik

the value of the log-likelihood at the MLEs

method

the method used for the fit

convergence

0 indicates convergence. For other error messages, see optim.

message

possible error message - see optim for details

counts

the number of iterations used in the optimization - see optim for details

corrected

is the chance corrected model estimated?

logLikNull

log-likelihood of the binomial model with prop = pGuess

logLikMu

log-likelihood of a binomial model with prop = sum(x)/sum(n)
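
The two reference log-likelihoods, logLikNull and logLikMu, can be illustrated in base R. The following is a minimal sketch under stated assumptions, not the package's own code: it uses the data from the Examples section, takes pGuess = 1/2 (the guessing probability of the duo-trio protocol), and evaluates the binomial log-likelihood with dbinom, which includes the binomial coefficient. Whether betabin includes that constant in the values it returns is not stated on this page, so the numbers may differ from the package's by a constant.

```r
## Data from the Examples section
x <- c(3,2,6,8,3,4,6,0,9,9,0,2,1,2,8,9,5,7)
n <- c(10,9,8,9,8,6,9,10,10,10,9,9,10,10,10,10,9,10)

## Guessing probability for the duo-trio protocol (assumed here)
pGuess <- 1/2

## Binomial model with the success probability fixed at pGuess
logLikNull <- sum(dbinom(x, size = n, prob = pGuess, log = TRUE))

## Binomial model with the pooled proportion sum(x)/sum(n)
p_hat <- sum(x) / sum(n)
logLikMu <- sum(dbinom(x, size = n, prob = p_hat, log = TRUE))

c(logLikNull = logLikNull, logLikMu = logLikMu)
```

Because the pooled proportion maximizes the binomial likelihood, logLikMu is never smaller than logLikNull. Twice the difference between the fitted model's logLik and either of these values gives a likelihood ratio statistic of the kind described in Details, provided all three are computed with the same convention for constants.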

Details

The beta-binomial models are parameterized in terms of mu and gamma, where mu corresponds to a probability parameter and gamma measures over-dispersion. Both parameters are restricted to the interval (0, 1). The parameters of the standard (i.e. corrected = FALSE) beta-binomial model refer to the mean (i.e. probability) and dispersion on the scale of the observations, i.e. on the scale where we talk of a probability of a correct answer (Pc). The parameters of the chance-corrected (i.e. corrected = TRUE) beta-binomial model refer to the mean and dispersion on the scale of the "probability of discrimination" (Pd). The mean parameter (mu) is therefore restricted to the interval from zero to one in both models, but its interpretation differs between them.

The summary method uses the estimate of mu to infer the parameters of the sensory experiment: Pc, Pd and d-prime. These are restricted to their allowed ranges, e.g. Pc is always at least as large as the guessing probability.

Confidence intervals are computed as Wald (normal-based) intervals on the mu-scale, and the confidence limits are subsequently transformed to the Pc, Pd and d-prime scales. Confidence limits are restricted to the allowed ranges of the parameters; for example, no confidence limit will be less than zero.

Standard errors, and therefore also confidence intervals, are only available if the parameters are not at the boundary of their allowed range (parameter space). If parameters are close to the boundaries of their allowed range, standard errors, and also confidence intervals, may be misleading. The likelihood ratio tests are more accurate. More accurate confidence intervals such as profile likelihood intervals may be implemented in the future.

The summary method provides a likelihood ratio test of over-dispersion on one degree of freedom and a likelihood ratio test of association (i.e. where the null hypothesis is "no difference" and the alternative hypothesis is "any difference") on two degrees of freedom (chi-square tests). Since the gamma parameter is tested on the boundary of the parameter space, the correct degrees of freedom for the first test is probably 1/2 rather than one, or somewhere in between, and the latter test is probably also on less than two degrees of freedom. Research is needed to determine the appropriate number of degrees of freedom to use in each case. The choices used here are believed to be conservative, so the stated p-values are probably a little too large.

The log-likelihood of the standard beta-binomial model is

$$\ell(\alpha, \beta; x, n) = - N \log Beta(\alpha, \beta) + \sum_{j=1}^N \log Beta(\alpha + x_j, \beta - x_j + n_j)$$

and the log-likelihood of the chance-corrected beta-binomial model is

$$\ell(\alpha, \beta; x, n) = - N \log Beta(\alpha, \beta) + \sum_{j=1}^N \log \left\{ \sum_{i=0}^{x_j} \binom{x_j}{i} (1-p_g)^{n_j-x_j+i} p_g^{x_j-i} Beta(\alpha + i, n_j - x_j + \beta) \right\}$$

where $\mu = \alpha/(\alpha + \beta)$, $\gamma = 1/(\alpha + \beta + 1)$, $Beta$ is the Beta function, cf. beta, $N$ is the number of independent binomial observations, i.e. the number of rows in data, and $p_g$ is the guessing probability, pGuess.

The variance-covariance matrix (and standard errors) is based on the inverted Hessian at the optimum. The Hessian is obtained with the hessian function from the numDeriv package. The gradient at the optimum is evaluated with the gradient function from the numDeriv package.

The bounded optimization is performed with the "L-BFGS-B" optimizer in optim.

The following additional methods are implemented for objects of class betabin: print, vcov and logLik.
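
To make the parameterization and the two log-likelihoods concrete, here is a minimal base-R sketch. It is not the package's implementation: the helper names (to_alpha_beta, log_sum_exp, ll_standard, ll_corrected) are hypothetical, the optimizer bounds of 1e-4 are an arbitrary choice, and the inner sum of the chance-corrected model is taken over i = 0, ..., x_j, which is what the binomial expansion of (p_g + (1 - p_g) Pd)^x_j gives. As in the formulas, the binomial coefficients choose(n_j, x_j) are omitted.

```r
## Data from the Examples section
x <- c(3,2,6,8,3,4,6,0,9,9,0,2,1,2,8,9,5,7)
n <- c(10,9,8,9,8,6,9,10,10,10,9,9,10,10,10,10,9,10)

## Map (mu, gamma) to (alpha, beta) using
## mu = alpha/(alpha + beta) and gamma = 1/(alpha + beta + 1)
to_alpha_beta <- function(mu, gamma) {
  s <- 1 / gamma - 1                      # s = alpha + beta
  c(alpha = mu * s, beta = (1 - mu) * s)
}

## Numerically stable log(sum(exp(v)))
log_sum_exp <- function(v) {
  m <- max(v)
  m + log(sum(exp(v - m)))
}

## Standard beta-binomial log-likelihood; par = c(mu, gamma)
ll_standard <- function(par, x, n) {
  ab <- to_alpha_beta(par[1], par[2])
  -length(x) * lbeta(ab[["alpha"]], ab[["beta"]]) +
    sum(lbeta(ab[["alpha"]] + x, ab[["beta"]] - x + n))
}

## Chance-corrected log-likelihood with guessing probability pg in (0, 1)
ll_corrected <- function(par, x, n, pg) {
  ab <- to_alpha_beta(par[1], par[2])
  inner <- mapply(function(xj, nj) {
    i <- 0:xj                             # inner sum runs over i = 0, ..., x_j
    log_sum_exp(lchoose(xj, i) +
                  (nj - xj + i) * log(1 - pg) +
                  (xj - i) * log(pg) +
                  lbeta(ab[["alpha"]] + i, nj - xj + ab[["beta"]]))
  }, x, n)
  -length(x) * lbeta(ab[["alpha"]], ab[["beta"]]) + sum(inner)
}

## Bounded maximization with L-BFGS-B, here for pg = 1/2 (duo-trio)
fit <- optim(par = c(0.5, 0.5),
             fn = function(p) -ll_corrected(p, x, n, pg = 0.5),
             method = "L-BFGS-B",
             lower = rep(1e-4, 2), upper = rep(1 - 1e-4, 2))
fit$par   # estimates of mu and gamma on the Pd scale
```

Working on the log scale with lbeta and lchoose avoids underflow when gamma is small, i.e. when alpha + beta is large. The estimates from this sketch need not agree to all digits with those of betabin, which may use different bounds and starting behaviour.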

References

Brockhoff, P.B. (2003). The statistical power of replications in difference tests. Food Quality and Preference, 14, pp. 405--417.

Examples


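library(sensR)

## Create data: x successes out of n trials for each of 18 observations
x <- c(3,2,6,8,3,4,6,0,9,9,0,2,1,2,8,9,5,7)
n <- c(10,9,8,9,8,6,9,10,10,10,9,9,10,10,10,10,9,10)
dat <- data.frame(x, n)

## Chance-corrected beta-binomial model (the default) for the duo-trio protocol
(bb <- betabin(dat, method = "duotrio"))
## Standard beta-binomial model; note that this overwrites bb
(bb <- betabin(dat, corrected = FALSE, method = "duotrio"))
## Summary and extractor methods applied to the standard fit
summary(bb)
vcov(bb)
logLik(bb)
AIC(bb)
coef(bb)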
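
The interval construction described in Details (a Wald interval on the mu-scale, clamped to the allowed range and then transformed) can be sketched in base R. This is an illustration only and not the code used by summary: the helper names wald_mu, pd_to_pc and pc_to_pd are hypothetical, the input values mu_hat and se are made-up numbers rather than output from betabin, and the sketch relies on the usual relation Pc = pGuess + (1 - pGuess) * Pd. The transformation to d-prime depends on the protocol's psychometric function and is not shown.

```r
## Wald interval on the mu scale, clamped to [0, 1]
wald_mu <- function(mu, se, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)
  ci <- c(lower = mu - z * se, upper = mu + z * se)
  pmin(pmax(ci, 0), 1)
}

## Pd -> Pc: Pc can never fall below the guessing probability
pd_to_pc <- function(pd, pGuess) pGuess + (1 - pGuess) * pd

## Pc -> Pd: values of Pc below pGuess map to Pd = 0
pc_to_pd <- function(pc, pGuess) pmax((pc - pGuess) / (1 - pGuess), 0)

## Hypothetical inputs for a chance-corrected fit (mu is on the Pd scale)
mu_hat <- 0.30   # illustrative value, not a betabin result
se     <- 0.10   # illustrative value, not a betabin result
pGuess <- 1/2    # duo-trio

ci_pd <- wald_mu(mu_hat, se)       # limits on the Pd scale
ci_pc <- pd_to_pc(ci_pd, pGuess)   # the same limits on the Pc scale
rbind(Pd = ci_pd, Pc = ci_pc)
```

For a standard (corrected = FALSE) fit, mu is on the Pc scale instead, so the limits would be clamped from below at pGuess and mapped to the Pd scale with pc_to_pd.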
