Learn R Programming

abc (version 1.0)

postpr: Estimating posterior model probabilities

Description

This function performs a simple rejection or a multinomial logistic regression to estimate the posterior probability of a model relative to the other models in consideration.

Usage

postpr(target, index, sumstat, tol, subset = NULL, method,
kernel="epanechnikov", numnet = 10, sizenet = 5, lambda =
c(0.0001,0.001,0.01), trace = TRUE, maxit = 500, ...)

Arguments

target
a vector of the observed summary statistics.
index
a vector of model indices. It can be character or numeric and will be coerced to factor. It must have the same length as sumstat to indicate which row of sumstat belong to which model.
sumstat
a vector, matrix or data frame of the simulated summary statistics.
tol
numeric, the required proportion of points nearest the target values (tolerance), or a vector of the desired tolerance values. If a vector is given
subset
a logical expression indicating elements or rows to keep. Missing values in index and/or sumstat are taken as FALSE.
method
a character string indicating the type of simulation required. Possible values are "rejection", "mnlogistic", "neuralnet". See Details.
kernel
a character string specifying the kernel to be used when method is "mnlogistic" or "neuralnet". Defaults to "epanechnikov". See density for details.
numnet
the number of neural networks when method is "neuralnet". It corresponds to the number of times the function nnet is called.
sizenet
the number of units in the hidden layer. Can be zero if there are no skip-layer units.
lambda
a numeric vector or a single value indicating the weight decay when method is "neuralnet". By default, 0.0001, 0.001, or 0.01 is randomly chosen for the each of the networks. See nnet
trace
logical, TRUE switches on tracing the optimization of nnet (applies when method is "neuralnet").
maxit
numeric, the maximum number of iterations. Defaults to 500. See also nnet.
...
other arguments passed on from nnet.

Value

  • postpr returns an object of class "postpr". The function summary is used to obtain the model probabilities and Bayes factors between pairs of models. The returned value is an object of class "postpr", containing the following components:
  • probsa vector of model probabilities.
  • valuesthe vector of model indices in the accepted region using method="rejection".
  • weightsvector of regression weights.
  • ssvalues of the summary statistics in the accepted region, ie corresponding to values.
  • callthe original call
  • na.actionA logical vector indicating the elements or rows that were excluded, including both NA/NaN's and elements/rows selected by subset
  • methodCharacter string indicating the method used, i.e. either "rejection", "mnlogistic" or "neuralnet".
  • modelsa character vector of model names (a priori)

Details

A model with the highest posterior probability is selected from a finite subset of candidate models. Simulations have to be performed with at least two distinct models. When method is "rejection", the posterior probability of a given model is approximated by the proportion of accepted simulations given this model. This approximation holds when the different models are a priori equally likely, and the same number of simulations is performed for each model. When method is "mnlogistic" the posterior model probabilities are estimated using a multinomial logistic regression as implemented in the function multinom from the package nnet. When method is "neuralnet", neural networks are used to predict the probabilities of models based on the observed statistics using nnet. This method can be useful if one wishes to use many statistics. Names for the summary statistics are strongly recommended. Names can be supplied as colnames to sumstat (and target). If no names are supplied S1, S2, ...to summary statistics will be assigned to parameters and the user will be warned.

References

Beaumont, M.A. (2008) Joint determination of topology, divergence time, and immigration in population trees. In Simulation, Genetics, and Human Prehistory (Matsumura, S., Forster, P. and Renfrew, C., eds) McDonald Institute for Archaeological Research

See Also

summary.postpr, abc

Examples

Run this code
data(human)
## five R objects are loaded. See ?human for details.

## the two summary statistics: mean and variance of Tajima's D over 50
## loci
##
par(mfcol = c(1,2))
boxplot(tajima.sim[,1]~models, main=names(tajima.sim)[1])
boxplot(tajima.sim[,2]~models, main=names(tajima.sim)[2])

## model selection with ABC for the three populations, representing
## three continents
##
## in Africa, population expansion is the most supported model
africa <- postpr(tajima.obs["Hausa",], models, tajima.sim, tol=.01, method="mnlogistic")
summary(africa)

## in Europe and Asia, population bottleneck is the most supported model
europe <- postpr(tajima.obs["Italian",], models, tajima.sim, tol=.01, method="mnlogistic")
summary(europe)
asia <- postpr(tajima.obs["Chinese",], models, tajima.sim, tol=.01, method="mnlogistic")
summary(asia)

Run the code above in your browser using DataLab