cassandRa (version 0.1.0)

CoverageEstimator: Coverage Estimator, using Chao1 Index, Good-Turing or Binomial depending on what is possible

Description

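Estimates the sample coverage, choosing among a binomial model, the Good-Turing estimate and the Chao1 index according to the sample size and the numbers of singletons and doubletons.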

Usage

CoverageEstimator(x, cutoff = 5, BayesPrior = "Flat")

Arguments

x

A vector of integers, the observed sample counts

cutoff

Sample size at or below which the binomial model is used instead of the Good-Turing or Chao1 estimators. Defaults to 5.

BayesPrior

Prior to use in the binomial model. Either 'Flat' (the default) or 'Jeffereys'.

Value

c_hat, the estimated coverage (i.e. 1 - C_def, where C_def is the coverage deficit).

Details

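Sample coverage is defined as the probability that the next interaction drawn is of a type already seen in the sample. Equivalently, it is one minus the probability that the next interaction drawn is of a type not yet seen.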

If the sample size is at or below the cutoff (5 by default), or if all the observations are singletons, coverage is calculated as the posterior mean of a binomial model. A flat prior is used by default; a Jeffreys prior can be selected instead through the BayesPrior argument.
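The posterior mean mentioned above can be sketched in a few lines of R. This is an illustration only, not the package's code: `posterior_mean`, `k` and `n` are hypothetical names, and the source does not state which count plays the role of the binomial "successes", so `k` is left as an input. The sketch assumes that the flat prior corresponds to Beta(1, 1) and the Jeffreys prior to Beta(0.5, 0.5), which are the standard choices for a binomial proportion.

```r
# Posterior mean of a binomial proportion under a symmetric Beta(a, a) prior.
# k = number of successes, n = number of trials (hypothetical inputs; which
# count the package treats as a "success" is not stated in the source).
posterior_mean <- function(k, n, prior = c("Flat", "Jeffreys")) {
  prior <- match.arg(prior)
  # Assumption: Flat = Beta(1, 1), Jeffreys = Beta(0.5, 0.5)
  a <- if (prior == "Flat") 1 else 0.5
  (k + a) / (n + 2 * a)
}

posterior_mean(0, 4, "Flat")      # (0 + 1) / (4 + 2) = 1/6
posterior_mean(0, 4, "Jeffreys")  # (0 + 0.5) / (4 + 1) = 0.1
```

Both priors keep the estimate strictly between 0 and 1 even when every trial is a failure, which is presumably why a Bayesian estimate is preferred for very small samples.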

If there are singletons but no doubletons, the Good-Turing estimate is used: c_hat = 1 - (f1/n), where f1 is the number of types observed exactly once and n is the total number of observations.
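As a minimal sketch of this estimate (not the package's implementation), the quantities can be computed directly from a count vector. The helper name `good_turing_coverage` is hypothetical, and the sketch assumes that f1 is the number of entries of x equal to 1 and that n is the sum of x.

```r
# Singleton-based coverage estimate: c_hat = 1 - f1 / n
good_turing_coverage <- function(x) {
  n  <- sum(x)        # total number of observations
  f1 <- sum(x == 1)   # number of types seen exactly once (singletons)
  1 - f1 / n
}

# Four types with counts 4, 3, 1, 1: n = 9, f1 = 2, no doubletons
good_turing_coverage(c(4, 3, 1, 1))  # 1 - 2/9
```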

If there are both singletons and doubletons, the Chao1 index is used, where f2 is the number of types observed exactly twice:

c_hat = 1 -( (f1/n) * ( (f1*(n-1))/((n-1)*(f1+(2*f2))) ) )
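The formula above can be transcribed into R as follows. This is a sketch that follows the expression exactly as documented, not the package's source. `chao1_coverage` is a hypothetical name, and f1, f2 and n are assumed to be the number of singletons, the number of doubletons and the total count.

```r
# Coverage estimate transcribed from the documented formula
chao1_coverage <- function(x) {
  n  <- sum(x)        # total number of observations
  f1 <- sum(x == 1)   # types seen exactly once
  f2 <- sum(x == 2)   # types seen exactly twice
  1 - (f1 / n) * ((f1 * (n - 1)) / ((n - 1) * (f1 + 2 * f2)))
}

# Counts 5, 2, 2, 1, 1, 1: n = 12, f1 = 3, f2 = 2
chao1_coverage(c(5, 2, 2, 1, 1, 1))  # 1 - (3/12) * (3/7) = 25/28
```

As the formula is written, the (n-1) factors in the numerator and denominator cancel, so the correction term reduces to f1/(f1 + 2*f2). With f2 = 0 that term equals 1 and the expression coincides with 1 - (f1/n).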