hyperg(N = NULL, D = NULL, lprob = "logit", iprob = NULL)
One of N and D must be specified.
For lprob, see Links for more choices.
Consider the scenario from dhyper, where there
are $N=m+n$ balls in an urn, where $m$ are white and $n$
are black. A simple random sample (i.e., without replacement) of
$k$ balls is taken.
The response here is the sample proportion of white balls.
In this document, N is $N=m+n$ and D is $m$ (for the number of "defectives", in quality
control terminology, or equivalently, the number of marked individuals).
The parameter to be estimated is the population proportion of
white balls, viz. $prob = m/(m+n)$.
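As a quick check of the setup above, the following base R sketch (not part of the package; the values of m, n and k are illustrative assumptions borrowed from the examples) uses dhyper to confirm that the expected sample proportion of white balls equals $prob = m/(m+n)$.

```r
# Urn with m white and n black balls; k balls are drawn without replacement.
# The values below are illustrative assumptions.
m <- 5
n <- 4
k <- 4
prob <- m / (m + n)   # population proportion of white balls

# Probability of each possible number of white balls in the sample
y <- 0:k
pmf <- dhyper(y, m = m, n = n, k = k)

# Expected value of the sample proportion y / k
mean_prop <- sum((y / k) * pmf)

print(c(prob = prob, mean_prop = mean_prop))
```

The two printed values should agree, which is why the sample proportion is a natural response for estimating prob.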
Depending on which one of N and D is supplied, the
estimate of the other parameter can be obtained from the equation
$prob = m/(m+n)$, or equivalently, prob = D/N. However,
the log-factorials are computed using lgamma, so neither $m$ nor $n$
is restricted to integer values.
Thus if an integer $N$ is to be estimated, it will be necessary to
evaluate the likelihood function at integer values about the estimate,
i.e., at trunc(Nhat) and ceiling(Nhat), where Nhat
is the (real) estimate of $N$.
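The integer search described above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the code used inside hyperg: the helpers lchoose_real and loglik_hyperg are hypothetical names, D is taken as known, and the real-valued estimate Nhat is found here with optimize rather than by fitting a model.

```r
# Hypothetical helper: log of choose(a, b) via lgamma, valid for non-integer a
lchoose_real <- function(a, b) {
  lgamma(a + 1) - lgamma(b + 1) - lgamma(a - b + 1)
}

# Hypothetical helper: hypergeometric log-likelihood as a function of N,
# with D white balls, sample sizes k and observed counts y
loglik_hyperg <- function(N, D, y, k) {
  sum(lchoose_real(D, y) + lchoose_real(N - D, k - y) - lchoose_real(N, k))
}

set.seed(1)
D <- 5                                      # known number of white balls
k <- 4                                      # sample size
y <- rhyper(nn = 100, m = D, n = 4, k = k)  # simulated data, true N = 9

# Real-valued estimate of N; the lower bound keeps every lgamma argument positive
lower <- D + k - min(y)
Nhat <- optimize(loglik_hyperg, interval = c(lower, 50),
                 D = D, y = y, k = k, maximum = TRUE)$maximum

# Compare the likelihood at the integers on either side of Nhat
candidates <- unique(c(trunc(Nhat), ceiling(Nhat)))
ll <- sapply(candidates, loglik_hyperg, D = D, y = y, k = k)
N_int <- candidates[which.max(ll)]

print(c(Nhat = Nhat, N_int = N_int))
```

At integer values of N the lgamma-based log-likelihood agrees with sum(dhyper(..., log = TRUE)), so the comparison between the two candidates is a comparison of ordinary hypergeometric likelihoods.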
dhyper, binomialff.

library(VGAM)  # provides vglm() and hyperg()
nn <- 100
m <- 5 # Number of white balls in the population
k <- rep(4, len = nn) # Sample sizes
n <- 4 # Number of black balls in the population
y <- rhyper(nn = nn, m = m, n = n, k = k)
yprop <- y / k # Sample proportions
# N is unknown, D is known. Both models are equivalent:
fit <- vglm(cbind(y, k - y) ~ 1, hyperg(D = m), trace = TRUE, crit = "c")
fit <- vglm(yprop ~ 1, hyperg(D = m), weights = k, trace = TRUE, crit = "c")
# N is known, D is unknown. Both models are equivalent:
fit <- vglm(cbind(y, k - y) ~ 1, hyperg(N = m + n), trace = TRUE, crit = "l")
fit <- vglm(yprop ~ 1, hyperg(N = m + n), weights = k, trace = TRUE, crit = "l")
coef(fit, matrix = TRUE)
Coef(fit) # Should be close to the true population proportion
unique(m / (m+n)) # The true population proportion
fit@extra
head(fitted(fit))
summary(fit)