Estimate the mean of a Poisson distribution, and construct a prediction interval for the next $k$ observations or next set of $k$ sums.
predIntPois(x, k = 1, n.sum = 1, method = "conditional",
pi.type = "two-sided", conf.level = 0.95, round.limits = TRUE)
numeric vector of observations, or an object resulting from a call to an estimating function that assumes a Poisson distribution (i.e., epois or epoisCensored). If x is a numeric vector, missing (NA), undefined (NaN), and infinite (Inf, -Inf) values are allowed but will be removed.
positive integer specifying the number of future observations or sums the prediction interval should contain with confidence level conf.level. The default value is k=1.
positive integer specifying the sample size associated with the k future sums. The default value is n.sum=1 (i.e., individual observations). Note that all future sums must be based on the same sample size.
character string specifying the method to use. The possible values are: "conditional" (based on a conditional distribution; the default), "conditional.approx.normal" (method based on approximating a conditional distribution with the standard normal distribution), "conditional.approx.t" (method based on approximating a conditional distribution with Student's t-distribution), and "normal.approx" (approximate method based on the fact that the mean and variance of a Poisson distribution are the same). See the DETAILS section for more information on these methods. The "conditional" method is only implemented for k=1; when k is bigger than 1, the value of method cannot be "conditional".
character string indicating what kind of prediction interval to compute. The possible values are pi.type="two-sided" (the default), pi.type="lower", and pi.type="upper".
a scalar between 0 and 1 indicating the confidence level of the prediction interval. The default value is conf.level=0.95.
logical scalar indicating whether to round the computed prediction limits to the nearest integer. The default value is round.limits=TRUE.
If x is a numeric vector, predIntPois returns a list of class "estimate" containing the estimated parameter, the prediction interval, and other information. See the help file for estimate.object for details.
If x is the result of calling an estimation function, predIntPois returns a list whose class is the same as x. The list contains the same components as x, as well as a component called interval containing the prediction interval information. If x already has a component called interval, this component is replaced with the prediction interval information.
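A prediction interval for some population is an interval on the real line constructed so that it will contain $k$ future observations or averages from that population with some specified probability $(1-\alpha)100\%$, where $0 < \alpha < 1$ and $k$ is some pre-specified positive integer. The quantity $(1-\alpha)100\%$ is called the confidence coefficient or confidence level associated with the prediction interval.
In the case of a Poisson distribution, we have modified the usual meaning of a prediction interval and instead construct an interval that will contain $k$ future observations or $k$ future sums with a certain confidence level.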
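A prediction interval is a random interval; that is, the lower and/or upper bounds are random variables computed based on sample statistics in the baseline sample. Prior to taking one specific baseline sample, the probability that the prediction interval will contain the next $k$ observations or sums is $(1-\alpha)100\%$. Once a specific baseline sample is taken and the prediction interval based on that sample is computed, the probability that that interval will contain the next $k$ observations or sums is not necessarily $(1-\alpha)100\%$, but it should be close.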
If an experiment is repeated $N$ times, and for each experiment:
1. A sample is taken and a $(1-\alpha)100\%$ prediction interval for $k=1$ future observation is computed, and
2. One future observation is generated and compared to the prediction interval,
then the number of prediction intervals that actually contain the future observation generated in step 2 above is a binomial random variable with parameters size=$N$ and prob=$1-\alpha$.
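If, on the other hand, only one baseline sample is taken and only one prediction interval for $k=1$ future observation is computed, then the number of future observations out of a total of $N$ future observations that will be contained in that one prediction interval is a binomial random variable with parameters size=$N$ and prob=$1-\alpha^*$, where $\alpha^*$ depends on the true population parameters and the computed bounds of the prediction interval.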
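Because of the discrete nature of the Poisson distribution, even if the true mean of the distribution $\lambda$ were known exactly, the actual confidence level associated with a prediction limit will usually not be exactly equal to $(1-\alpha)100\%$. For example, for the Poisson distribution with parameter lambda=2, the interval [0, 4] contains 94.7% of this distribution and the interval [0, 5] contains 98.3% of this distribution. Thus, no interval can contain exactly 95% of this distribution, so it is impossible to construct an exact 95% prediction interval for the next $k=1$ observation for a Poisson distribution with parameter lambda=2.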
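The two coverage figures quoted above can be checked with the Poisson cumulative distribution function in base R. This is only a quick numerical illustration and is not part of predIntPois itself.

```r
# Probability that a Poisson(lambda = 2) random variable falls in [0, 4] and in [0, 5]
lambda <- 2
coverage <- ppois(c(4, 5), lambda = lambda)
names(coverage) <- c("[0, 4]", "[0, 5]")
round(coverage, 3)   # 0.947 and 0.983: no upper bound gives exactly 0.95
```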
The Form of a Poisson Prediction Interval
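Let $\underline{x} = x_1, x_2, \ldots, x_n$ denote a vector of $n$ observations from a Poisson distribution with parameter lambda=$\lambda$. Also, let $X$ denote the sum of these $n$ random variables, i.e., $X = \sum_{i=1}^{n} x_i$. Finally, let $m$ denote the sample size associated with the $k$ future sums (i.e., n.sum=$m$). When $m=1$, each sum is really just a single observation, so in the rest of this help file the term "sums" replaces the phrase "observations or sums".
Let $\underline{y} = y_1, y_2, \ldots, y_m$ denote a vector of $m$ future observations from a Poisson distribution with parameter lambda=$\lambda^*$, and set $Y$ equal to the sum of these $m$ random variables. Then $Y$ has a Poisson distribution with parameter lambda=$m\lambda^*$. We are interested in constructing a prediction limit for the next value of $Y$, or else the next $k$ sums of $m$ Poisson random variables, based on the observed value of $X$ and assuming $\lambda^* = \lambda$.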
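For a Poisson distribution, the form of a two-sided prediction interval is:
$$[m\bar{x} - K, \; m\bar{x} + K] = [cX - K, \; cX + K]$$
where $\bar{x} = X/n$ is the baseline sample mean, $c = m/n$, and $K$ is a constant that depends on the sample size $n$, the confidence level $(1-\alpha)100\%$, the number of future sums $k$, and the sample size associated with the future sums, $m$. Do not confuse the constant $K$ (uppercase) with the number of future sums $k$ (lowercase). The symbol $K$ is used here to be consistent with the notation used for prediction intervals for the normal distribution (see predIntNorm).
Similarly, the form of a one-sided lower prediction interval is:
$$[m\bar{x} - K, \; \infty]$$
and the form of a one-sided upper prediction interval is:
$$[0, \; m\bar{x} + K]$$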
Conditional Distribution (method="conditional")
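Nelson (1970) derives a prediction interval for the case $k=1$ based on the conditional distribution of $Y$ (the sum of the $m$ future observations) given $X+Y$, where $X$ is the sum of the $n$ baseline observations. He notes that the conditional distribution of $Y$ given the quantity $X+Y=w$ is binomial with parameters size=$w$ and prob=$[m\lambda^* / (m\lambda^* + n\lambda)]$, where $\lambda$ and $\lambda^*$ are the means of the baseline and future distributions. When $\lambda^* = \lambda$, this reduces to prob=[m /(m + n)].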
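The binomial form of this conditional distribution can be verified numerically in base R. The sketch below assumes a common mean for the baseline and future observations; the particular values of lambda, n, m, and w are hypothetical and chosen only for illustration.

```r
# Hypothetical values chosen only for illustration
lambda <- 1.8   # common Poisson mean
n <- 20         # baseline sample size
m <- 3          # number of future observations in the sum (n.sum)
w <- 12         # conditioning value of X + Y

y <- 0:w
# P(Y = y | X + Y = w) from the Poisson definitions:
# X ~ Poisson(n * lambda), Y ~ Poisson(m * lambda), X + Y ~ Poisson((n + m) * lambda)
cond_pois <- dpois(y, m * lambda) * dpois(w - y, n * lambda) /
  dpois(w, (n + m) * lambda)

# Binomial probabilities with size = w and prob = m / (m + n)
cond_binom <- dbinom(y, size = w, prob = m / (m + n))

# The two sets of probabilities agree up to floating-point error
max_abs_diff <- max(abs(cond_pois - cond_binom))
max_abs_diff
```

Note that lambda cancels out of the conditional probabilities, which is why the resulting prediction limits depend on the data only through the observed sum.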
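Using the relationship between the binomial and F-distribution (see the explanation of exact confidence intervals in the help file for ebinom), Nelson (1982, p. 203) states that exact two-sided $(1-\alpha)100\%$ prediction limits [LPL, UPL] are the closest integer solutions to the following equations:
$$\frac{m}{LPL + 1} = \frac{n}{X} F(2\,LPL + 2, \; 2X, \; 1 - \alpha/2)$$
$$\frac{UPL}{m} = \frac{X + 1}{n} F(2X + 2, \; 2\,UPL, \; 1 - \alpha/2)$$
where $F(\nu_1, \nu_2, p)$ denotes the $p$'th quantile of the F-distribution with $\nu_1$ and $\nu_2$ degrees of freedom.
If pi.type="lower", $\alpha/2$ is replaced with $\alpha$ in the equation for LPL, and no upper limit is computed.
If pi.type="upper", $\alpha/2$ is replaced with $\alpha$ in the equation for UPL, and LPL is set to 0.
NOTE: This method is not extended to the case $k > 1$.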
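As a rough illustration of the conditional method for a one-sided upper limit, the sketch below solves the upper-limit equation numerically with uniroot and rounds the root to the nearest integer. It assumes the equation takes the form UPL / m = ((X + 1) / n) * F(2X + 2, 2 UPL, 1 - alpha), which follows from the binomial and F-distribution relationship; the variable names and the search interval are hypothetical, and this is not the code used inside predIntPois.

```r
# Inputs matching the first example in this help file:
# n = 20 baseline observations with mean 1.8 (so X = 36), one future observation
X <- 36; n <- 20; m <- 1; alpha <- 0.05

# Difference between the two sides of the assumed upper-limit equation
upl_equation <- function(u) {
  u / m - ((X + 1) / n) * qf(1 - alpha, df1 = 2 * X + 2, df2 = 2 * u)
}

# Find the real-valued solution, then take the closest integer
root <- uniroot(upl_equation, interval = c(0.5, 50))$root
upl <- round(root)
upl
```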
Conditional Distribution Approximation Based on Normal Distribution (method="conditional.approx.normal")
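Cox and Hinkley (1974, p.245) derive an approximate prediction interval for the case $k=1$. Like Nelson (1970), they note that the conditional distribution of $Y$ given the quantity $X+Y=w$ is binomial with parameters size=$w$ and prob=$[m\lambda^* / (m\lambda^* + n\lambda)]$, and that the hypothesis $H_0: \lambda^* = \lambda$ is equivalent to the hypothesis $H_0:$ prob=[m /(m + n)].
Cox and Hinkley (1974, p.245) suggest using the normal approximation to the binomial distribution (in this case, without the continuity correction; see Zar, 2010, pp.534-536 for information on the continuity correction associated with the normal approximation to the binomial distribution). Under the null hypothesis $H_0: \lambda^* = \lambda$, the quantity
$$z = \frac{Y - \dfrac{c(X+Y)}{1+c}}{\sqrt{\dfrac{c(X+Y)}{(1+c)^2}}} \qquad (10)$$
is approximately distributed as a standard normal random variable, where $c = m/n$.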
The Case When k = 1
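When pi.type="two-sided", the prediction limits are computed by solving the equation
$$z^2 \le z_{1-\alpha/2}^2$$
where $z$ is the quantity in Equation (10) and $z_p$ denotes the $p$'th quantile of the standard normal distribution. In this case, Gibbons (1987b) notes that the quantity $K$ in the prediction interval is given by
$$K = \frac{t^2 c}{2} + t c \sqrt{X\left(1 + \frac{1}{c}\right) + \frac{t^2}{4}}$$
where $c = m/n$ and $t = z_{1-\alpha/2}$.
When pi.type="lower" or pi.type="upper", $K$ is computed exactly as above, except that $t$ is set to $t = z_{1-\alpha}$.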
The Case When k > 1
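When $k > 1$, Gibbons (1987b) suggests using the Bonferroni inequality. That is, the value of $K$ is computed exactly as for $k = 1$, except that the Bonferroni value of $t$ is used in place of the usual value of $t$:
When pi.type="two-sided", $t = z_{1-(\alpha/k)/2}$.
When pi.type="lower" or pi.type="upper", $t = z_{1-\alpha/k}$.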
Conditional Distribution Approximation Based on Student's t-Distribution (method="conditional.approx.t")
When method="conditional.approx.t", the exact same procedure is used as when method="conditional.approx.normal", except that the quantity in Equation (10) is assumed to follow a Student's t-distribution with $n-1$ degrees of freedom.
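The following base R sketch shows how an upper prediction limit could be computed under the conditional approximation. It assumes the limit is the larger root of $(Y - cX)^2 = t^2 c (X + Y)$ with $c = m/n$, that a Bonferroni adjustment ($\alpha/k$) is used when $k > 1$, and that the t version uses $n - 1$ degrees of freedom. The function name and its arguments are hypothetical and are not part of EnvStats.

```r
# Sketch of the conditional approximation (upper limit only).
# Hypothetical helper, not the EnvStats implementation.
upper_pl_conditional_approx <- function(X, n, k = 1, m = 1, conf.level = 0.95,
                                        use.t = TRUE) {
  alpha <- 1 - conf.level
  c.ratio <- m / n
  p <- 1 - alpha / k   # Bonferroni adjustment; k = 1 leaves alpha unchanged
  crit <- if (use.t) qt(p, df = n - 1) else qnorm(p)
  # Larger root of (Y - cX)^2 = crit^2 * c * (X + Y), solved for Y
  c.ratio * X + crit^2 * c.ratio / 2 +
    crit * c.ratio * sqrt(X * (1 + 1 / c.ratio) + crit^2 / 4)
}

# 5 detections in n = 16 samples, limit for k = 20 future observations
upl_t <- upper_pl_conditional_approx(X = 5, n = 16, k = 20)
upl_t

# Normal version for X = 36, n = 20, k = 1
upl_z <- upper_pl_conditional_approx(X = 36, n = 20, use.t = FALSE)
upl_z
```

The first call uses the same inputs as the Gibbons et al. (2009) example in the EXAMPLES section, so its unrounded result can be compared with the UPL reported there.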
Normal Approximation (method="normal.approx")
The normal approximation for Poisson prediction limits was given by Nelson (1970; 1982, p.203) and is based on the fact that the mean and variance of a Poisson distribution are the same (Johnson et al., 1992, p.157), and for "large" values of $n$ and $m$, both $X$ and $Y$ are approximately normally distributed.
The Case When k = 1
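The quantity $Y - cX$, where $c = m/n$, is approximately normally distributed with mean 0 and variance
$$Var(Y - cX) = m\lambda + c^2 n\lambda = m\lambda\left(1 + \frac{m}{n}\right)$$
so the quantity
$$z = \frac{Y - cX}{\sqrt{m\bar{x}\left(1 + \dfrac{m}{n}\right)}}$$
is approximately distributed as a standard normal random variable. The function predIntPois, however, assumes this quantity is distributed as approximately a Student's t-distribution with $n-1$ degrees of freedom.
Thus, following the idea of prediction intervals for a normal distribution (see predIntNorm), when pi.type="two-sided", the constant $K$ for a $(1-\alpha)100\%$ prediction interval for the next $k=1$ sum of $m$ observations is computed as:
$$K = t_{n-1, 1-\alpha/2} \sqrt{m\bar{x}\left(1 + \frac{m}{n}\right)}$$
where $t_{\nu, p}$ denotes the $p$'th quantile of Student's t-distribution with $\nu$ degrees of freedom.
Similarly, when pi.type="lower" or pi.type="upper", the constant $K$ is computed as:
$$K = t_{n-1, 1-\alpha} \sqrt{m\bar{x}\left(1 + \frac{m}{n}\right)}$$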
The Case When k > 1
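When $k > 1$, the value of $K$ is computed as for $k = 1$, except that the Bonferroni value of $t$ is used in place of the usual value of $t$:
When pi.type="two-sided", $t = t_{n-1, 1-(\alpha/k)/2}$.
When pi.type="lower" or pi.type="upper", $t = t_{n-1, 1-\alpha/k}$.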
Hahn and Nelson (1973, p.182) discuss another method of computing $K$ when $k > 1$, but this method is not implemented here.
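A minimal base R sketch of the normal approximation for an upper limit is given below. It assumes $K = t \sqrt{m\bar{x}(1 + m/n)}$ with $t$ taken from Student's t-distribution with $n - 1$ degrees of freedom and a Bonferroni adjustment when $k > 1$. The function name and arguments are hypothetical and are not part of EnvStats.

```r
# Sketch of the normal-approximation upper prediction limit.
# Hypothetical helper, not the EnvStats implementation.
upper_pl_normal_approx <- function(xbar, n, k = 1, m = 1, conf.level = 0.95) {
  alpha <- 1 - conf.level
  crit <- qt(1 - alpha / k, df = n - 1)       # Bonferroni-adjusted t quantile
  K <- crit * sqrt(m * xbar * (1 + m / n))    # Poisson variance equals the mean
  m * xbar + K
}

# Baseline mean 1.8 from n = 20 observations, one future observation
upl_norm <- upper_pl_normal_approx(xbar = 1.8, n = 20)
upl_norm
```

Rounding this unrounded value to the nearest integer gives a limit that can be compared with the "normal.approx" result in the EXAMPLES section.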
Cox, D.R., and D.V. Hinkley. (1974). Theoretical Statistics. Chapman and Hall, New York, pp.242--245.
Gibbons, R.D. (1987b). Statistical Models for the Analysis of Volatile Organic Compounds in Waste Disposal Sites. Ground Water 25, 572--580.
Gibbons, R.D., D.K. Bhaumik, and S. Aryal. (2009). Statistical Methods for Groundwater Monitoring, Second Edition. John Wiley & Sons, Hoboken, pp. 72--76.
Hahn, G.J., and W.Q. Meeker. (1991). Statistical Intervals: A Guide for Practitioners. John Wiley and Sons, New York.
Hahn, G., and W. Nelson. (1973). A Survey of Prediction Intervals and Their Applications. Journal of Quality Technology 5, 178--188.
Johnson, N. L., S. Kotz, and A. Kemp. (1992). Univariate Discrete Distributions. Second Edition. John Wiley and Sons, New York, Chapter 4.
Millard, S.P., and N.K. Neerchal. (2001). Environmental Statistics with S-PLUS. CRC Press, Boca Raton.
Miller, R.G. (1981a). Simultaneous Statistical Inference. McGraw-Hill, New York, pp.8, 76--81.
Nelson, W.R. (1970). Confidence Intervals for the Ratio of Two Poisson Means and Poisson Predictor Intervals. IEEE Transactions on Reliability R-19, 42--49.
Nelson, W.R. (1982). Applied Life Data Analysis. John Wiley and Sons, New York, pp.200--204.
Zar, J.H. (2010). Biostatistical Analysis. Fifth Edition. Prentice-Hall, Upper Saddle River, NJ, pp. 585--586.
Poisson, epois, estimate.object, Prediction Intervals, tolIntPois, Estimating Distribution Parameters.
# NOT RUN {
# Generate 20 observations from a Poisson distribution with parameter
# lambda=2. The interval [0, 4] contains 94.7% of this distribution and
# the interval [0,5] contains 98.3% of this distribution. Thus, because
# of the discrete nature of the Poisson distribution, no interval contains
# exactly 95% of this distribution. Use predIntPois to estimate the mean
# parameter of the true distribution, and construct a one-sided upper
# 95% prediction interval for the next single observation from this distribution.
# (Note: the call to set.seed simply allows you to reproduce this example.)
set.seed(250)
dat <- rpois(20, lambda = 2)
predIntPois(dat, pi.type = "upper")
#Results of Distribution Parameter Estimation
#--------------------------------------------
#
#Assumed Distribution: Poisson
#
#Estimated Parameter(s): lambda = 1.8
#
#Estimation Method: mle/mme/mvue
#
#Data: dat
#
#Sample Size: 20
#
#Prediction Interval Method: conditional
#
#Prediction Interval Type: upper
#
#Confidence Level: 95%
#
#Number of Future Observations: 1
#
#Prediction Interval: LPL = 0
# UPL = 5
#----------
# Compare results above with the other approximation methods:
predIntPois(dat, method = "conditional.approx.normal",
pi.type = "upper")$interval$limits
#LPL UPL
# 0 4
predIntPois(dat, method = "conditional.approx.t",
pi.type = "upper")$interval$limits
#LPL UPL
# 0 4
predIntPois(dat, method = "normal.approx",
pi.type = "upper")$interval$limits
#LPL UPL
# 0 4
#Warning message:
#In predIntPois(dat, method = "normal.approx", pi.type = "upper") :
# Estimated value of 'lambda' and/or number of future observations
# is/are probably too small for the normal approximation to work well.
#==========
# Using the same data as in the previous example, compute a one-sided
# upper 95% prediction limit for k=10 future observations.
# Using conditional approximation method based on the normal distribution.
predIntPois(dat, k = 10, method = "conditional.approx.normal",
pi.type = "upper")
#Results of Distribution Parameter Estimation
#--------------------------------------------
#
#Assumed Distribution: Poisson
#
#Estimated Parameter(s): lambda = 1.8
#
#Estimation Method: mle/mme/mvue
#
#Data: dat
#
#Sample Size: 20
#
#Prediction Interval Method: conditional.approx.normal
#
#Prediction Interval Type: upper
#
#Confidence Level: 95%
#
#Number of Future Observations: 10
#
#Prediction Interval: LPL = 0
# UPL = 6
# Using method based on approximating conditional distribution with
# Student's t-distribution
predIntPois(dat, k = 10, method = "conditional.approx.t",
pi.type = "upper")$interval$limits
#LPL UPL
# 0 6
#==========
# Repeat the above example, but set k=5 and n.sum=3. Thus, we want a
# 95% upper prediction limit for the next 5 sets of sums of 3 observations.
predIntPois(dat, k = 5, n.sum = 3, method = "conditional.approx.t",
pi.type = "upper")$interval$limits
#LPL UPL
# 0 12
#==========
# Reproduce Example 3.6 in Gibbons et al. (2009, p. 75)
# A 32-constituent VOC scan was performed for n=16 upgradient
# samples and there were 5 detections out of these 16. We
# want to construct a one-sided upper 95% prediction limit
# for 20 monitoring wells (so k=20 future observations) based
# on these data.
# First we need to create a data set that will yield a mean
# of 5/16 based on a sample size of 16. Any number of data
# sets will do. Here are two possible ones:
dat <- c(rep(1, 5), rep(0, 11))
dat <- c(2, rep(1, 3), rep(0, 12))
# Now call predIntPois. Don't round the limits so we can
# compare to the example in Gibbons et al. (2009).
predIntPois(dat, k = 20, method = "conditional.approx.t",
pi.type = "upper", round.limits = FALSE)
#Results of Distribution Parameter Estimation
#--------------------------------------------
#
#Assumed Distribution: Poisson
#
#Estimated Parameter(s): lambda = 0.3125
#
#Estimation Method: mle/mme/mvue
#
#Data: dat
#
#Sample Size: 16
#
#Prediction Interval Method: conditional.approx.t
#
#Prediction Interval Type: upper
#
#Confidence Level: 95%
#
#Number of Future Observations: 20
#
#Prediction Interval: LPL = 0.000000
# UPL = 2.573258
#==========
# Cleanup
#--------
rm(dat)
# }