
Usage
demp(x, obs, discrete = FALSE, density.arg.list = NULL)
pemp(q, obs, discrete = FALSE,
     prob.method = ifelse(discrete, "emp.probs", "plot.pos"),
     plot.pos.con = 0.375)
qemp(p, obs, discrete = FALSE,
     prob.method = ifelse(discrete, "emp.probs", "plot.pos"),
     plot.pos.con = 0.375)
remp(n, obs)
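Arguments
n: number of random values to return. If length(n) is larger than 1, then length(n) random values are returned.
obs: numeric vector of observations. Missing (NA), undefined (NaN), and infinite (Inf, -Inf) values are allowed but will be removed.
discrete: logical scalar indicating whether the assumed parent distribution of x is discrete (discrete=TRUE) or continuous (discrete=FALSE). The default value is FALSE.
prob.method: character string indicating the method used to compute the empirical probabilities. Possible values are "emp.probs" (empirical probabilities, default if discrete=TRUE) and "plot.pos" (plotting positions, default if discrete=FALSE).
plot.pos.con: plotting position constant. The default value is plot.pos.con=0.375. See the DETAILS section for more information. This argument is ignored if prob.method="emp.probs".
Value
The density (demp), probability (pemp), quantile (qemp), or random sample (remp) for the empirical distribution based on the data contained in the vector obs.
Details
Let $x_1, x_2, \ldots, x_n$ denote the $n$ observations (the elements of the argument obs), and let $x_{(i)}$ denote the $i^{th}$ order statistic, that is, the $i^{th}$ smallest observation, for $i = 1, 2, \ldots, n$.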
Estimating Density
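The function demp computes the empirical probability density function. If the observations are assumed to come from a discrete distribution, the probability density (mass) function is estimated by:
$$\hat{f}(x) = \widehat{Pr}(X = x) = \frac{1}{n} \sum_{i=1}^{n} I_{[x]}(x_i)$$
where $x_1, \ldots, x_n$ are the observations in obs and $I_{[x]}(y)$ equals 1 if $y = x$ and 0 otherwise. That is, the estimated probability of observing the value $x$ is the proportion of observations equal to $x$.
If the observations are assumed to come from a continuous distribution, the function demp calls the R function density to compute the estimated density based on the values specified in the argument obs, and then uses linear interpolation to estimate the density at the values specified in the argument x. See the R help file for density for more information on how the empirical density is computed in the continuous case.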
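The two cases above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the helper names emp_pmf and emp_density are hypothetical, the continuous branch uses the default settings of density, and returning 0 outside the density grid is a choice made here for the sketch.

```r
# Hypothetical helpers illustrating the two density estimators described above.

# Discrete case: proportion of observations equal to each value in x.
emp_pmf <- function(x, obs) {
  obs <- obs[is.finite(obs)]                 # drop NA, NaN, Inf, -Inf
  vapply(x, function(v) mean(obs == v), numeric(1))
}

# Continuous case: kernel density estimate on a grid, followed by linear
# interpolation at the requested points (assumed to be 0 outside the grid).
emp_density <- function(x, obs, ...) {
  obs <- obs[is.finite(obs)]
  d <- stats::density(obs, ...)
  stats::approx(d$x, d$y, xout = x, yleft = 0, yright = 0)$y
}

counts <- c(0, 1, 1, 2, 2, 2, 5)
emp_pmf(c(1, 2, 3), counts)                  # proportions 2/7, 3/7, 0

set.seed(1)
obs <- rnorm(50)
emp_density(c(-1, 0, 1), obs)
```

Extra arguments to emp_density are passed straight to density, which loosely mirrors the role the density.arg.list argument appears to play in demp.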
Estimating Probabilities
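The function pemp computes the estimated cumulative distribution function (cdf), also called the empirical cdf (ecdf). If the observations are assumed to come from a discrete distribution, the value of the cdf evaluated at the $i^{th}$ order statistic is usually estimated by:
$$\hat{F}[x_{(i)}] = \widehat{Pr}(X \le x_{(i)}) = \hat{p}_i = \frac{1}{n} \sum_{j=1}^{n} I_{(-\infty, x_{(i)}]}(x_j)$$
where $I_{(-\infty, x]}(y)$ equals 1 if $y \le x$ and 0 otherwise. That is, the estimate is the proportion of observations less than or equal to $x_{(i)}$. The function pemp uses the above equation to compute the empirical cdf when prob.method="emp.probs".
For any general value of $x$, when the observations are assumed to come from a discrete distribution, the value of the cdf is estimated by:
$$\hat{F}(x) = \begin{cases} 0 & \text{if } x < x_{(1)} \\ \hat{p}_i & \text{if } x_{(i)} \le x < x_{(i+1)} \\ 1 & \text{if } x \ge x_{(n)} \end{cases}$$
The function pemp uses the above equation when discrete=TRUE.
If the observations are assumed to come from a continuous distribution, the value of the cdf evaluated at the $i^{th}$ order statistic is usually estimated by:
$$\hat{F}[x_{(i)}] = \hat{p}_i = \frac{i - a}{n - 2a + 1}$$
where $a$ denotes the plotting position constant (the argument plot.pos.con) and $0 \le a \le 1$. These estimates are called plotting positions. The function pemp uses the above equation when prob.method="plot.pos".
For any general value of $x$, the value of the cdf is estimated by linear interpolation:
$$\hat{F}(x) = \begin{cases} \hat{p}_1 & \text{if } x < x_{(1)} \\ (1 - r)\,\hat{p}_i + r\,\hat{p}_{i+1} & \text{if } x_{(i)} \le x < x_{(i+1)} \\ \hat{p}_n & \text{if } x \ge x_{(n)} \end{cases}$$
where
$$r = \frac{x - x_{(i)}}{x_{(i+1)} - x_{(i)}}$$
The function pemp uses the above two equations when discrete=FALSE.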
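As a rough base-R sketch of the two cdf estimators: the helper emp_cdf is hypothetical and is not the package's code. It assumes the plotting positions take the common form (i - a)/(n - 2a + 1) with a = 0.375, and it resolves tied observations by keeping the largest plotting position, which is a choice made here for illustration.

```r
# Hypothetical helper: empirical cdf as a step function (discrete case) or as
# plotting positions joined by linear interpolation (continuous case).
emp_cdf <- function(q, obs, discrete = FALSE, a = 0.375) {
  obs <- sort(obs[is.finite(obs)])           # order statistics x_(1) <= ... <= x_(n)
  n <- length(obs)
  if (discrete) {
    # Proportion of observations less than or equal to each q.
    return(vapply(q, function(v) mean(obs <= v), numeric(1)))
  }
  p_hat <- (seq_len(n) - a) / (n - 2 * a + 1)  # assumed plotting-position formula
  # rule = 2 holds p_hat[1] and p_hat[n] constant outside the data range.
  stats::approx(obs, p_hat, xout = q, rule = 2, ties = max)$y
}

obs <- c(3, 1, 4, 1.5, 9, 2.6)
emp_cdf(c(0, 1, 3.5, 9, 20), obs)            # 0.10 0.10 0.66 0.90 0.90
emp_cdf(c(0, 3, 10), obs, discrete = TRUE)   # 0, 4/6, 1
```

Note that in the continuous case the estimate never reaches 0 or 1: values outside the observed range are assigned the first or last plotting position.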
Estimating Quantiles
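The function qemp computes the estimated quantiles based on the observed data. If the observations are assumed to come from a discrete distribution, the $p^{th}$ quantile is usually estimated by:
$$\hat{x}_p = \begin{cases} x_{(1)} & \text{if } p \le \hat{p}_1 \\ x_{(i)} & \text{if } \hat{p}_{i-1} < p \le \hat{p}_i \\ x_{(n)} & \text{if } p > \hat{p}_n \end{cases}$$
where $\hat{p}_i$ denotes the estimated value of the cdf at the order statistic $x_{(i)}$. The function qemp uses the above equation when discrete=TRUE.
If the observations are assumed to come from a continuous distribution, the $p^{th}$ quantile is usually estimated by linear interpolation:
$$\hat{x}_p = \begin{cases} x_{(1)} & \text{if } p \le \hat{p}_1 \\ (1 - r)\,x_{(i-1)} + r\,x_{(i)} & \text{if } \hat{p}_{i-1} < p \le \hat{p}_i \\ x_{(n)} & \text{if } p > \hat{p}_n \end{cases}$$
where
$$r = \frac{p - \hat{p}_{i-1}}{\hat{p}_i - \hat{p}_{i-1}}$$
The function qemp uses the above two equations when discrete=FALSE.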
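Quantile estimation inverts the cdf estimate, which can be sketched in base R as follows. The helper emp_quantile is hypothetical and is not the package's code. It assumes plotting positions of the form (i - a)/(n - 2a + 1) in the continuous case and empirical probabilities i/n in the discrete case.

```r
# Hypothetical helper: empirical quantiles by inverting the estimated cdf.
emp_quantile <- function(p, obs, discrete = FALSE, a = 0.375) {
  stopifnot(all(p >= 0 & p <= 1))
  obs <- sort(obs[is.finite(obs)])           # order statistics
  n <- length(obs)
  if (discrete) {
    # Smallest order statistic whose empirical probability i/n is at least p.
    p_hat <- seq_len(n) / n
    idx <- vapply(p, function(pp) which(p_hat >= pp)[1], integer(1))
    return(obs[idx])
  }
  p_hat <- (seq_len(n) - a) / (n - 2 * a + 1)  # assumed plotting-position formula
  # Interpolate between order statistics; rule = 2 returns the minimum or
  # maximum observation when p falls outside the plotting positions.
  stats::approx(p_hat, obs, xout = p, rule = 2)$y
}

obs <- c(3, 1, 4, 1.5, 9, 2.6)
emp_quantile(c(0.05, 0.5, 0.66, 0.95), obs)              # 1.0 2.8 3.5 9.0
emp_quantile(c(0.5, 0.51, 1), obs, discrete = TRUE)      # 2.6 3.0 9.0
```

In the continuous case the estimated quantiles are confined to the range of the data, so extreme probabilities return the smallest or largest observation.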
Generating Random Numbers From the Empirical Distribution
The function remp simply calls the R function sample to sample the elements of obs with replacement.
See Also
density, approx, epdfPlot, ecdfPlot, cdfCompare, qqplot, eqnpar, quantile, sample, simulateVector, simulateMvMatrix.
Examples
# Create a set of 100 observations from a gamma distribution with
# parameters shape=4 and scale=5.
# (Note: the call to set.seed simply allows you to reproduce this example.)
set.seed(3)
obs <- rgamma(100, shape=4, scale=5)
# Now plot the empirical distribution (with a histogram) and the true distribution:
dev.new()
hist(obs, col = "cyan", xlim = c(0, 65), freq = FALSE,
     ylab = "Relative Frequency")
pdfPlot('gamma', list(shape = 4, scale = 5), add = TRUE)
box()
# Now plot the empirical distribution (based on demp) with the
# true distribution:
x <- qemp(p = seq(0, 1, length.out = 100), obs = obs)
y <- demp(x, obs)
dev.new()
plot(x, y, xlim = c(0, 65), type = "n",
     xlab = "Value of Random Variable",
     ylab = "Relative Frequency")
lines(x, y, lwd = 2, col = "cyan")
pdfPlot('gamma', list(shape = 4, scale = 5), add = TRUE)
# Alternatively, you can create the above plot with the function
# epdfPlot:
dev.new()
epdfPlot(obs, xlim = c(0, 65), epdf.col = "cyan",
         xlab = "Value of Random Variable",
         main = "Empirical and Theoretical PDFs")
pdfPlot('gamma', list(shape = 4, scale = 5), add = TRUE)
# Clean Up
#---------
rm(obs, x, y)
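The resampling step described under "Generating Random Numbers From the Empirical Distribution" can also be sketched in base R. The helper emp_sample is hypothetical and only mirrors the documented behavior of remp (sampling obs with replacement, and using length(n) when n has more than one element). It is not the package's code.

```r
# Hypothetical helper: draw from the empirical distribution of obs.
emp_sample <- function(n, obs) {
  obs <- obs[is.finite(obs)]                 # drop NA, NaN, Inf, -Inf
  if (length(n) > 1) n <- length(n)          # same convention as described for remp
  # Sampling indices avoids the sample(x) surprise when obs has length 1.
  obs[sample.int(length(obs), size = n, replace = TRUE)]
}

set.seed(42)
obs <- c(2.1, 3.7, 5.0, 8.4)
draws <- emp_sample(10, obs)
table(draws)
```

Every draw is one of the observed values, so this kind of resampling never produces values outside the data.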