
Simulate a vector of random numbers from a specified theoretical probability distribution or empirical probability distribution, using either Latin Hypercube sampling or simple random sampling.
simulateVector(n, distribution = "norm", param.list = list(mean = 0, sd = 1),
    sample.method = "SRS", seed = NULL, sorted = FALSE,
    left.tail.cutoff = ifelse(is.finite(supp.min), 0, .Machine$double.eps),
    right.tail.cutoff = ifelse(is.finite(supp.max), 0, .Machine$double.eps))
n: a positive integer indicating the number of random numbers to generate.
distribution: a character string denoting the distribution abbreviation. The default value is distribution="norm". See the help file for Distribution.df for a list of possible distribution abbreviations. Alternatively, the character string "emp" may be used to denote sampling from an empirical distribution based on a set of observations. The vector containing the observations is specified in the argument param.list.
param.list: a list with values for the parameters of the distribution. The default value is param.list=list(mean=0, sd=1). See the help file for Distribution.df for the names and possible values of the parameters associated with each distribution. Alternatively, if you specify an empirical distribution by setting distribution="emp", then param.list must be a list of the form list(obs=name), where name denotes the name of the vector containing the observations to use for the empirical distribution. In this case, you may also supply arguments to the qemp function through param.list. For example, you may set param.list=list(obs=name, discrete=TRUE) to specify an empirical distribution based on a discrete random variable.
sample.method: a character string indicating whether to use simple random sampling (sample.method="SRS", the default) or Latin Hypercube sampling (sample.method="LHS").
seed: an integer to supply to the R function set.seed. The default value is seed=NULL, in which case the random seed is not set and the random numbers depend on the current value of .Random.seed.
sorted: a logical scalar indicating whether to return the random numbers in sorted (ascending) order. The default value is sorted=FALSE.
left.tail.cutoff: a scalar between 0 and 1 indicating what proportion of the left tail of the probability distribution to omit for Latin Hypercube sampling. For densities with a finite support minimum (e.g., Lognormal or Empirical) the default value is left.tail.cutoff=0; for densities with a support minimum of -Inf, the default value is left.tail.cutoff=.Machine$double.eps. This argument is ignored if sample.method="SRS".
right.tail.cutoff: a scalar between 0 and 1 indicating what proportion of the right tail of the probability distribution to omit for Latin Hypercube sampling. For densities with a finite support maximum (e.g., Beta or Empirical) the default value is right.tail.cutoff=0; for densities with a support maximum of Inf, the default value is right.tail.cutoff=.Machine$double.eps. This argument is ignored if sample.method="SRS".
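The default tail cutoffs described above can be motivated with base R quantile functions. The following is a small sketch, not taken from the EnvStats source: it assumes the cutoffs act on the probability scale, and it uses qnorm and qlnorm only as convenient examples of distributions with infinite and finite support bounds.

```r
# Illustrative only: base R quantile functions, not EnvStats internals.
eps <- .Machine$double.eps

# Normal distribution: the support minimum is -Inf, so the quantile at
# probability 0 is not finite, but trimming a proportion eps from the
# left tail gives a finite lower bound.
lower_norm_untrimmed <- qnorm(0)      # -Inf
lower_norm_trimmed   <- qnorm(eps)    # finite
is.finite(lower_norm_trimmed)

# Lognormal distribution: the support minimum is 0 (finite), so no
# left-tail trimming is needed, but the support maximum is Inf.
lower_lnorm   <- qlnorm(0)            # 0
upper_lnorm   <- qlnorm(1 - eps)      # finite after right-tail trimming
is.finite(upper_lnorm)
```

This suggests why a cutoff of 0 is a natural default for a bounded tail, while an unbounded tail needs a small positive cutoff to keep the sampled range finite.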
a numeric vector of random numbers from the specified distribution.
Simple Random Sampling (sample.method="SRS")
When sample.method="SRS", the function simulateVector simply calls the function rabb, where abb denotes the abbreviation of the specified distribution (e.g., rlnorm, remp, etc.).
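The dispatch described above can be sketched in base R. The helper srs_sketch below is hypothetical and is not the EnvStats implementation; it only illustrates building the generator name from the distribution abbreviation, and it assumes the seed and sorted arguments behave as documented for simulateVector.

```r
# Hypothetical helper, not part of EnvStats: look up the generator named
# "r" + abbreviation and call it with n and the parameter list.
srs_sketch <- function(n, distribution = "norm",
                       param.list = list(mean = 0, sd = 1),
                       seed = NULL, sorted = FALSE) {
  # Set the seed only when one is supplied; otherwise the current state
  # of the random number generator is used.
  if (!is.null(seed)) set.seed(seed)
  x <- do.call(paste0("r", distribution), c(list(n), param.list))
  if (sorted) sort(x) else x
}

# The sketch gives the same numbers as calling the generator directly.
a <- srs_sketch(5, "lnorm", list(meanlog = 0, sdlog = 1), seed = 47)
set.seed(47)
b <- rlnorm(5, meanlog = 0, sdlog = 1)
identical(a, b)   # TRUE
```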
Latin Hypercube Sampling (sample.method="LHS")
When sample.method="LHS", the function simulateVector generates n random numbers using Latin Hypercube sampling. The distribution is divided into n intervals of equal probability 1/n, and one value is drawn at random from within each interval.
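As a rough illustration of the idea of equal-probability intervals, here is a minimal base R sketch of one-dimensional Latin Hypercube sampling by the inverse-CDF method. The helper lhs_sketch is hypothetical, and the way it applies the tail cutoffs is an assumption; simulateVector may implement the details differently.

```r
# Hypothetical helper, not part of EnvStats: textbook one-dimensional
# Latin Hypercube sampling via the inverse-CDF method.
lhs_sketch <- function(n, qfun, ...,
                       left.tail.cutoff = 0, right.tail.cutoff = 0) {
  # Split the retained probability range into n strata of equal width.
  breaks <- seq(left.tail.cutoff, 1 - right.tail.cutoff, length.out = n + 1)
  # Draw one uniform probability inside each stratum.
  p <- runif(n, min = breaks[-(n + 1)], max = breaks[-1])
  # Map the probabilities to quantiles, then return them in random order.
  qfun(p, ...)[sample.int(n)]
}

set.seed(47)
x <- lhs_sketch(10, qnorm, mean = 0, sd = 1)

# Count how many values fall in each decile of the standard normal:
# by construction every decile holds exactly one value.
decile_counts <- tabulate(ceiling(pnorm(x) * 10), nbins = 10)
decile_counts
```

Sampling on the probability scale and then applying the quantile function keeps the sketch independent of any particular distribution; any q-function with a p first argument can be passed as qfun.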
Latin Hypercube sampling, sometimes abbreviated LHS, is a method of sampling from a probability distribution that ensures all portions of the probability distribution are represented in the sample. It was introduced in the published literature by McKay et al. (1979) to overcome the following problem in Monte Carlo simulation based on simple random sampling (SRS). Suppose we want to generate random numbers from a specified distribution. If we use simple random sampling, there is a low probability of getting very many observations in an area of low probability of the distribution. For example, if we generate n observations from the distribution, the probability that none of them falls above the 98th percentile is 0.98^n, which is about 13% for n = 100.
See Millard (2013) for a visual explanation of Latin Hypercube sampling.
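The tail-coverage problem with simple random sampling can be checked numerically. The sketch below uses only base R; the sample size n = 100 and the 98th percentile are illustrative values chosen here, and the simulation is a sanity check rather than part of the package.

```r
n <- 100     # sample size (illustrative)
p <- 0.98    # percentile of interest (illustrative)

# Under simple random sampling each draw independently falls below the
# p-th quantile with probability p, so all n draws do so with
# probability p^n.
p_miss_srs <- p^n
round(p_miss_srs, 3)   # 0.133

# Monte Carlo check on the probability scale: the proportion of simulated
# samples with no draw above the p-th quantile.
set.seed(1)
misses <- replicate(5000, all(runif(n) < p))
mean(misses)

# With Latin Hypercube sampling and n = 100, each stratum has probability
# 0.01 and is sampled exactly once, so the strata above 0.98 are never
# missed.
```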
Iman, R.L., and W.J. Conover. (1980). Small Sample Sensitivity Analysis Techniques for Computer Models, With an Application to Risk Assessment (with Comments). Communications in Statistics - Volume A, Theory and Methods, 9(17), 1749-1874.
Iman, R.L., and J.C. Helton. (1988). An Investigation of Uncertainty and Sensitivity Analysis Techniques for Computer Models. Risk Analysis 8(1), 71-90.
Iman, R.L., and J.C. Helton. (1991). The Repeatability of Uncertainty and Sensitivity Analyses for Complex Probabilistic Risk Assessments. Risk Analysis 11(4), 591-606.
McKay, M.D., R.J. Beckman, and W.J. Conover. (1979). A Comparison of Three Methods for Selecting Values of Input Variables in the Analysis of Output From a Computer Code. Technometrics 21(2), 239-245.
Millard, S.P. (2013). EnvStats: an R Package for Environmental Statistics. Springer, New York. http://www.springer.com/book/9781461484554.
Vose, D. (2008). Risk Analysis: A Quantitative Guide. Third Edition. John Wiley & Sons, West Sussex, UK, 752 pp.
Probability Distributions and Random Numbers, Empirical, simulateMvMatrix, set.seed.
# NOT RUN {
# Generate 10 observations from a lognormal distribution with
# parameters mean=10 and cv=1 using simple random sampling:
simulateVector(10, distribution = "lnormAlt",
    param.list = list(mean = 10, cv = 1), seed = 47,
    sorted = TRUE)
# [1] 2.086931 2.863589 3.112866 5.592502 5.732602 7.160707
# [7] 7.741327 8.251306 12.782493 37.214748
#----------
# Repeat the above example by calling rlnormAlt directly:
set.seed(47)
sort(rlnormAlt(10, mean = 10, cv = 1))
# [1] 2.086931 2.863589 3.112866 5.592502 5.732602 7.160707
# [7] 7.741327 8.251306 12.782493 37.214748
#----------
# Now generate 10 observations from the same lognormal distribution
# but use Latin Hypercube sampling. Note that the largest value
# is larger than for simple random sampling:
simulateVector(10, distribution = "lnormAlt",
    param.list = list(mean = 10, cv = 1), seed = 47,
    sample.method = "LHS", sorted = TRUE)
# [1] 2.406149 2.848428 4.311175 5.510171 6.467852 8.174608
# [7] 9.506874 12.298185 17.022151 53.552699
#==========
# Generate 50 observations from a Pareto distribution with parameters
# location=10 and shape=2, then use this resulting vector of
# observations as the basis for generating 3 observations from an
# empirical distribution using Latin Hypercube sampling:
set.seed(321)
pareto.rns <- rpareto(50, location = 10, shape = 2)
simulateVector(3, distribution = "emp",
    param.list = list(obs = pareto.rns), sample.method = "LHS")
#[1] 11.50685 13.50962 17.47335
#==========
# Clean up
#---------
rm(pareto.rns)
# }