Last chance! 50% off unlimited learning
Sale ends in
Functionalities for fitting multivariate normal variance mixtures (in particular also Multivariate t distributions) via an ECME algorithm.
fitnvmix(x, qmix, mix.param.bounds, nu.init = NA, loc = NULL, scale = NULL,
init.size.subsample = min(n, 100), size.subsample = n,
control = list(), verbose = TRUE)fitStudent(x, loc = NULL, scale = NULL, mix.param.bounds = c(1e-3, 1e2), ...)
fitNorm(x)
The function fitnvmix()
returns an S3 object of
"fitnvmix"
, basically a list
which contains, among others, the components
nu
Estimated mixing parameter (vector) nu
.
loc
Estimated or provided loc
vector.
scale
Estimated or provided scale
matrix.
max.ll
Estimated log-likelihood at reported estimates.
x
Input data matrix x
.
The methods print()
, summary()
and plot()
are defined
for the class "fitnvmix"
.
fitStudent()
is a wrapper to fitnvmix()
for parameter
estimation of multivariate Student t distributions; it also
returns an S3 object of class
"fitnvmix"
where
the fitted degrees of freedom are called "df"
instead of
"nu"
(to be consistent
with the other wrappers for the Student t distributions).
fitNorm()
just returns a list
with components
loc
(columnwise sample means) and scale
(sample
covariance matrix).
matrix
.
specification of the mixing variable
character
:character
string
specifying a supported distribution; currently available are
"constant"
(in which case loc
and covariance matrix scale
results),
"inverse.gamma"
(in which case df
/2 and thus the multivariate
Student t distribution with df
degrees of freedom
results) and "pareto"
(in which case alpha
).
function
:function
interpreted as the quantile function of the mixing
variable qmix
must have the
form qmix = function(u, nu)
, where the argument nu
corresponds to the parameter (vector) specifying the distribution
of the mixing variable.
either numeric(2)
or a
matrix
with two columns. The first/second column
corresponds to the lower/upper bound of
either NA
or an initial guess for the parameter
(vector) nu.init
is provided, the
algorithm often converges faster; the better the starting value,
the faster convergence.
vector
; if provided, taken as the 'true'
location vector in which case loc
is not estimated.
positive definite matrix
; if provided,
taken as the 'true' scale matrix in which case scale
is not estimated.
numeric
, non-negative,
giving the sub-samplesize used to get an initial estimate for
is.na(nu.init)
, otherwise ignored.
numeric
, non-negative, specifying
the size of the subsample that is being used in the ECME iterations
to optimize the log-likelihood over n
,
so that the full sample is being used. Decreasing this number can
lead to faster run-times (as fewer log-densities need to be
estimated) but also to an increase in bias and variance.
list
specifying algorithm specific
parameters; see below under 'Details' and get_set_param()
.
numeric
or logical
(in
which case it is converted to numeric
) specifying the
amount of tracing to be done. If 0
or FALSE
, neither tracing
nor warnigns are communicated; if 1
, only warnigns are communicated,
if 2
or 3
, warnings and (shorter or longer) tracing information is
provided.
additional arguments passed to the underlying
fitnvmix()
.
Erik Hintz, Marius Hofert and Christiane Lemieux
The function fitnvmix()
uses an ECME algorithm to approximate the
MLEs of the parameters nu
, loc
and scale
of a
normal variance mixture specified by qmix
. The underlying
procedure successively estimates nu
(with given loc
and
scale
) by maximizing the likelihood which is approximated by
dnvmix()
(unless qmix
is a character
string, in which case analytical formulas for the log-densities are
used) and scale
and loc
(given nu
) using weights
(which again need to be approximated) related to the posterior
distribution, details can be found in the first reference below.
It should be highlighted that (unless unless qmix
is a
character
string), every log-likelihood and every weight needed
in the estimation is numerically approximated via RQMC methods. For
large dimensions and sample sizes this procedure can therefore be
slow.
Various tolerances and convergence criteria can be changed by the user
via the control
argument. For more details, see
get_set_param()
.
Hintz, E., Hofert, M. and Lemieux, C. (2021), Normal variance mixtures: Distribution, density and parameter estimation. Computational Statistics and Data Analysis 157C, 107175.
Hintz, E., Hofert, M. and Lemieux, C. (2022), Multivariate Normal Variance Mixtures in R: The R Package nvmix. Journal of Statistical Software, tools:::Rd_expr_doi("10.18637/jss.v102.i02").
McNeil, A. J., Frey, R. and Embrechts, P. (2015). Quantitative Risk Management: Concepts, Techniques, Tools. Princeton University Press.
Liu, C. and Rubin, D. (1994). The ECME algorithm: a simple extension of EM and ECM with faster monotone convergence. Biometrika 81(4), 633--648.
dnvmix()
, rnvmix()
, pnvmix()
,
qqplot_maha()
, get_set_param()
.
## Sampling parameters
set.seed(274) # for reproducibility
nu <- 2.8 # parameter used to sample data
d <- 4 # dimension
n <- 75 # small sample size to have examples run fast
loc <- rep(0, d) # location vector
A <- matrix(runif(d * d), ncol = d)
diag_vars <- diag(runif(d, min = 2, max = 5))
scale <- diag_vars %*% cov2cor(A %*% t(A)) %*% diag_vars # scale matrix
mix.param.bounds <- c(1, 5) # nu in [1, 5]
### Example 1: Fitting a multivariate t distribution ###########################
if(FALSE){
## Define 'qmix' as the quantile function of an IG(nu/2, nu/2) distribution
qmix <- function(u, nu) 1 / qgamma(1 - u, shape = nu/2, rate = nu/2)
## Sample data using 'rnvmix'
x <- rnvmix(n, qmix = qmix, nu = nu, loc = loc, scale = scale)
## Call 'fitvnmix' with 'qmix' as a function (so all densities/weights are estimated)
(MyFit11 <- fitnvmix(x, qmix = qmix, mix.param.bounds = mix.param.bounds))
## Call 'fitnvmix' with 'qmix = "inverse.gamma"' in which case analytical formulas
## for weights and densities are used:
(MyFit12 <- fitnvmix(x, qmix = "inverse.gamma",
mix.param.bounds = mix.param.bounds))
## Alternatively, use the wrapper 'fitStudent()'
(MyFit13 <- fitStudent(x))
## Check
stopifnot(all.equal(MyFit11$nu, MyFit12$nu, tol = 5e-2),
all.equal(MyFit11$nu, MyFit13$nu, tol = 5e-2))
## Can also provide 'loc' and 'scale' in which case only 'nu' is estimated
(MyFit13 <- fitnvmix(x, qmix = "inverse.gamma", mix.param.bounds = mix.param.bounds,
loc = loc, scale = scale))
(MyFit14 <- fitStudent(x, loc = loc, scale = scale))
stopifnot(all.equal(MyFit13$nu, MyFit14$df, tol = 1e-6))
}
### Example 2: Fitting a Pareto mixture ########################################
## Define 'qmix' as the quantile function of a Par(nu, 1) distribution
qmix <- function(u, nu) (1-u)^(-1/nu)
## Sample data using 'rnvmix':
x <- rnvmix(n, qmix = qmix, nu = nu, loc = loc, scale = scale)
## Call 'fitvnmix' with 'qmix' as function (=> densities/weights estimated)
(MyFit21 <- fitnvmix(x, qmix = qmix, mix.param.bounds = mix.param.bounds))
## Call 'fitnvmix' with 'qmix = "pareto"' in which case an analytical formula
## for the density is used
(MyFit22 <- fitnvmix(x, qmix = "pareto", mix.param.bounds = mix.param.bounds))
stopifnot(all.equal(MyFit21$nu, MyFit22$nu, tol = 5e-2))
Run the code above in your browser using DataLab