Maximum Approximate Bernstein Likelihood Estimate of Multivariate Density Function
mable.mvar(
x,
M0 = 1,
M,
search = TRUE,
interval = NULL,
mar.deg = TRUE,
high.dim = FALSE,
criterion = c("cdf", "pdf"),
controls = mable.ctrl(),
progress = TRUE
)
A list with components
m a vector of the selected optimal degrees by the method of change-point
p a vector of the mixture proportions \(p(j_1, \ldots, j_d)\), arranged in the
column-major order of \(j = (j_1, \ldots, j_d)\), \(0 \le j_i \le m_i, i = 1, \ldots, d\).
mloglik the maximum log-likelihood at an optimal degree m
pval the p-values of change-points for choosing the optimal degrees for the
marginal densities
M the vector (m1, m2, ... , md), where mi is the largest candidate
degree when the search stopped for the i-th marginal density
interval support hyperrectangle \([a, b]=[a_1, b_1] \times \cdots \times [a_d, b_d]\)
convergence an integer code: 0 indicates successful completion (the EM iteration
converged); 1 indicates that the iteration limit maxit was reached in the EM iteration.
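The column-major arrangement of p described above can be illustrated with a short base R sketch. The degrees and proportions below are made up for illustration and are not output of mable.mvar; the only point is how a vector in column-major order maps to the indices \((j_1, j_2)\).

```r
# Hypothetical degrees m = (m1, m2) and a made-up proportion vector p
m <- c(2, 1)
p <- seq_len(prod(m + 1))
p <- p / sum(p)                      # length (m1 + 1) * (m2 + 1) = 6

# Column-major order: j1 varies fastest, so reshaping needs no reordering
P <- array(p, dim = m + 1)

# p(j1, j2) sits at P[j1 + 1, j2 + 1], i.e. at position
# 1 + j1 + (m1 + 1) * j2 of the vector p
j <- c(1, 1)
k <- 1 + j[1] + (m[1] + 1) * j[2]
stopifnot(isTRUE(all.equal(P[j[1] + 1, j[2] + 1], p[k])))
```

The same reshaping works for any d, since array() fills its first index fastest.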
x an n x d matrix or data.frame containing a multivariate sample of size n
M0 a positive integer or a vector of d positive integers specifying
the starting candidate degrees for searching optimal degrees.
M a positive integer or a vector of d positive integers specifying
the maximum candidate degrees or the given model degrees for the joint density.
search logical, whether to search for optimal degrees between M0 and M (TRUE)
or to use M as the given model degrees for the joint density (FALSE).
interval a vector of two endpoints or a 2 x d matrix, each column containing
the endpoints of the support/truncation interval for each marginal density.
If missing, the i-th column is assigned as c(min(x[,i]), max(x[,i])).
mar.deg logical, if TRUE, the optimal degrees are selected based on marginal data; otherwise, the optimal degrees are those that minimize the maximum L2 distance between the marginal cdf or pdf estimated from the marginal data and that estimated from the joint data. See details.
high.dim logical, whether the data are high dimensional or the sample is large. If TRUE, run a slower version of the procedure which requires less memory.
criterion either "cdf" or "pdf", specifying which should be used for selecting optimal degrees. Default is "cdf".
controls object of class mable.ctrl() specifying the iteration limit
and the convergence criterion eps. Default is mable.ctrl(). See Details.
progress if TRUE, a text progress bar is displayed
Zhong Guan <zguan@iusb.edu>
A \(d\)-variate density \(f\) on a hyperrectangle \([a, b]
=[a_1, b_1] \times \cdots \times [a_d, b_d]\) can be approximated
by a mixture of \(d\)-variate beta densities on \([a, b]\),
\(\beta_{mj}(x) = \prod_{i=1}^d\beta_{m_i,j_i}[(x_i-a_i)/(b_i-a_i)]/(b_i-a_i)\),
with proportion \(p(j_1, \ldots, j_d)\), \(0 \le j_i \le m_i, i = 1, \ldots, d\).
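As a rough illustration of this mixture for \(d = 2\), the following base R sketch evaluates it at a single point. It assumes that \(\beta_{m,j}\) is the Beta(\(j+1\), \(m-j+1\)) density, the usual Bernstein basis; the function name bern_mix_density and the proportions are hypothetical, and this is not the package's internal code.

```r
# Evaluate sum_j p(j1, j2) * beta_{mj}(x) at one point x = (x1, x2).
# P: (m1 + 1) x (m2 + 1) array of proportions
# interval: 2 x 2 matrix whose i-th column is c(a_i, b_i)
bern_mix_density <- function(x, P, interval) {
  m <- dim(P) - 1
  width <- interval[2, ] - interval[1, ]
  t <- (x - interval[1, ]) / width            # map x to the unit square
  # Assumed basis: beta_{m,j} is the Beta(j + 1, m - j + 1) density
  b1 <- dbeta(t[1], 0:m[1] + 1, m[1] - 0:m[1] + 1)
  b2 <- dbeta(t[2], 0:m[2] + 1, m[2] - 0:m[2] + 1)
  drop(b1 %*% P %*% b2) / prod(width)
}

# Equal proportions reproduce the uniform density on [0, 7] x [40, 110]
interval <- rbind(c(0, 40), c(7, 110))
P <- array(1 / 12, dim = c(4, 3))
dens <- bern_mix_density(c(3.5, 70), P, interval)
stopifnot(isTRUE(all.equal(dens, 1 / (7 * 70))))
```

With equal proportions the basis densities of each margin sum to \(m_i + 1\), so the mixture collapses to the constant \(1/\prod_i (b_i - a_i)\), which makes a convenient sanity check.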
Let \(\tilde F_i\) (\(\tilde f_i\)) be an estimate with degree \(\tilde m_i\) of
the i-th marginal cdf (pdf) based on marginal data x[,i], \(i=1, \ldots, d\).
If search=TRUE and mar.deg=TRUE, then the optimal degrees
are \((\tilde m_1,\ldots,\tilde m_d)\). If search=TRUE and
mar.deg=FALSE, then the optimal degrees \((\hat m_1,\ldots,\hat m_d)\)
are those that minimize the maximum of \(L_2\)-distance between
\(\tilde F_i\) (\(\tilde f_i\)) and the estimate of \(F_i\) (\(f_i\))
based on the joint data with degrees \(m=(m_1,\ldots,m_d)\) for all \(m\)
between \(M_0\) and \(M\) if criterion="cdf" (criterion="pdf").
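The criterion for criterion="cdf" can be sketched in base R as follows. This is only an illustration under stated assumptions: \(\beta_{m,j}\) is taken to be the Beta(\(j+1\), \(m-j+1\)) distribution, distances are computed on the unit interval after rescaling, and all function names are hypothetical rather than part of the package. The marginal of the joint mixture is itself a beta mixture whose proportions are the joint proportions summed over the other indices.

```r
# Marginal proportions of margin i: sum the joint array P over the other indices
marginal_props <- function(P, i) apply(P, i, sum)

# Mixture cdf on [0, 1] with proportions q of degree length(q) - 1,
# assuming beta_{m,j} is the Beta(j + 1, m - j + 1) distribution
bern_cdf <- function(t, q) {
  m <- length(q) - 1
  sapply(t, function(s) sum(q * pbeta(s, 0:m + 1, m - 0:m + 1)))
}

# L2 distance between two such cdfs on [0, 1], by numerical integration
l2_cdf_dist <- function(q1, q2) {
  sq_diff <- function(t) (bern_cdf(t, q1) - bern_cdf(t, q2))^2
  sqrt(integrate(sq_diff, 0, 1)$value)
}

# Criterion value for one candidate m: the maximum over the margins of the
# distance between the joint-based and the marginal-based cdf estimates
max_l2 <- function(P, q_marginal) {
  max(sapply(seq_along(q_marginal), function(i)
    l2_cdf_dist(marginal_props(P, i), q_marginal[[i]])))
}

# Made-up example: a product-form P agrees exactly with its own margins
q <- list(c(0.2, 0.3, 0.5), c(0.4, 0.6))
P <- outer(q[[1]], q[[2]])
crit <- max_l2(P, q)
```

A search over candidate degrees would evaluate a quantity like max_l2 for each \(m\) between \(M_0\) and \(M\) and keep the minimizer; fitting the joint model for each candidate is the expensive step.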
For large data sets and multimodal densities, the search for the model degrees is
very time-consuming. In this case, it is suggested that the degrees be selected
based on marginal data using mable or optimable.
Wang, T. and Guan, Z. (2019). Bernstein Polynomial Model for Nonparametric Multivariate Density. Statistics, Vol. 53, No. 2, 321-338.
mable, optimable
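## Old Faithful Data
library(mable)
# lower (a) and upper (b) endpoints of the support of (eruptions, waiting)
a <- c(0, 40); b <- c(7, 110)
# fit with the given degrees M = c(46, 19), without searching
ans <- mable.mvar(faithful, M = c(46, 19), search = FALSE,
                  interval = rbind(a, b), progress = FALSE)
plot(ans, which = "density")
plot(ans, which = "cumulative")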