Given a hyper2 object and a point in probability space,
function gradient() returns the gradient of the log-likelihood;
function hessian() returns the bordered Hessian matrix. By
default, both functions are evaluated at the maximum likelihood estimate
for \(p\), as given by maxp().
gradient(H, probs=indep(maxp(H)))
hessian(H, probs=indep(maxp(H)), border=TRUE)
hessian_lowlevel(L, powers, probs, pnames, n)
is_ok_hessian(M, give=TRUE)

H: A hyper2 object

L: Components of a hyper2 object

probs: A vector of probabilities

pnames: Character vector of names

border: Boolean, with default TRUE meaning to return the bordered
Hessian and FALSE meaning to return the Hessian (warning: this option
does not respect the unit sum constraint)

M: A bordered Hessian matrix, understood to have a single constraint
(the unit sum) at the last row and column; the output of
hessian(border=TRUE)

give: Boolean, with default FALSE meaning for function
is_ok_hessian() to return whether or not M corresponds to a
negative-definite matrix, and TRUE meaning to return more details
Function gradient() returns a vector of length \(n-1\) with
entries being the gradient of the log-likelihood with respect to the
\(n-1\) independent components of \((p_1,\ldots,p_n)\), namely
\((p_1,\ldots,p_{n-1})\). The fillup value \(p_n\) is calculated as
\(1-(p_1+\cdots+p_{n-1})\).
If argument border is TRUE, function hessian()
returns an \(n\)-by-\(n\) matrix of second derivatives, with borders
as returned by gradient(). If border is FALSE, the fillup
value is ignored and an \((n-1)\)-by-\((n-1)\) matrix is returned.
Note that calling hessian() at the evaluate (the maximum
likelihood estimate) will not return exact zeros for the constraint on
the fillup value: gradient() is used internally, and this does
not return exactly zero at the evaluate.
Function gradient() returns the gradient of the log-likelihood
function. If the hyper2 object is of size \(n\), then argument
probs may be a vector of length \(n-1\) or \(n\); in the
former case it is interpreted as indep(p). In both cases, the
returned gradient is a vector of length \(n-1\).
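The two calling conventions can be sketched as follows (a minimal
illustration, assuming the hyper2 package is loaded; chess is a
three-player dataset supplied with the package):

```r
library(hyper2)
data(chess)                        # three-player hyper2 object, n = 3
p <- c(1/2, 1/3, 1/6)              # full unit-sum vector of length n
gradient(chess, probs = p)           # length-n form
gradient(chess, probs = indep(p))    # equivalent length-(n-1) form
# both calls return the same gradient vector of length n-1 = 2
```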
The function returns the derivative of the log-likelihood with respect
to the \(n-1\) independent components of \((p_1,\ldots,p_n)\), namely
\((p_1,\ldots,p_{n-1})\). The fillup value \(p_n\) is calculated as
\(1-(p_1+\cdots+p_{n-1})\).
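The fillup relationship can be seen with the package's indep() and
fillup() helpers (a short sketch, assuming the hyper2 package):

```r
library(hyper2)
p <- c(0.5, 0.3, 0.2)
indep(p)              # drops the fillup value, leaving c(0.5, 0.3)
fillup(c(0.5, 0.3))   # restores it: p_n = 1 - (0.5 + 0.3) = 0.2
```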
Function gradientn() returns the gradient of the log-likelihood
function but ignores the unit sum constraint. If the hyper2
object is of size \(n\), then argument probs must be a vector
of length \(n\), and the function returns a named vector of length
\(n\). The last element of the vector is not treated differently from
the others; all \(n\) elements are treated as independent. The sum
need not equal one.
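A side-by-side comparison of the two functions (an illustrative
sketch, assuming the hyper2 package and its chess dataset):

```r
library(hyper2)
data(chess)                # three-player hyper2 object, n = 3
p <- c(1/2, 1/3, 1/6)
gradient(chess, p)    # length n-1 = 2: respects the unit sum constraint
gradientn(chess, p)   # named vector of length n = 3: constraint ignored
```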
Function hessian() returns the bordered Hessian, a matrix of
size \((n+1)\)-by-\((n+1)\), which is useful when using Lagrange's
method of undetermined multipliers. The first row and column
correspond to the unit sum constraint, \(p_1+\cdots+p_n=1\).
Row and column names of the matrix are the pnames() of the
hyper2 object, plus “usc” for “Unit Sum
Constraint”.
The unit sum constraint borders could have been added with idiom
magic::adiag(0,pad=1,hess), which might be preferable.
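The bordering idiom itself can be demonstrated on a stand-in matrix
(illustrative only; the 2-by-2 hess below is not from any hyper2
object, and the magic package must be installed):

```r
library(magic)
hess <- matrix(c(-2, 1, 1, -3), 2, 2)   # stand-in unbordered Hessian
adiag(0, pad = 1, hess)
# 3x3 result: 0 in the top-left corner, 1s along the first row and
# first column (the constraint border), hess in the lower-right block
```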
Function is_ok_hessian() returns the result of the second
derivative test for the maximum likelihood estimate being a local
maximum on the constraint hypersurface. This is a generalization of the
usual unconstrained problem, for which the test is that the Hessian is
negative-definite.
Function hessian_lowlevel() is a low-level helper function that
calls the C++ routine.
Further examples and discussion are given in file
inst/gradient.Rmd.
data(chess)
p <- c(1/2, 1/3)
delta <- rnorm(2)/1e5   # delta needs to be quite small
deltaL  <- loglik(p+delta, chess) - loglik(p, chess)   # exact change in support
deltaLn <- sum(delta*gradient(chess, p + delta/2))     # numerical approximation
deltaL - deltaLn   # should be small [zero to first order]

data(icons)
H <- hessian(icons)
is_ok_hessian(H)