Learn R Programming

gkwreg (version 1.0.7)

grbkw: Gradient of the Negative Log-Likelihood for the BKw Distribution

Description

Computes the gradient vector (vector of first partial derivatives) of the negative log-likelihood function for the Beta-Kumaraswamy (BKw) distribution with parameters alpha (\(\alpha\)), beta (\(\beta\)), gamma (\(\gamma\)), and delta (\(\delta\)). This distribution is the special case of the Generalized Kumaraswamy (GKw) distribution where \(\lambda = 1\). The gradient is typically used in optimization algorithms for maximum likelihood estimation.

Computes the gradient vector (vector of first partial derivatives) of the negative log-likelihood function for the Beta-Kumaraswamy (BKw) distribution with parameters alpha (\(\alpha\)), beta (\(\beta\)), gamma (\(\gamma\)), and delta (\(\delta\)). This distribution is the special case of the Generalized Kumaraswamy (GKw) distribution where \(\lambda = 1\). The gradient is typically used in optimization algorithms for maximum likelihood estimation.

Usage

grbkw(par, data)

Value

Returns a numeric vector of length 4 containing the partial derivatives of the negative log-likelihood function \(-\ell(\theta | \mathbf{x})\) with respect to each parameter: \((-\partial \ell/\partial \alpha, -\partial \ell/\partial \beta, -\partial \ell/\partial \gamma, -\partial \ell/\partial \delta)\). Returns a vector of NaN if any parameter values are invalid according to their constraints, or if any value in data is not in the interval (0, 1).

Returns a numeric vector of length 4 containing the partial derivatives of the negative log-likelihood function \(-\ell(\theta | \mathbf{x})\) with respect to each parameter: \((-\partial \ell/\partial \alpha, -\partial \ell/\partial \beta, -\partial \ell/\partial \gamma, -\partial \ell/\partial \delta)\). Returns a vector of NaN if any parameter values are invalid according to their constraints, or if any value in data is not in the interval (0, 1).

Arguments

par

A numeric vector of length 4 containing the distribution parameters in the order: alpha (\(\alpha > 0\)), beta (\(\beta > 0\)), gamma (\(\gamma > 0\)), delta (\(\delta \ge 0\)).

data

A numeric vector of observations. All values must be strictly between 0 and 1 (exclusive).

Author

Lopes, J. E.

Details

The components of the gradient vector of the negative log-likelihood (\(-\nabla \ell(\theta | \mathbf{x})\)) for the BKw (\(\lambda=1\)) model are:

$$ -\frac{\partial \ell}{\partial \alpha} = -\frac{n}{\alpha} - \sum_{i=1}^{n}\ln(x_i) + \sum_{i=1}^{n}\left[x_i^{\alpha} \ln(x_i) \left(\frac{\beta(\delta+1)-1}{v_i} - \frac{(\gamma-1) \beta v_i^{\beta-1}}{w_i}\right)\right] $$ $$ -\frac{\partial \ell}{\partial \beta} = -\frac{n}{\beta} - (\delta+1)\sum_{i=1}^{n}\ln(v_i) + \sum_{i=1}^{n}\left[\frac{(\gamma-1) v_i^{\beta} \ln(v_i)}{w_i}\right] $$ $$ -\frac{\partial \ell}{\partial \gamma} = n[\psi(\gamma) - \psi(\gamma+\delta+1)] - \sum_{i=1}^{n}\ln(w_i) $$ $$ -\frac{\partial \ell}{\partial \delta} = n[\psi(\delta+1) - \psi(\gamma+\delta+1)] - \beta\sum_{i=1}^{n}\ln(v_i) $$

where:

  • \(v_i = 1 - x_i^{\alpha}\)

  • \(w_i = 1 - v_i^{\beta} = 1 - (1-x_i^{\alpha})^{\beta}\)

  • \(\psi(\cdot)\) is the digamma function (digamma).

These formulas represent the derivatives of \(-\ell(\theta)\), consistent with minimizing the negative log-likelihood. They correspond to the general GKw gradient (grgkw) components for \(\alpha, \beta, \gamma, \delta\) evaluated at \(\lambda=1\). Note that the component for \(\lambda\) is omitted. Numerical stability is maintained through careful implementation.

The components of the gradient vector of the negative log-likelihood (\(-\nabla \ell(\theta | \mathbf{x})\)) for the BKw (\(\lambda=1\)) model are:

$$ -\frac{\partial \ell}{\partial \alpha} = -\frac{n}{\alpha} - \sum_{i=1}^{n}\ln(x_i) + \sum_{i=1}^{n}\left[x_i^{\alpha} \ln(x_i) \left(\frac{\beta(\delta+1)-1}{v_i} - \frac{(\gamma-1) \beta v_i^{\beta-1}}{w_i}\right)\right] $$ $$ -\frac{\partial \ell}{\partial \beta} = -\frac{n}{\beta} - (\delta+1)\sum_{i=1}^{n}\ln(v_i) + \sum_{i=1}^{n}\left[\frac{(\gamma-1) v_i^{\beta} \ln(v_i)}{w_i}\right] $$ $$ -\frac{\partial \ell}{\partial \gamma} = n[\psi(\gamma) - \psi(\gamma+\delta+1)] - \sum_{i=1}^{n}\ln(w_i) $$ $$ -\frac{\partial \ell}{\partial \delta} = n[\psi(\delta+1) - \psi(\gamma+\delta+1)] - \beta\sum_{i=1}^{n}\ln(v_i) $$

where:

  • \(v_i = 1 - x_i^{\alpha}\)

  • \(w_i = 1 - v_i^{\beta} = 1 - (1-x_i^{\alpha})^{\beta}\)

  • \(\psi(\cdot)\) is the digamma function (digamma).

These formulas represent the derivatives of \(-\ell(\theta)\), consistent with minimizing the negative log-likelihood. They correspond to the general GKw gradient (grgkw) components for \(\alpha, \beta, \gamma, \delta\) evaluated at \(\lambda=1\). Note that the component for \(\lambda\) is omitted. Numerical stability is maintained through careful implementation.

References

Cordeiro, G. M., & de Castro, M. (2011). A new family of generalized distributions. Journal of Statistical Computation and Simulation,

Kumaraswamy, P. (1980). A generalized probability density function for double-bounded random processes. Journal of Hydrology, 46(1-2), 79-88.

(Note: Specific gradient formulas might be derived or sourced from additional references).

Cordeiro, G. M., & de Castro, M. (2011). A new family of generalized distributions. Journal of Statistical Computation and Simulation,

Kumaraswamy, P. (1980). A generalized probability density function for double-bounded random processes. Journal of Hydrology, 46(1-2), 79-88.

(Note: Specific gradient formulas might be derived or sourced from additional references).

See Also

grgkw (parent distribution gradient), llbkw (negative log-likelihood for BKw), hsbkw (Hessian for BKw, if available), dbkw (density for BKw), optim, grad (for numerical gradient comparison), digamma.

grgkw (parent distribution gradient), llbkw (negative log-likelihood for BKw), hsbkw (Hessian for BKw, if available), dbkw (density for BKw), optim, grad (for numerical gradient comparison), digamma.

Examples

Run this code
# \donttest{
# Assuming existence of rbkw, llbkw, grbkw, hsbkw functions for BKw

# Generate sample data
set.seed(123)
true_par_bkw <- c(alpha = 2, beta = 3, gamma = 1, delta = 0.5)
if (exists("rbkw")) {
  sample_data_bkw <- rbkw(100, alpha = true_par_bkw[1], beta = true_par_bkw[2],
                         gamma = true_par_bkw[3], delta = true_par_bkw[4])
} else {
  sample_data_bkw <- rgkw(100, alpha = true_par_bkw[1], beta = true_par_bkw[2],
                         gamma = true_par_bkw[3], delta = true_par_bkw[4], lambda = 1)
}
hist(sample_data_bkw, breaks = 20, main = "BKw(2, 3, 1, 0.5) Sample")

# --- Find MLE estimates ---
start_par_bkw <- c(1.5, 2.5, 0.8, 0.3)
mle_result_bkw <- stats::optim(par = start_par_bkw,
                               fn = llbkw,
                               gr = grbkw, # Use analytical gradient for BKw
                               method = "BFGS",
                               hessian = TRUE,
                               data = sample_data_bkw)

# --- Compare analytical gradient to numerical gradient ---
if (mle_result_bkw$convergence == 0 &&
    requireNamespace("numDeriv", quietly = TRUE)) {

  mle_par_bkw <- mle_result_bkw$par
  cat("\nComparing Gradients for BKw at MLE estimates:\n")

  # Numerical gradient of llbkw
  num_grad_bkw <- numDeriv::grad(func = llbkw, x = mle_par_bkw, data = sample_data_bkw)

  # Analytical gradient from grbkw
  ana_grad_bkw <- grbkw(par = mle_par_bkw, data = sample_data_bkw)

  cat("Numerical Gradient (BKw):\n")
  print(num_grad_bkw)
  cat("Analytical Gradient (BKw):\n")
  print(ana_grad_bkw)

  # Check differences
  cat("Max absolute difference between BKw gradients:\n")
  print(max(abs(num_grad_bkw - ana_grad_bkw)))

} else {
  cat("\nSkipping BKw gradient comparison.\n")
}

# --- Optional: Compare with relevant components of GKw gradient ---
# Requires grgkw function
if (mle_result_bkw$convergence == 0 && exists("grgkw")) {
  # Create 5-param vector for grgkw (insert lambda=1)
  mle_par_gkw_equiv <- c(mle_par_bkw[1:4], lambda = 1.0)
  ana_grad_gkw <- grgkw(par = mle_par_gkw_equiv, data = sample_data_bkw)
  # Extract components corresponding to alpha, beta, gamma, delta
  ana_grad_gkw_subset <- ana_grad_gkw[c(1, 2, 3, 4)]

  cat("\nComparison with relevant components of GKw gradient:\n")
  cat("Max absolute difference:\n")
  print(max(abs(ana_grad_bkw - ana_grad_gkw_subset))) # Should be very small
}

# }

# \donttest{
# Assuming existence of rbkw, llbkw, grbkw, hsbkw functions for BKw

# Generate sample data
set.seed(123)
true_par_bkw <- c(alpha = 2, beta = 3, gamma = 1, delta = 0.5)
if (exists("rbkw")) {
  sample_data_bkw <- rbkw(100, alpha = true_par_bkw[1], beta = true_par_bkw[2],
                         gamma = true_par_bkw[3], delta = true_par_bkw[4])
} else {
  sample_data_bkw <- rgkw(100, alpha = true_par_bkw[1], beta = true_par_bkw[2],
                         gamma = true_par_bkw[3], delta = true_par_bkw[4], lambda = 1)
}
hist(sample_data_bkw, breaks = 20, main = "BKw(2, 3, 1, 0.5) Sample")

# --- Find MLE estimates ---
start_par_bkw <- c(1.5, 2.5, 0.8, 0.3)
mle_result_bkw <- stats::optim(par = start_par_bkw,
                               fn = llbkw,
                               gr = grbkw, # Use analytical gradient for BKw
                               method = "BFGS",
                               hessian = TRUE,
                               data = sample_data_bkw)

# --- Compare analytical gradient to numerical gradient ---
if (mle_result_bkw$convergence == 0 &&
    requireNamespace("numDeriv", quietly = TRUE)) {

  mle_par_bkw <- mle_result_bkw$par
  cat("\nComparing Gradients for BKw at MLE estimates:\n")

  # Numerical gradient of llbkw
  num_grad_bkw <- numDeriv::grad(func = llbkw, x = mle_par_bkw, data = sample_data_bkw)

  # Analytical gradient from grbkw
  ana_grad_bkw <- grbkw(par = mle_par_bkw, data = sample_data_bkw)

  cat("Numerical Gradient (BKw):\n")
  print(num_grad_bkw)
  cat("Analytical Gradient (BKw):\n")
  print(ana_grad_bkw)

  # Check differences
  cat("Max absolute difference between BKw gradients:\n")
  print(max(abs(num_grad_bkw - ana_grad_bkw)))

} else {
  cat("\nSkipping BKw gradient comparison.\n")
}

# --- Optional: Compare with relevant components of GKw gradient ---
# Requires grgkw function
if (mle_result_bkw$convergence == 0 && exists("grgkw")) {
  # Create 5-param vector for grgkw (insert lambda=1)
  mle_par_gkw_equiv <- c(mle_par_bkw[1:4], lambda = 1.0)
  ana_grad_gkw <- grgkw(par = mle_par_gkw_equiv, data = sample_data_bkw)
  # Extract components corresponding to alpha, beta, gamma, delta
  ana_grad_gkw_subset <- ana_grad_gkw[c(1, 2, 3, 4)]

  cat("\nComparison with relevant components of GKw gradient:\n")
  cat("Max absolute difference:\n")
  print(max(abs(ana_grad_bkw - ana_grad_gkw_subset))) # Should be very small
}

# }

Run the code above in your browser using DataLab