fit_DKP: Fit a Dirichlet Kernel Process (DKP) Model

Description

Fits a DKP model for categorical or multinomial response data by locally smoothing observed counts to estimate latent Dirichlet parameters. The model constructs flexible latent probability surfaces by updating Dirichlet priors using kernel-weighted observations.

Usage

fit_DKP(
  X,
  Y,
  Xbounds = NULL,
  prior = c("noninformative", "fixed", "adaptive"),
  r0 = 2,
  p0 = colMeans(Y/rowSums(Y)),
  kernel = c("gaussian", "matern52", "matern32"),
  loss = c("brier", "log_loss"),
  n_multi_start = NULL,
  theta = NULL
)

Value

A list of class "DKP" representing the fitted DKP model, with the following components:

theta_opt: Optimized kernel hyperparameters (lengthscales).
kernel: Kernel function used, as a string.
loss: Loss function used for hyperparameter tuning.
loss_min: Minimum loss value achieved during kernel hyperparameter optimization. Set to NA if theta is user-specified.
X: Original (unnormalized) input matrix of size n × d.
Xnorm: Normalized input matrix scaled to \([0,1]^d\).
Xbounds: Matrix specifying normalization bounds for each input dimension.
Y: Observed multinomial counts of size n × q.
prior: Type of prior used.
r0: Prior precision parameter.
p0: Prior mean (for fixed priors).
alpha0: Prior Dirichlet parameters at each input location (scalar or matrix).
alpha_n: Posterior Dirichlet parameters after kernel smoothing.

Arguments

X: A numeric input matrix of size \(n \times d\), where each row corresponds to a covariate vector.
Y: Numeric matrix of observed multinomial counts, with dimension \(n \times q\).
Xbounds: Optional \(d \times 2\) matrix specifying the lower and upper bounds of each input dimension. Used to normalize inputs to \([0,1]^d\). If NULL, inputs are assumed to be pre-normalized, and default bounds \([0,1]^d\) are applied.
prior: Type of prior: "noninformative" (default), "fixed", or "adaptive".
r0: Global prior precision (used when prior = "fixed" or "adaptive").
p0: Global prior mean vector (used only when prior = "fixed"). Defaults to the empirical class proportions colMeans(Y / rowSums(Y)). Must have length equal to the number of categories \(q\).
kernel: Kernel function for local weighting: "gaussian" (default), "matern52", or "matern32".
loss: Loss function for kernel hyperparameter tuning: "brier" (default) or "log_loss".
n_multi_start: Number of random initializations for multi-start optimization. Default is \(10 \times d\).
theta: Optional. A positive scalar or numeric vector of length d specifying kernel lengthscale parameters directly. If NULL (default), lengthscales are optimized using multi-start L-BFGS-B to minimize the specified loss.

References

Zhao J, Qing K, Xu J (2025). BKP: An R Package for Beta Kernel Process Modeling. arXiv. https://doi.org/10.48550/arXiv.2508.10447.

Examples

Run this code

#-------------------------- 1D Example ---------------------------
set.seed(123)

# Define true class probability function (3-class)
true_pi_fun <- function(X) {
  p1 <- 1/(1+exp(-3*X))
  p2 <- (1 + exp(-X^2) * cos(10 * (1 - exp(-X)) / (1 + exp(-X)))) / 2
  return(matrix(c(p1/2, p2/2, 1 - (p1+p2)/2), nrow = length(p1)))
}

n <- 30
Xbounds <- matrix(c(-2, 2), nrow = 1)
X <- tgp::lhs(n = n, rect = Xbounds)
true_pi <- true_pi_fun(X)
m <- sample(150, n, replace = TRUE)

# Generate multinomial responses
Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ])))

# Fit DKP model
model1 <- fit_DKP(X, Y, Xbounds = Xbounds)
print(model1)


#-------------------------- 2D Example ---------------------------
set.seed(123)

# Define latent function and transform to 3-class probabilities
true_pi_fun <- function(X) {
  if (is.null(nrow(X))) X <- matrix(X, nrow = 1)
  m <- 8.6928; s <- 2.4269
  x1 <- 4 * X[,1] - 2
  x2 <- 4 * X[,2] - 2
  a <- 1 + (x1 + x2 + 1)^2 *
    (19 - 14*x1 + 3*x1^2 - 14*x2 + 6*x1*x2 + 3*x2^2)
  b <- 30 + (2*x1 - 3*x2)^2 *
    (18 - 32*x1 + 12*x1^2 + 48*x2 - 36*x1*x2 + 27*x2^2)
  f <- (log(a*b)- m)/s
  p1 <- pnorm(f) # Transform to probability
  p2 <- sin(pi * X[,1]) * sin(pi * X[,2])
  return(matrix(c(p1/2, p2/2, 1 - (p1+p2)/2), nrow = length(p1)))
}

n <- 100
Xbounds <- matrix(c(0, 0, 1, 1), nrow = 2)
X <- tgp::lhs(n = n, rect = Xbounds)
true_pi <- true_pi_fun(X)
m <- sample(150, n, replace = TRUE)

# Generate multinomial responses
Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ])))

# Fit DKP model
model2 <- fit_DKP(X, Y, Xbounds = Xbounds)
print(model2)