gkwreg: Generalized Kumaraswamy Regression Models

Overview

The gkwreg package provides a robust and efficient framework for modeling data restricted to the standard unit interval $(0, 1)$, such as proportions, rates, fractions, or indices. While the Beta distribution is commonly used for such data, gkwreg focuses on the Generalized Kumaraswamy (GKw) distribution family, offering enhanced flexibility by encompassing several important bounded distributions (including Beta and Kumaraswamy) as special cases.

The package supports both distribution fitting and regression modeling, in which any or all distribution parameters can be modeled as functions of covariates through a choice of link functions. Estimation is by maximum likelihood using the Template Model Builder (TMB) framework, whose automatic differentiation supplies exact gradients and Hessians for fast and stable optimization.

Key Features

  • Flexible Distribution Family: Model data using the 5-parameter Generalized Kumaraswamy (GKw) distribution and its six nested sub-families (seven distributions in total):

    | Distribution              | Code | Parameters Modeled                 | Fixed Parameters                 | # Par. |
    |---------------------------|------|------------------------------------|----------------------------------|--------|
    | Generalized Kumaraswamy   | gkw  | alpha, beta, gamma, delta, lambda  | None                             | 5      |
    | Beta-Kumaraswamy          | bkw  | alpha, beta, gamma, delta          | lambda = 1                       | 4      |
    | Kumaraswamy-Kumaraswamy   | kkw  | alpha, beta, delta, lambda         | gamma = 1                        | 4      |
    | Exponentiated Kumaraswamy | ekw  | alpha, beta, lambda                | gamma = 1, delta = 0             | 3      |
    | McDonald / Beta Power     | mc   | gamma, delta, lambda               | alpha = 1, beta = 1              | 3      |
    | Kumaraswamy               | kw   | alpha, beta                        | gamma = 1, delta = 0, lambda = 1 | 2      |
    | Beta                      | beta | gamma, delta                       | alpha = 1, beta = 1, lambda = 1  | 2      |
  • Advanced Regression Modeling (gkwreg): Independently model each relevant distribution parameter as a function of covariates using a flexible formula interface:

    y ~ alpha_terms | beta_terms | gamma_terms | delta_terms | lambda_terms
  • Multiple Link Functions: Choose appropriate link functions for each parameter, including:

    • log (default for all parameters)
    • logit, probit, cloglog (with optional scaling)
    • identity, inverse, sqrt
  • Efficient Estimation: Utilizes the TMB package for fast and stable Maximum Likelihood Estimation, leveraging automatic differentiation for precise gradient and Hessian calculations.

  • Standard R Interface: Provides familiar methods like summary(), predict(), plot(), coef(), vcov(), logLik(), AIC(), BIC(), residuals() for model inspection, inference, and diagnostics.

  • Distribution Utilities: Implements the standard d*, p*, q*, and r* functions, as well as analytical negative log-likelihood (ll*), gradient (gr*), and Hessian (hs*) functions, for all supported distributions in C++/RcppArmadillo.

Installation

# Install the stable version from CRAN:
install.packages("gkwreg")

# Or install the development version from GitHub:
# install.packages("devtools")
devtools::install_github("evandeilton/gkwreg")

Mathematical Background

The Generalized Kumaraswamy (GKw) Distribution

The GKw distribution is a flexible five-parameter distribution for variables on $(0, 1)$. Its cumulative distribution function (CDF) is given by:

$$F(x; \alpha, \beta, \gamma, \delta, \lambda) = I_{[1-(1-x^{\alpha})^{\beta}]^{\lambda}}(\gamma, \delta)$$

where $I_z(a,b)$ is the regularized incomplete beta function, and $\alpha, \beta, \gamma, \delta, \lambda > 0$ are the distribution parameters. The corresponding probability density function (PDF) is:

$$f(x; \alpha, \beta, \gamma, \delta, \lambda) = \frac{\lambda \alpha \beta x^{\alpha-1}}{B(\gamma, \delta)} (1-x^{\alpha})^{\beta-1} \left[1-(1-x^{\alpha})^{\beta}\right]^{\gamma\lambda-1} \left\{1-\left[1-(1-x^{\alpha})^{\beta}\right]^{\lambda}\right\}^{\delta-1}$$

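where $B(\gamma, \delta)$ is the beta function. Note that the package's Beta functions are documented with a (gamma, delta + 1) parameterization, and the sub-family table fixes delta = 0 for the Kumaraswamy case, so the `delta` argument used in code appears to correspond to $\delta - 1$ in the formulas above. Consult the function documentation (for example `dgkw`) for the exact convention.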

The five parameters collectively provide exceptional flexibility in modeling distributions on $(0, 1)$:

  • Parameters alpha and beta primarily govern the basic shape inherited from the Kumaraswamy distribution
  • Parameters gamma and delta affect tail behavior and concentration around modes
  • Parameter lambda introduces additional flexibility, influencing skewness and peak characteristics

This parameterization allows the GKw distribution to capture a wide spectrum of shapes, including symmetric, skewed, unimodal, bimodal, J-shaped, U-shaped, and bathtub-shaped forms.
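The CDF and PDF above can be checked numerically with a few lines of base R. The sketch below is not the package's implementation: the helper names `gkw_cdf`, `gkw_pdf`, and `kw_pdf` are hypothetical, and the code follows the $(\gamma, \delta)$ parameterization exactly as written in the formulas of this section, which may differ from the convention used by the package's own `dgkw()` and `pgkw()`.

```r
# Hypothetical helpers written directly from the formulas above (base R only).
gkw_cdf <- function(x, alpha, beta, gamma, delta, lambda) {
  z <- (1 - (1 - x^alpha)^beta)^lambda   # argument of the incomplete beta function
  pbeta(z, gamma, delta)                 # regularized incomplete beta I_z(gamma, delta)
}

gkw_pdf <- function(x, alpha, beta, gamma, delta, lambda) {
  w <- 1 - (1 - x^alpha)^beta
  lambda * alpha * beta * x^(alpha - 1) / base::beta(gamma, delta) *
    (1 - x^alpha)^(beta - 1) * w^(gamma * lambda - 1) *
    (1 - w^lambda)^(delta - 1)
}

x <- seq(0.05, 0.95, by = 0.05)

# Check 1: the density integrates to one over (0, 1).
total <- integrate(gkw_pdf, 0, 1, alpha = 2, beta = 3, gamma = 1.5,
                   delta = 2, lambda = 1.2)$value

# Check 2: with alpha = beta = lambda = 1 the formula reduces to the Beta density.
max_beta_gap <- max(abs(gkw_pdf(x, 1, 1, 2, 3, 1) - dbeta(x, 2, 3)))

# Check 3: with gamma = delta = lambda = 1 it reduces to the Kumaraswamy density.
kw_pdf <- function(x, a, b) a * b * x^(a - 1) * (1 - x^a)^(b - 1)
max_kw_gap <- max(abs(gkw_pdf(x, 2, 3, 1, 1, 1) - kw_pdf(x, 2, 3)))

# Check 4: the CDF is non-decreasing and stays inside [0, 1].
cdf_vals <- gkw_cdf(x, 2, 3, 1.5, 2, 1.2)
```

The two reductions illustrate how fixing parameters at 1 recovers nested members of the family under the parameterization of the displayed formulas.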

Regression Framework

In the regression setting, we assume that the response variable $y_i \in (0,1)$ follows a distribution from the GKw family with parameters $\theta_i = (\alpha_i, \beta_i, \gamma_i, \delta_i, \lambda_i)^{\top}$. Each parameter $\theta_{ip}$ (where $p \in \{\alpha, \beta, \gamma, \delta, \lambda\}$) can depend on covariates through a link function $g_p(\cdot)$:

$$g_p(\theta_{ip}) = \eta_{ip} = \mathbf{x}_{ip}^{\top}\boldsymbol{\beta}_p$$

where $\eta_{ip}$ is the linear predictor, and $\boldsymbol{\beta}_p$ is the vector of regression coefficients. Equivalently, $\theta_{ip} = g_p^{-1}(\eta_{ip})$. The default link function is log for all parameters, ensuring the positivity constraint.
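The mapping from linear predictor to parameter can be illustrated with base R's `stats::make.link()`, which returns a link function together with its inverse. This is only a sketch of the idea, not the package's internal code; the design matrix and the coefficient values (`coef_alpha`, `coef_beta`) are made up for illustration.

```r
# Small design matrix: intercept plus one covariate (illustrative values).
X <- cbind(1, c(-1, 0, 1))
coef_alpha <- c(0.8, 0.3)    # hypothetical coefficients for alpha
coef_beta  <- c(1.2, -0.4)   # hypothetical coefficients for beta

# Linear predictors eta = X %*% coefficients, one per observation.
eta_alpha <- drop(X %*% coef_alpha)
eta_beta  <- drop(X %*% coef_beta)

# Log link: the inverse link exp(eta) is positive for any real eta.
log_link <- make.link("log")
alpha_i <- log_link$linkinv(eta_alpha)
beta_i  <- log_link$linkinv(eta_beta)

# Round trip: applying g_p to the parameter recovers the linear predictor.
round_trip_ok <- isTRUE(all.equal(log_link$linkfun(alpha_i), eta_alpha))

# By contrast, an identity link passes negative predictors straight through,
# which would violate the positivity constraint on the parameters.
identity_value <- make.link("identity")$linkinv(-0.5)
```

Because `exp()` maps every real number to a positive value, the log link keeps each parameter valid regardless of the covariate values, which is the positivity argument made above.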

Parameters are estimated using maximum likelihood, with the log-likelihood function:

$$\ell(\Theta; \mathbf{y}, \mathbf{X}) = \sum_{i=1}^{n} \log f(y_i; \theta_i)$$

where each parameter $\theta_{ip}$ depends on $\Theta$ (the complete set of regression coefficients) via the link functions and linear predictors.
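As a rough illustration of this likelihood, the following base R sketch fits a two-parameter Kumaraswamy regression with log links by minimizing the negative log-likelihood with `optim()`. It is an assumption-laden toy, not how the package estimates models (the package uses TMB): the function `nll`, the simulated data, and the coefficient values are all hypothetical, and the Kumaraswamy density $f(y) = \alpha\beta y^{\alpha-1}(1-y^{\alpha})^{\beta-1}$ is used directly.

```r
set.seed(1)
n  <- 500
x1 <- runif(n, -1, 1)
X  <- cbind(1, x1)

# Hypothetical true coefficients:
# (alpha intercept, alpha slope, beta intercept, beta slope)
theta_true <- c(0.8, 0.3, 1.2, -0.4)
alpha <- exp(drop(X %*% theta_true[1:2]))
beta  <- exp(drop(X %*% theta_true[3:4]))

# Kumaraswamy draws by inverting F(y) = 1 - (1 - y^alpha)^beta.
u <- runif(n)
y <- (1 - (1 - u)^(1 / beta))^(1 / alpha)

# Negative log-likelihood with log links on both parameters.
nll <- function(theta, y, X) {
  a <- exp(drop(X %*% theta[1:2]))
  b <- exp(drop(X %*% theta[3:4]))
  -sum(log(a) + log(b) + (a - 1) * log(y) + (b - 1) * log1p(-y^a))
}

# Quasi-Newton minimization with finite-difference gradients.
fit <- optim(rep(0, 4), nll, y = y, X = X, method = "BFGS")
round(fit$par, 2)   # estimates, to be compared with theta_true
```

Here `optim()` approximates gradients numerically; supplying exact derivatives, as TMB does, is what the next section is about.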

Computational Engine: TMB

The package uses Template Model Builder (TMB) (Kristensen et al. 2016) as its computational backend. TMB translates the statistical model into C++ templates and uses Automatic Differentiation (AD) to compute exact gradients and Hessians, providing several advantages:

  • Speed: AD combined with compiled C++ is significantly faster than numerical differentiation or pure R implementations
  • Accuracy: AD provides derivatives accurate to machine precision
  • Stability: Precise derivatives improve optimization stability and convergence reliability
  • Scalability: Efficiently handles models with many parameters

Examples

Regression Modeling

Model parameters of a GKw family distribution as functions of covariates:

library(gkwreg)

# Simulate data for a Kumaraswamy regression model
set.seed(123)
n <- 100
x1 <- runif(n, -2, 2)
x2 <- rnorm(n)

# Simulate true parameters (using log link)
alpha_true <- exp(0.8 + 0.3 * x1 - 0.2 * x2) 
beta_true  <- exp(1.2 - 0.4 * x1 + 0.1 * x2)

# Generate response
y <- rkw(n, alpha = alpha_true, beta = beta_true)
y <- pmax(pmin(y, 1 - 1e-7), 1e-7)  # Ensure y in (0, 1)
df1 <- data.frame(y = y, x1 = x1, x2 = x2)

# Fit Kumaraswamy regression: alpha ~ x1 + x2, beta ~ x1 + x2
kw_model <- gkwreg(y ~ x1 + x2 | x1 + x2, data = df1, family = "kw")
summary(kw_model)

Real Data Analysis

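# Food Expenditure Data (the dataset ships with the betareg package, which must be installed)
library(gkwreg)
data("FoodExpenditure", package = "betareg")
# Response: share of household income spent on food
FoodExpenditure$y <- FoodExpenditure$food / FoodExpenditure$income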

# Fit models from different GKw families
kkw_model <- gkwreg(y ~ income, data = FoodExpenditure, family = "kkw")
ekw_model <- gkwreg(y ~ income, data = FoodExpenditure, family = "ekw")
kw_model <- gkwreg(y ~ income, data = FoodExpenditure, family = "kw")

# Compare models (one row per family, named by its code)
models <- list(kkw = kkw_model, ekw = ekw_model, kw = kw_model)
data.frame(
  logLik = sapply(models, function(m) as.numeric(logLik(m))),
  AIC = sapply(models, AIC),
  BIC = sapply(models, BIC)
)

# Summary
summary(kw_model)

res <- residuals(kw_model, type = "quantile")

# Visual diagnostics
plot(kw_model)

# Predicted
pred <- predict(kw_model)

Distribution Fitting

Fit a GKw family distribution to univariate data (no covariates):

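# Simulate Beta data. rbeta_ is documented with a (gamma, delta + 1) parameterization,
# so gamma = 2, delta = 3 corresponds to Beta(2, 4) in the usual (shape1, shape2) form
set.seed(2203)
y_beta <- rbeta_(1000, gamma = 2, delta = 3)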

# Fit Beta and Kumaraswamy distributions
fit_beta <- gkwfit(data = y_beta, family = "beta")
fit_kw <- gkwfit(data = y_beta, family = "kw")

# Compare models
summary(fit_beta)
summary(fit_kw)
AIC(fit_beta)
AIC(fit_kw)

Diagnostic Methods

The package provides several diagnostic tools for model assessment:

# Residual analysis (df1 is the simulated data frame from the Regression Modeling example)
model <- gkwreg(y ~ x1 | x2, data = df1, family = "kw")
res <- residuals(model, type = "quantile")  # Randomized quantile residuals

# Visual diagnostics
plot(model)  # QQ-plot, residuals vs. fitted, etc.

pred <- predict(model, type = "response")
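To show what a quantile residual is, here is a minimal base R sketch for the Kumaraswamy case: each observation is passed through the model CDF and then through the standard normal quantile function, so a correctly specified model yields residuals that look standard normal. This is an illustration under stated assumptions, not the package's `residuals()` code: `pkw_sketch` is a hypothetical helper, and the true parameters are used in place of fitted ones.

```r
set.seed(7)
n <- 1000
a <- 2; b <- 3
# Kumaraswamy(2, 3) sample by inverse-CDF sampling.
y <- (1 - (1 - runif(n))^(1 / b))^(1 / a)

# Kumaraswamy CDF F(y) = 1 - (1 - y^a)^b (hypothetical helper).
pkw_sketch <- function(y, a, b) 1 - (1 - y^a)^b

# Quantile residuals: r_i = qnorm(F(y_i)).
r <- qnorm(pkw_sketch(y, a, b))

# Under a correct model the residuals should be roughly N(0, 1).
c(mean = mean(r), sd = sd(r))
# qqnorm(r); qqline(r)   # points close to the line suggest an adequate fit
```

In practice the CDF is evaluated at the fitted parameters for each observation, and systematic departures from the reference line in the QQ-plot point to a misspecified family or linear predictor.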

References

  • Cordeiro, G. M., & de Castro, M. (2011). A new family of generalized distributions. Journal of Statistical Computation and Simulation, 81(7), 883-898.

  • Carrasco, J. M. F., Ferrari, S. L. P., & Cordeiro, G. M. (2010). A new generalized Kumaraswamy distribution. arXiv preprint arXiv:1004.0911.

  • Jones, M. C. (2009). Kumaraswamy’s distribution: A beta-type distribution with some tractability advantages. Statistical Methodology, 6(1), 70-81.

  • Kristensen, K., Nielsen, A., Berg, C. W., Skaug, H., & Bell, B. M. (2016). TMB: Automatic Differentiation and Laplace Approximation. Journal of Statistical Software, 70(5), 1-21.

  • Kumaraswamy, P. (1980). A generalized probability density function for double-bounded random processes. Journal of Hydrology, 46(1-2), 79-88.

  • Ferrari, S. L. P., & Cribari-Neto, F. (2004). Beta regression for modelling rates and proportions. Journal of Applied Statistics, 31(7), 799-815.

  • Cribari-Neto, F., & Zeileis, A. (2010). Beta Regression in R. Journal of Statistical Software, 34(2), 1-24.

  • Lopes, J. E. (2025). Generalized Kumaraswamy Regression Models with gkwreg. Journal of Statistical Software, forthcoming.

Comparing with Other Packages

The gkwreg package complements and extends existing approaches for modeling bounded data:

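| Feature             | gkwreg                          | betareg         | gamlss                | brms                  |
|---------------------|---------------------------------|-----------------|-----------------------|-----------------------|
| Distribution Family | GKw hierarchy (7 distributions) | Beta            | Multiple              | Multiple              |
| Estimation Method   | MLE via TMB                     | MLE             | MLE/GAMLSS            | Bayesian              |
| Parameter Modeling  | All parameters                  | Mean, precision | All parameters        | All parameters        |
| Computation Speed   | Fast (TMB + AD)                 | Fast            | Moderate              | Slow (MCMC)           |
| Default Link        | log                             | logit (mean)    | Distribution-specific | Distribution-specific |
| Random Effects      | No                              | No              | Yes                   | Yes                   |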

Contributing

Contributions to gkwreg are welcome! Please feel free to submit issues or pull requests on the GitHub repository.

License

This package is licensed under the MIT License. See the LICENSE file for details.

Author and Maintainer

Lopes, J. E. (evandeilton@gmail.com)
LEG - Laboratório de Estatística e Geoinformação
UFPR - Universidade Federal do Paraná, Brazil

Version: 1.0.7
License: MIT + file LICENSE
Maintainer: Lopes J. E.
Last Published: May 1st, 2025

Functions in gkwreg (1.0.7)

  • calculateScoreResiduals: Calculate Score Residuals
  • calculateResponseResiduals: Calculate Response Residuals
  • calculate_fit_metrics: Calculate additional fit metrics for all models
  • calculatePearsonResiduals: Calculate Pearson Residuals
  • coef.gkwfit: Extract Model Coefficients from a gkwfit Object
  • calculatePartialResiduals: Calculate Partial Residuals
  • calculateParameters: Calculate Parameters for the Generalized Kumaraswamy Distribution
  • dbeta_: Density of the Beta Distribution (gamma, delta+1 Parameterization)
  • calculateQuantileResiduals: Calculate Quantile Residuals
  • calculateQuantiles: Calculate Quantiles for Distribution
  • dkkw: Density of the Kumaraswamy-Kumaraswamy (kkw) Distribution
  • dbkw: Density of the Beta-Kumaraswamy (BKw) Distribution
  • coef.gkwreg: Extract Coefficients from a Fitted GKw Regression Model
  • confint.gkwfit: Compute Confidence Intervals for gkwfit Parameters
  • .calculate_prediction_metrics: Calculate Prediction Accuracy Metrics
  • create_comparison_plots: Create enhanced comparison plots of all fitted distributions
  • .calculate_moment_comparisons: Calculate Moment Comparisons
  • create_comparison_table: Create comparison table of fit statistics with expanded metrics
  • .calculate_theoretical_quantiles: Calculate Theoretical Quantiles for GKw Family Distributions
  • .calculate_likelihood_statistics: Calculate Likelihood Statistics
  • .calculate_model_parameters: Calculate model parameters for the specified family
  • .calculate_distance_tests: Calculate Distance-Based Test Statistics
  • dgkw: Density of the Generalized Kumaraswamy Distribution
  • dekw: Density of the Exponentiated Kumaraswamy (EKw) Distribution
  • dkw: Density of the Kumaraswamy (Kw) Distribution
  • .convert_links_to_int: Convert Link Function Names to TMB Integers
  • .calculate_residuals: Calculate residuals based on the specified type
  • .calculate_sample_moments: Calculate Sample Moments
  • .calculate_sim_residuals: Calculate residuals for simulated data
  • .calculate_gof: Calculate goodness-of-fit statistics
  • .extract_model_params: Extract model parameters from a gkwreg object with family-specific handling
  • .check_and_compile_TMB_code: Check and Compile TMB Model Code with Persistent Cache
  • .calculate_theoretical_pdf: Calculate Theoretical PDF Values for GKw Family Distributions
  • .calculate_theoretical_moments: Calculate Theoretical Moments for GKw Family Distributions
  • dmc: Density of the McDonald (Mc)/Beta Power Distribution
  • .calculate_theoretical_cdf: Calculate Theoretical CDF Values for GKw Family Distributions
  • .calculate_diagnostic_measures: Calculate diagnostic measures for gkwreg plots
  • .extract_parameter_vectors: Extract parameter vectors from parameter matrix
  • .extract_model_matrices: Extract model matrices from a gkwreg object with family-specific handling
  • .format_coefficient_names: Format Coefficient Names Based on Family and Model Matrices
  • .plot_gkwreg_ggplot: Generate diagnostic plots using ggplot2
  • .prepare_tmb_data: Prepare TMB Data for GKw Regression
  • .fit_submodels_tmb: Fit submodels for the GKw family for model comparison
  • .get_family_fixed_defaults: Get default fixed parameters for each GKw family
  • .create_bar_plot: Create Bar Plot for Model Comparison
  • calculateProbabilities: Calculate Cumulative Probabilities for Distribution
  • .extract_model_data: Extract Model Data for GKw Regression
  • .generate_additional_plots: Generate Additional Diagnostic Plots Beyond Those in gkwfit
  • .calculate_half_normal_data: Calculate half-normal plot data with envelope
  • .create_radar_plot: Create Radar Plot for Model Comparison
  • .family_to_code: Convert family string to numeric code for TMB
  • .plot_base_r_half_normal: Plot half-normal plot (base R)
  • .fit_tmb: Fit GKw family distributions using TMB
  • .get_family_param_info: Get family parameter information
  • .create_plot_titles: Create formatted plot titles
  • get_bounded_datasets: Access datasets from bounded response regression packages
  • .plot_base_r_leverage_vs_fitted: Plot leverage vs. fitted (base R)
  • .fit_submodels: Fit submodels for comparison
  • .validate_and_prepare_gkwreg_diagnostics: Validate inputs and prepare diagnostic data for gkwreg plots
  • .simulate_p_values_bootstrap: Simulate P-Values Using Parametric Bootstrap
  • .calculate_information_criteria: Calculate Information Criteria
  • .get_default_fixed: Get default fixed parameters for a family
  • generate_report: Generate R Markdown report with analysis results
  • .calculate_profiles: Calculate profile likelihoods
  • .plot_ggplot_predicted_vs_observed: Plot predicted vs. observed (ggplot2)
  • .plot_base_r_cooks_distance: Plot Cook's distance (base R)
  • .calculate_probability_plot_metrics: Calculate Probability Plot Metrics
  • .create_table_plot: Create Table Plot for Model Comparison
  • .determine_start_values: Determine initial parameter values
  • .get_default_start: Get default start values for a family
  • .map_gkwreg_to_tmb_param: Map gkwreg parameter index to TMB parameter index
  • .validate_data: Validate data for GKw family distributions
  • .plot_base_r_residuals_vs_linpred: Plot residuals vs. linear predictor (base R)
  • .plot_ggplot_cooks_distance: Plot Cook's distance (ggplot2)
  • .validate_parameters: Validate parameters for GKw family distributions
  • .plot_ggplot_residuals_vs_linpred: Plot residuals vs. linear predictor (ggplot2)
  • .process_formula_parts: Process Formula Parts from a Formula Object
  • .plot_gkwreg_base_r: Generate diagnostic plots using base R graphics
  • .process_fixed: Process Fixed Parameters for GKw Regression
  • get_quantile_function: Get the quantile function for a fitted GKw distribution model
  • .plot_ggplot_residuals_vs_index: Plot residuals vs. index (ggplot2)
  • .simulate_from_distribution: Simulate observations from a specified distribution family
  • .plot_ggplot_leverage_vs_fitted: Plot leverage vs. fitted (ggplot2)
  • .generate_plots: Generate diagnostic plots for distribution models
  • get_cdf_function: Get the CDF function for a fitted GKw distribution model
  • get_density_function: Get the density function for a fitted GKw distribution model
  • grbeta: Gradient of the Negative Log-Likelihood for the Beta Distribution (gamma, delta+1 Parameterization)
  • gkwgof: Comprehensive Goodness-of-Fit Analysis for GKw Family Distributions
  • .plot_ggplot_half_normal: Plot half-normal plot (ggplot2)
  • .generate_random_samples: Generate Random Samples from GKw Family Distributions
  • .sample_model_data: Sample model data for large datasets
  • gkwreg: Fit Generalized Kumaraswamy Regression Models
  • gkwfitall: Fit All or Selected Generalized Kumaraswamy Family Distributions and Compare Them
  • gkwfit: Fit Generalized Kumaraswamy Distribution via Maximum Likelihood Estimation using TMB
  • gkwgetstartvalues: Main function to estimate GKw distribution parameters using the method of moments. This implementation is optimized for numerical stability and computational efficiency.
  • grbkw: Gradient of the Negative Log-Likelihood for the BKw Distribution
  • .plot_base_r_predicted_vs_observed: Plot predicted vs. observed (base R)
  • grgkw: Gradient of the Negative Log-Likelihood for the GKw Distribution
  • grkkw: Gradient of the Negative Log-Likelihood for the kkw Distribution
  • grekw: Gradient of the Negative Log-Likelihood for the EKw Distribution
  • grkw: Gradient of the Negative Log-Likelihood for the Kumaraswamy (Kw) Distribution
  • grmc: Gradient of the Negative Log-Likelihood for the McDonald (Mc)/Beta Power Distribution
  • hskkw: Hessian Matrix of the Negative Log-Likelihood for the kkw Distribution
  • llbkw: Negative Log-Likelihood for Beta-Kumaraswamy (BKw) Distribution
  • hsgkw: Hessian Matrix of the Negative Log-Likelihood for the GKw Distribution
  • llekw: Negative Log-Likelihood for the Exponentiated Kumaraswamy (EKw) Distribution
  • .plot_base_r_residuals_vs_index: Plot residuals vs. index (base R)
  • .print_gof_summary: Print Formatted Summary of Goodness-of-Fit Statistics
  • hsbeta: Hessian Matrix of the Negative Log-Likelihood for the Beta Distribution (gamma, delta+1 Parameterization)
  • .prepare_tmb_params: Prepare TMB Parameters for GKw Regression
  • .process_link: Process Link Functions for GKw Regression
  • extract_gof_stats: Extract Key Statistics from gkwgof Objects
  • list_bounded_datasets: List all available datasets for bounded response regression
  • hsbkw: Hessian Matrix of the Negative Log-Likelihood for the BKw Distribution
  • hsekw: Hessian Matrix of the Negative Log-Likelihood for the EKw Distribution
  • .process_link_scale: Process Link Scales for GKw Regression
  • hskw: Hessian Matrix of the Negative Log-Likelihood for the Kw Distribution
  • fitted.gkwreg: Extract Fitted Values from a Generalized Kumaraswamy Regression Model
  • hsmc: Hessian Matrix of the Negative Log-Likelihood for the McDonald (Mc)/Beta Power Distribution
  • pbkw: Cumulative Distribution Function (CDF) of the Beta-Kumaraswamy (BKw) Distribution
  • llbeta: Negative Log-Likelihood for the Beta Distribution (gamma, delta+1 Parameterization)
  • logLik.gkwfit: Extract Log-Likelihood from a gkwfit Object
  • llkw: Negative Log-Likelihood of the Kumaraswamy (Kw) Distribution
  • logLik.gkwreg: Extract Log-Likelihood from a Generalized Kumaraswamy Regression Model
  • llgkw: Negative Log-Likelihood for the Generalized Kumaraswamy Distribution
  • pbeta_: CDF of the Beta Distribution (gamma, delta+1 Parameterization)
  • nrgkw: Enhanced Newton-Raphson Optimization for GKw Family Distributions
  • llkkw: Negative Log-Likelihood for the kkw Distribution
  • pekw: Cumulative Distribution Function (CDF) of the EKw Distribution
  • pkkw: Cumulative Distribution Function (CDF) of the kkw Distribution
  • plot.gkwreg: Diagnostic Plots for Generalized Kumaraswamy Regression Models
  • pgkw: Generalized Kumaraswamy Distribution CDF
  • plot.gkwgof: Plot Method for gkwgof Objects
  • %>%: Pipe operator
  • pkw: Cumulative Distribution Function (CDF) of the Kumaraswamy (Kw) Distribution
  • plot.gkwfitall: Plot method for gkwfitall objects
  • plotcompare: Compare Goodness-of-Fit Results Across Multiple Models
  • llmc: Negative Log-Likelihood for the McDonald (Mc)/Beta Power Distribution
  • plot.gkwfit: Plot Diagnostics for a gkwfit Object
  • pmc: CDF of the McDonald (Mc)/Beta Power Distribution
  • print.summary.gkwgof: Print Method for summary.gkwgof Objects
  • print.gkwfit: Print Method for gkwfit Objects
  • print.gkwfitall: Print method for gkwfitall objects
  • print.anova.gkwfit: S3 method for class 'anova.gkwfit'
  • print.gkwgof: Print Method for gkwgof Objects
  • print.summary.gkwfitall: Print method for summary.gkwfitall objects
  • predict.gkwreg: Predictions from a Fitted Generalized Kumaraswamy Regression Model
  • print.summary.gkwreg: Print Method for Generalized Kumaraswamy Regression Summaries
  • print.summary.gkwfit: Print Method for summary.gkwfit Objects
  • qbeta_: Quantile Function of the Beta Distribution (gamma, delta+1 Parameterization)
  • qkw: Quantile Function of the Kumaraswamy (Kw) Distribution
  • residuals.gkwreg: Extract Residuals from a Generalized Kumaraswamy Regression Model
  • rbkw: Random Number Generation for the Beta-Kumaraswamy (BKw) Distribution
  • qbkw: Quantile Function of the Beta-Kumaraswamy (BKw) Distribution
  • rbeta_: Random Generation for the Beta Distribution (gamma, delta+1 Parameterization)
  • rekw: Random Number Generation for the Exponentiated Kumaraswamy (EKw) Distribution
  • qmc: Quantile Function of the McDonald (Mc)/Beta Power Distribution
  • qkkw: Quantile Function of the Kumaraswamy-Kumaraswamy (kkw) Distribution
  • qgkw: Generalized Kumaraswamy Distribution Quantile Function
  • qekw: Quantile Function of the Exponentiated Kumaraswamy (EKw) Distribution
  • summary.gkwgof: Summary Method for gkwgof Objects
  • vcov.gkwreg: Extract Variance-Covariance Matrix from a Generalized Kumaraswamy Regression Model
  • summary.gkwreg: Summary Method for Generalized Kumaraswamy Regression Models
  • summary.gkwfit: Summary Method for gkwfit Objects
  • summary.gkwfitall: Summary method for gkwfitall objects
  • rmc: Random Number Generation for the McDonald (Mc)/Beta Power Distribution
  • rgkw: Generalized Kumaraswamy Distribution Random Generation
  • vcov.gkwfit: Extract Variance-Covariance Matrix from a gkwfit Object
  • rkw: Random Number Generation for the Kumaraswamy (Kw) Distribution
  • rkkw: Random Number Generation for the kkw Distribution
  • AIC.gkwreg: Akaike's Information Criterion for GKw Regression Models
  • calculateModifiedDevianceResiduals: Calculate Modified Deviance Residuals
  • calculateDevianceResiduals: Calculate Deviance Residuals
  • calculateDensities: Calculate Densities for Distribution
  • calculateMeans: Calculate Means for Distribution
  • calculateCoxSnellResiduals: Calculate Cox-Snell Residuals
  • anova.gkwfit: Compare Fitted gkwfit Models using Likelihood Ratio Tests
  • BIC.gkwfit: Calculate Bayesian Information Criterion (BIC) for gkwfit Objects
  • AIC.gkwfit: Calculate AIC or BIC for gkwfit Objects
  • BIC.gkwreg: Bayesian Information Criterion for GKw Regression Models