gkwreg: Generalized Kumaraswamy Regression Models for Bounded Data

Overview

The gkwreg package provides a robust and efficient framework for modeling data restricted to the standard unit interval $(0, 1)$, such as proportions, rates, fractions, or indices. While the Beta distribution is commonly used for such data, gkwreg focuses on the Generalized Kumaraswamy (GKw) distribution family, offering enhanced flexibility by encompassing several important bounded distributions (including Beta and Kumaraswamy) as special cases.

The package facilitates both distribution fitting and regression modeling with potentially all distribution parameters modeled as functions of covariates using various link functions. Estimation is performed efficiently via Maximum Likelihood leveraging the Template Model Builder (TMB) framework, which utilizes automatic differentiation for superior speed, accuracy, and stability.

Key Features

  • Flexible Distribution Family: Model data using the 5-parameter Generalized Kumaraswamy (GKw) distribution or any of its six nested sub-families (seven distributions in total):

    | Distribution | Code | Parameters Modeled | Fixed Parameters | # Par. |
    |---|---|---|---|---|
    | Generalized Kumaraswamy | gkw | alpha, beta, gamma, delta, lambda | None | 5 |
    | Beta-Kumaraswamy | bkw | alpha, beta, gamma, delta | lambda = 1 | 4 |
    | Kumaraswamy-Kumaraswamy | kkw | alpha, beta, delta, lambda | gamma = 1 | 4 |
    | Exponentiated Kumaraswamy | ekw | alpha, beta, lambda | gamma = 1, delta = 0 | 3 |
    | McDonald / Beta Power | mc | gamma, delta, lambda | alpha = 1, beta = 1 | 3 |
    | Kumaraswamy | kw | alpha, beta | gamma = 1, delta = 0, lambda = 1 | 2 |
    | Beta | beta | gamma, delta | alpha = 1, beta = 1, lambda = 1 | 2 |
  • Advanced Regression Modeling (gkwreg): Independently model each relevant distribution parameter as a function of covariates using a flexible formula interface:

    y ~ alpha_terms | beta_terms | gamma_terms | delta_terms | lambda_terms
  • Multiple Link Functions: Choose appropriate link functions for each parameter, including:

    • log (default for all parameters)
    • logit, probit, cloglog (with optional scaling)
    • identity, inverse, sqrt
  • Efficient Estimation: Utilizes the TMB package for fast and stable Maximum Likelihood Estimation, leveraging automatic differentiation for precise gradient and Hessian calculations.

  • Standard R Interface: Provides familiar methods like summary(), predict(), plot(), coef(), vcov(), logLik(), AIC(), BIC(), residuals() for model inspection, inference, and diagnostics.

  • Distribution Utilities: Implements the standard d*, p*, q*, and r* functions, together with analytical negative log-likelihood (ll*), gradient (gr*), and Hessian (hs*) functions, for all supported distributions in C++/RcppArmadillo.
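
The link functions named in the list above all exist in base R, so their behaviour can be inspected without the package. The sketch below uses stats::make.link() only; it does not call gkwreg, and the value of eta is an arbitrary illustration.

```r
# stats::make.link() returns the link g (linkfun) and its inverse (linkinv).
link_names <- c("log", "logit", "probit", "cloglog",
                "identity", "inverse", "sqrt")
links <- lapply(setNames(link_names, link_names), make.link)

eta <- 0.5  # arbitrary linear predictor value, for illustration only

# Parameter value implied by each link: theta = g^{-1}(eta)
theta <- sapply(links, function(l) l$linkinv(eta))
print(round(theta, 4))

# Round trip: g(g^{-1}(eta)) should recover eta for every link
round_trip <- sapply(links, function(l) l$linkfun(l$linkinv(eta)))
print(round_trip)
```

Note that logit, probit, and cloglog map the linear predictor into (0, 1), while the GKw parameters are only required to be positive; this is presumably why the package pairs those links with an optional scale.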

Installation

# Install the stable version from CRAN:
install.packages("gkwreg")

# Or install the development version from GitHub:
# install.packages("devtools")
devtools::install_github("evandeilton/gkwreg")

Mathematical Background

The Generalized Kumaraswamy (GKw) Distribution

The GKw distribution is a flexible five-parameter distribution for variables on $(0, 1)$. Its cumulative distribution function (CDF) is given by:

$$F(x; \alpha, \beta, \gamma, \delta, \lambda) = I_{[1-(1-x^{\alpha})^{\beta}]^{\lambda}}(\gamma, \delta)$$

where $I_z(a,b)$ is the regularized incomplete beta function, and $\alpha, \beta, \gamma, \delta, \lambda > 0$ are the distribution parameters. The corresponding probability density function (PDF) is:

$$f(x; \alpha, \beta, \gamma, \delta, \lambda) = \frac{\lambda \alpha \beta x^{\alpha-1}}{B(\gamma, \delta)} (1-x^{\alpha})^{\beta-1} \left[1-(1-x^{\alpha})^{\beta}\right]^{\gamma\lambda-1} \left\{1-\left[1-(1-x^{\alpha})^{\beta}\right]^{\lambda}\right\}^{\delta-1}$$

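where $B(\gamma, \delta)$ is the beta function. Note that the sub-family table under Key Features fixes delta = 0 for the Kw and EKw members, and the function index describes a (gamma, delta+1) parameterization, which suggests that the package functions shift $\delta$ by one relative to the formulas printed here.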

The five parameters collectively provide exceptional flexibility in modeling distributions on $(0, 1)$:

  • Parameters alpha and beta primarily govern the basic shape inherited from the Kumaraswamy distribution
  • Parameters gamma and delta affect tail behavior and concentration around modes
  • Parameter lambda introduces additional flexibility, influencing skewness and peak characteristics

This parameterization allows the GKw distribution to capture a wide spectrum of shapes, including symmetric, skewed, unimodal, bimodal, J-shaped, U-shaped, and bathtub-shaped forms.
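
To make the formulas concrete, here is a minimal base R sketch that codes the CDF and PDF exactly as printed in this section and checks three things numerically: the density integrates to one, it matches the derivative of the CDF, and it collapses to the Beta and Kumaraswamy densities at the corresponding parameter values. The names gkw_cdf and gkw_pdf are illustrative and are not the package's pgkw() and dgkw(), which may use a different delta parameterization.

```r
# CDF and PDF transcribed from the formulas above ((gamma, delta) form).
gkw_cdf <- function(x, alpha, beta, gamma, delta, lambda) {
  z <- (1 - (1 - x^alpha)^beta)^lambda
  pbeta(z, gamma, delta)  # regularized incomplete beta I_z(gamma, delta)
}

gkw_pdf <- function(x, alpha, beta, gamma, delta, lambda) {
  w <- 1 - (1 - x^alpha)^beta
  lambda * alpha * beta * x^(alpha - 1) / base::beta(gamma, delta) *
    (1 - x^alpha)^(beta - 1) * w^(gamma * lambda - 1) *
    (1 - w^lambda)^(delta - 1)
}

# Arbitrary parameter values chosen for illustration
p <- list(alpha = 2, beta = 3, gamma = 1.5, delta = 2, lambda = 1.2)
pdf_p <- function(x) do.call(gkw_pdf, c(list(x), p))
cdf_p <- function(x) do.call(gkw_cdf, c(list(x), p))

# 1. The density should integrate to one over (0, 1)
total <- integrate(pdf_p, 0, 1)$value

# 2. A central difference of the CDF should reproduce the PDF
x <- seq(0.05, 0.95, by = 0.05)
h <- 1e-6
deriv_gap <- max(abs((cdf_p(x + h) - cdf_p(x - h)) / (2 * h) - pdf_p(x)))

# 3. Nested cases under this form of the density
beta_gap <- max(abs(
  gkw_pdf(x, alpha = 1, beta = 1, gamma = 2, delta = 3, lambda = 1) -
    dbeta(x, 2, 3)))
kw_gap <- max(abs(
  gkw_pdf(x, alpha = 2, beta = 3, gamma = 1, delta = 1, lambda = 1) -
    2 * 3 * x * (1 - x^2)^2))

print(c(total = total, deriv_gap = deriv_gap,
        beta_gap = beta_gap, kw_gap = kw_gap))
```

Under the printed formula the Kumaraswamy case corresponds to gamma = delta = lambda = 1, which is why the check uses delta = 1 rather than the delta = 0 listed in the sub-family table.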

Regression Framework

In the regression setting, we assume that the response variable $y_i \in (0,1)$ follows a distribution from the GKw family with parameters $\theta_i = (\alpha_i, \beta_i, \gamma_i, \delta_i, \lambda_i)^{\top}$. Each parameter $\theta_{ip}$ (where $p \in \{\alpha, \beta, \gamma, \delta, \lambda\}$) can depend on covariates through a link function $g_p(\cdot)$:

$$g_p(\theta_{ip}) = \eta_{ip} = \mathbf{x}_{ip}^{\top}\boldsymbol{\beta}_p$$

where $\eta_{ip}$ is the linear predictor, and $\boldsymbol{\beta}_p$ is the vector of regression coefficients. Equivalently, $\theta_{ip} = g_p^{-1}(\eta_{ip})$. The default link function is log for all parameters, which enforces the positivity constraint.

Parameters are estimated using maximum likelihood, with the log-likelihood function:

$$\ell(\Theta; \mathbf{y}, \mathbf{X}) = \sum_{i=1}^{n} \log f(y_i; \theta_i)$$

where each parameter $\theta_{ip}$ depends on $\Theta$ (the complete set of regression coefficients) via the link functions and linear predictors.
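
As a rough illustration of this framework, the following base R sketch fits a two-parameter Kumaraswamy regression with log links by minimizing the negative log-likelihood with optim(). It is a stand-in for the idea only: the package itself estimates through TMB, and the coefficient values, the shared design matrix, and the name negll are assumptions made for this example.

```r
set.seed(1)
n <- 500
x <- runif(n, -1, 1)
X <- cbind(1, x)                 # one design matrix reused for alpha and beta

# Hypothetical true coefficients (log link on both parameters)
b_alpha <- c(0.8, 0.3)
b_beta  <- c(1.2, -0.4)
alpha <- drop(exp(X %*% b_alpha))
beta  <- drop(exp(X %*% b_beta))

# Kumaraswamy draws by inverting F(y) = 1 - (1 - y^alpha)^beta
u <- runif(n)
y <- (1 - (1 - u)^(1 / beta))^(1 / alpha)

# Negative log-likelihood: -sum_i log f(y_i; alpha_i, beta_i)
negll <- function(par) {
  a <- drop(exp(X %*% par[1:2]))  # inverse log link for alpha
  b <- drop(exp(X %*% par[3:4]))  # inverse log link for beta
  -sum(log(a) + log(b) + (a - 1) * log(y) + (b - 1) * log1p(-y^a))
}

fit <- optim(rep(0, 4), negll, method = "BFGS", hessian = TRUE)
est <- fit$par
se  <- sqrt(diag(solve(fit$hessian)))  # Wald standard errors
print(cbind(true = c(b_alpha, b_beta), estimate = est, se = se))
```

Here optim() approximates gradients by finite differences, which is precisely the step that the TMB backend replaces with automatic differentiation.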

Computational Engine: TMB

The package uses Template Model Builder (TMB) (Kristensen et al. 2016) as its computational backend. TMB translates the statistical model into C++ templates and uses Automatic Differentiation (AD) to compute exact gradients and Hessians, providing several advantages:

  • Speed: AD combined with compiled C++ is significantly faster than numerical differentiation or pure R implementations
  • Accuracy: AD provides derivatives accurate to machine precision
  • Stability: Precise derivatives improve optimization stability and convergence reliability
  • Scalability: Efficiently handles models with many parameters
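
The accuracy point can be illustrated without TMB. The sketch below is not automatic differentiation; it uses a hand-derived gradient of the Kumaraswamy log-likelihood as a proxy for the exact derivatives that AD delivers, and compares it with forward finite differences over a range of step sizes. The function names loglik, grad_exact, and grad_fd are illustrative.

```r
set.seed(42)
u <- runif(200)
y <- (1 - (1 - u)^(1 / 3))^(1 / 2)  # Kumaraswamy(alpha = 2, beta = 3) draws

loglik <- function(p) {
  a <- p[1]; b <- p[2]
  sum(log(a) + log(b) + (a - 1) * log(y) + (b - 1) * log1p(-y^a))
}

# Exact partial derivatives with respect to alpha and beta
grad_exact <- function(p) {
  a <- p[1]; b <- p[2]
  c(sum(1 / a + log(y) - (b - 1) * y^a * log(y) / (1 - y^a)),
    sum(1 / b + log1p(-y^a)))
}

# Forward finite-difference approximation with step size h
grad_fd <- function(p, h) {
  sapply(seq_along(p), function(j) {
    e <- replace(numeric(length(p)), j, h)
    (loglik(p + e) - loglik(p)) / h
  })
}

p0 <- c(1.5, 2.5)
steps <- 10^(-(2:12))
errors <- sapply(steps, function(h) max(abs(grad_fd(p0, h) - grad_exact(p0))))
print(data.frame(step = steps, max_abs_error = errors))
```

The finite-difference error depends on the step size: large steps suffer from truncation error and very small steps from floating-point cancellation, whereas an exact gradient has no step size to tune.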

Examples

Regression Modeling

Model parameters of a GKw family distribution as functions of covariates:

library(gkwreg)

# Simulate data for a Kumaraswamy regression model
set.seed(123)
n <- 100
x1 <- runif(n, -2, 2)
x2 <- rnorm(n)

# Simulate true parameters (using log link)
alpha_true <- exp(0.8 + 0.3 * x1 - 0.2 * x2) 
beta_true  <- exp(1.2 - 0.4 * x1 + 0.1 * x2)

# Generate response
y <- rkw(n, alpha = alpha_true, beta = beta_true)
y <- pmax(pmin(y, 1 - 1e-7), 1e-7)  # Ensure y in (0, 1)
df1 <- data.frame(y = y, x1 = x1, x2 = x2)

# Fit Kumaraswamy regression: alpha ~ x1 + x2, beta ~ x1 + x2
kw_model <- gkwreg(y ~ x1 + x2 | x1 + x2, data = df1, family = "kw")
summary(kw_model)

Real Data Analysis

# Food Expenditure Data
library(gkwreg)
data("FoodExpenditure", package = "betareg")
FoodExpenditure$y <- FoodExpenditure$food/FoodExpenditure$income

# Fit models from different GKw families
kkw_model <- gkwreg(y ~ income, data = FoodExpenditure, family = "kkw")
ekw_model <- gkwreg(y ~ income, data = FoodExpenditure, family = "ekw")
kw_model <- gkwreg(y ~ income, data = FoodExpenditure, family = "kw")

# Compare models
models <- list(kkw = kkw_model, ekw = ekw_model, kw = kw_model)
data.frame(
  logLik = sapply(models, function(m) as.numeric(logLik(m))),
  AIC = sapply(models, AIC),
  BIC = sapply(models, BIC)
)

# Summary
summary(kw_model)

res <- residuals(kw_model, type = "quantile")

# Visual diagnostics
plot(kw_model)

# Predicted
pred <- predict(kw_model)
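
If predictions on the response scale are read as the fitted mean (an assumption; this page does not state what predict() returns by default), the Kumaraswamy case has a closed form, $E[Y] = \beta B(1 + 1/\alpha, \beta)$. The sketch below checks that expression against numerical integration in base R; kw_mean and kw_pdf are illustrative names, not package functions.

```r
# Closed-form Kumaraswamy mean: E[Y] = beta * B(1 + 1/alpha, beta)
kw_mean <- function(alpha, beta) beta * base::beta(1 + 1 / alpha, beta)

# Kumaraswamy density, used only for the numerical check
kw_pdf <- function(y, alpha, beta) {
  alpha * beta * y^(alpha - 1) * (1 - y^alpha)^(beta - 1)
}

# Compare with the integral of y * f(y) over (0, 1) for alpha = 2, beta = 3
closed_form <- kw_mean(2, 3)
numerical <- integrate(function(y) y * kw_pdf(y, 2, 3), 0, 1)$value
print(c(closed_form = closed_form, numerical = numerical))
```

For alpha = 2 and beta = 3 the integral can also be done by hand and equals 48/105, which both numbers should reproduce.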

Distribution Fitting

Fit a GKw family distribution to univariate data (no covariates):

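# Simulate Beta data with rbeta_(). The function index describes it as a (gamma, delta + 1)
# parameterization, so gamma = 2, delta = 3 presumably corresponds to Beta(2, 4) in base R terms.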
set.seed(2203)
y_beta <- rbeta_(1000, gamma = 2, delta = 3)

# Fit Beta and Kumaraswamy distributions
fit_beta <- gkwfit(data = y_beta, family = "beta")
fit_kw <- gkwfit(data = y_beta, family = "kw")

# Compare models
summary(fit_beta)
summary(fit_kw)
AIC(fit_beta)
AIC(fit_kw)
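
The same comparison can be sketched in base R to show what sits behind the AIC values: each model's negative log-likelihood is minimized and AIC is computed as 2k - 2 logLik. This uses stats::rbeta() and optim() rather than rbeta_() and gkwfit(), and the helper names nll_beta and nll_kw are illustrative.

```r
set.seed(2203)
y <- rbeta(1000, shape1 = 2, shape2 = 3)  # base R Beta(2, 3) sample

# Negative log-likelihoods, optimized on the log scale to keep parameters > 0
nll_beta <- function(p) -sum(dbeta(y, exp(p[1]), exp(p[2]), log = TRUE))
nll_kw <- function(p) {
  a <- exp(p[1]); b <- exp(p[2])
  -sum(log(a) + log(b) + (a - 1) * log(y) + (b - 1) * log1p(-y^a))
}

fit_b <- optim(c(0, 0), nll_beta, method = "BFGS")
fit_k <- optim(c(0, 0), nll_kw, method = "BFGS")

# AIC = 2k - 2 logLik, with k = 2 parameters in both models;
# fit$value is the minimized negative log-likelihood
aic <- c(beta = 2 * 2 + 2 * fit_b$value, kw = 2 * 2 + 2 * fit_k$value)

shape_hat <- exp(fit_b$par)  # back-transformed Beta shape estimates
print(shape_hat)
print(aic)
```

Because both models have two parameters, the AIC difference here is simply twice the difference in maximized log-likelihood.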

Diagnostic Methods

The package provides several diagnostic tools for model assessment:

# Residual analysis (df1 is the simulated data frame from the Regression Modeling example)
model <- gkwreg(y ~ x1 | x2, data = df1, family = "kw")
res <- residuals(model, type = "quantile")  # Randomized quantile residuals

# Visual diagnostics
plot(model)  # QQ-plot, residuals vs. fitted, etc.

# Predictions on the response scale
pred <- predict(model, type = "response")
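
For intuition about quantile residuals, the base R sketch below computes them by hand for a Kumaraswamy model: each observation is passed through its own CDF and then through the standard normal quantile function, $r_i = \Phi^{-1}(F(y_i; \alpha_i, \beta_i))$. With a continuous response this transformation needs no randomization step. The parameter values and the helper pkw_sketch are assumptions for illustration, and the true parameters are used in place of fitted ones.

```r
set.seed(7)
n <- 2000
x <- rnorm(n)

# Hypothetical observation-specific parameters (log link)
alpha <- exp(0.5 + 0.2 * x)
beta  <- exp(1.0 - 0.3 * x)

# Kumaraswamy draws via the inverse CDF
y <- (1 - (1 - runif(n))^(1 / beta))^(1 / alpha)

# Kumaraswamy CDF, written out so the sketch does not depend on the package
pkw_sketch <- function(q, a, b) 1 - (1 - q^a)^b

# Quantile residuals: standard normal when the model is correctly specified
res <- qnorm(pkw_sketch(y, alpha, beta))
print(c(mean = mean(res), sd = sd(res)))

# A normal QQ-plot of res should then follow the reference line:
# qqnorm(res); qqline(res)
```

Systematic departures from normality in such residuals point to a misspecified family or linear predictor.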

References

  • Cordeiro, G. M., & de Castro, M. (2011). A new family of generalized distributions. Journal of Statistical Computation and Simulation, 81(7), 883-898.

  • Carrasco, J. M. F., Ferrari, S. L. P., & Cordeiro, G. M. (2010). A new generalized Kumaraswamy distribution. arXiv preprint arXiv:1004.0911.

  • Jones, M. C. (2009). Kumaraswamy’s distribution: A beta-type distribution with some tractability advantages. Statistical Methodology, 6(1), 70-81.

  • Kristensen, K., Nielsen, A., Berg, C. W., Skaug, H., & Bell, B. M. (2016). TMB: Automatic Differentiation and Laplace Approximation. Journal of Statistical Software, 70(5), 1-21.

  • Kumaraswamy, P. (1980). A generalized probability density function for double-bounded random processes. Journal of Hydrology, 46(1-2), 79-88.

  • Ferrari, S. L. P., & Cribari-Neto, F. (2004). Beta regression for modelling rates and proportions. Journal of Applied Statistics, 31(7), 799-815.

  • Cribari-Neto, F., & Zeileis, A. (2010). Beta Regression in R. Journal of Statistical Software, 34(2), 1-24.

  • Lopes, J. E. (2025). Generalized Kumaraswamy Regression Models with gkwreg. Journal of Statistical Software, forthcoming.

Comparing with Other Packages

The gkwreg package complements and extends existing approaches for modeling bounded data:

| Feature | gkwreg | betareg | gamlss | brms |
|---|---|---|---|---|
| Distribution Family | GKw hierarchy (7 distributions) | Beta | Multiple | Multiple |
| Estimation Method | MLE via TMB | MLE | MLE/GAMLSS | Bayesian |
| Parameter Modeling | All parameters | Mean, precision | All parameters | All parameters |
| Computation Speed | Fast (TMB + AD) | Fast | Moderate | Slow (MCMC) |
| Default Link | log | logit (mean) | Distribution-specific | Distribution-specific |
| Random Effects | No | No | Yes | Yes |

Contributing

Contributions to gkwreg are welcome! Please feel free to submit issues or pull requests on the GitHub repository.

License

This package is licensed under the MIT License. See the LICENSE file for details.

Author and Maintainer

Lopes, J. E. (evandeilton@gmail.com)
LEG - Laboratório de Estatística e Geoinformação
UFPR - Universidade Federal do Paraná, Brazil

Monthly Downloads

277

Version

1.0.10

License

MIT + file LICENSE

Maintainer

Lopes J. E.

Last Published

July 9th, 2025

Functions in gkwreg (1.0.10)

create_comparison_table

Create comparison table of fit statistics with expanded metrics
.calculate_distance_tests

Calculate Distance-Based Test Statistics
.calculate_diagnostic_measures

Calculate diagnostic measures for gkwreg plots
dgkw

Density of the Generalized Kumaraswamy Distribution
dekw

Density of the Exponentiated Kumaraswamy (EKw) Distribution
dkw

Density of the Kumaraswamy (Kw) Distribution
dbeta_

Density of the Beta Distribution (gamma, delta+1 Parameterization)
dkkw

Density of the Kumaraswamy-Kumaraswamy (kkw) Distribution
.calculate_model_parameters

Calculate model parameters for the specified family
.calculate_profiles

Calculate profile likelihoods
.calculate_residuals

Calculate residuals based on the specified type
dbkw

Density of the Beta-Kumaraswamy (BKw) Distribution
.calculate_moment_comparisons

Calculate Moment Comparisons
.calculate_probability_plot_metrics

Calculate Probability Plot Metrics
.calculate_prediction_metrics

Calculate Prediction Accuracy Metrics
dmc

Density of the McDonald (Mc)/Beta Power Distribution
.calculate_information_criteria

Calculate Information Criteria
.calculate_likelihood_statistics

Calculate Likelihood Statistics
.calculate_sim_residuals

Calculate residuals for simulated data
.check_and_compile_TMB_code

Check and Compile TMB Model Code with Persistent Cache
.convert_links_to_int

Convert Link Function Names to TMB Integers
.calculate_sample_moments

Calculate Sample Moments
.extract_parameter_vectors

Extract parameter vectors from parameter matrix
.extract_model_data

Extract Model Data for GKw Regression
.create_table_plot

Create Table Plot for Model Comparison
.calculate_theoretical_pdf

Calculate Theoretical PDF Values for GKw Family Distributions
.determine_start_values

Determine initial parameter values
.create_plot_titles

Create formatted plot titles
.create_radar_plot

Create Radar Plot for Model Comparison
.generate_random_samples

Generate Random Samples from GKw Family Distributions
.calculate_theoretical_quantiles

Calculate Theoretical Quantiles for GKw Family Distributions
.create_bar_plot

Create Bar Plot for Model Comparison
.get_default_fixed

Get default fixed parameters for a family
.plot_base_r_residuals_vs_index

Plot residuals vs. index (base R)
.fit_tmb

Fit GKw family distributions using TMB
.plot_ggplot_leverage_vs_fitted

Plot leverage vs. fitted (ggplot2)
.calculate_theoretical_moments

Calculate Theoretical Moments for GKw Family Distributions
.calculate_half_normal_data

Calculate half-normal plot data with envelope
.plot_base_r_residuals_vs_linpred

Plot residuals vs. linear predictor (base R)
.calculate_theoretical_cdf

Calculate Theoretical CDF Values for GKw Family Distributions
.fit_submodels

Fit submodels for comparison
.calculate_gof

Calculate goodness-of-fit statistics
.family_to_code

Convert family string to numeric code for TMB
.get_default_start

Get default start values for a family
.map_gkwreg_to_tmb_param

Map gkwreg parameter index to TMB parameter index
.plot_ggplot_predicted_vs_observed

Plot predicted vs. observed (ggplot2)
.get_family_param_info

Get family parameter information
.fit_submodels_tmb

Fit submodels for the GKw family for model comparison
.format_coefficient_names

Format Coefficient Names Based on Family and Model Matrices
.extract_model_matrices

Extract model matrices from a gkwreg object with family-specific handling
.process_formula_parts

Process Formula Parts from a Formula Object
.get_family_fixed_defaults

Get default fixed parameters for each GKw family
.plot_base_r_cooks_distance

Plot Cook's distance (base R)
.plot_base_r_half_normal

Plot half-normal plot (base R)
.plot_ggplot_residuals_vs_index

Plot residuals vs. index (ggplot2)
.plot_ggplot_residuals_vs_linpred

Plot residuals vs. linear predictor (ggplot2)
.process_link

Process Link Functions for GKw Regression
.sample_model_data

Sample model data for large datasets
.process_link_scale

Process Link Scales for GKw Regression
.extract_model_params

Extract model parameters from a gkwreg object with family-specific handling
grbeta

Gradient of the Negative Log-Likelihood for the Beta Distribution (gamma, delta+1 Parameterization)
.plot_ggplot_cooks_distance

Plot Cook's distance (ggplot2)
.generate_additional_plots

Generate Additional Diagnostic Plots Beyond Those in gkwfit
generate_report

Generate R Markdown report with analysis results
get_quantile_function

Get the quantile function for a fitted GKw distribution model
get_density_function

Get the density function for a fitted GKw distribution model
.generate_plots

Generate diagnostic plots for distribution models
fitted.gkwreg

Extract Fitted Values from a Generalized Kumaraswamy Regression Model
.plot_gkwreg_base_r

Generate diagnostic plots using base R graphics
.plot_base_r_leverage_vs_fitted

Plot leverage vs. fitted (base R)
.simulate_p_values_bootstrap

Simulate P-Values Using Parametric Bootstrap
.simulate_from_distribution

Simulate observations from a specified distribution family
grkw

Gradient of the Negative Log-Likelihood for the Kumaraswamy (Kw) Distribution
grkkw

Gradient of the Negative Log-Likelihood for the kkw Distribution
grbkw

Gradient of the Negative Log-Likelihood for the BKw Distribution
.print_gof_summary

Print Formatted Summary of Goodness-of-Fit Statistics
get_bounded_datasets

Access datasets from bounded response regression packages
.plot_ggplot_half_normal

Plot half-normal plot (ggplot2)
.plot_gkwreg_ggplot

Generate diagnostic plots using ggplot2
.plot_base_r_predicted_vs_observed

Plot predicted vs. observed (base R)
.prepare_tmb_data

Prepare TMB Data for GKw Regression
extract_gof_stats

Extract Key Statistics from gkwgof Objects
.prepare_tmb_params

Prepare TMB Parameters for GKw Regression
.process_fixed

Process Fixed Parameters for GKw Regression
.validate_parameters

Validate parameters for GKw family distributions
hsmc

Hessian Matrix of the Negative Log-Likelihood for the McDonald (Mc)/Beta Power Distribution
grmc

Gradient of the Negative Log-Likelihood for the McDonald (Mc)/Beta Power Distribution
hskw

Hessian Matrix of the Negative Log-Likelihood for the Kw Distribution
list_bounded_datasets

List all available datasets for bounded response regression
gkwfitall

Fit All or Selected Generalized Kumaraswamy Family Distributions and Compare Them
hsgkw

Hessian Matrix of the Negative Log-Likelihood for the GKw Distribution
hsbeta

Hessian Matrix of the Negative Log-Likelihood for the Beta Distribution (gamma, delta+1 Parameterization)
hsbkw

Hessian Matrix of the Negative Log-Likelihood for the BKw Distribution
.validate_data

Validate data for GKw family distributions
.validate_and_prepare_gkwreg_diagnostics

Validate inputs and prepare diagnostic data for gkwreg plots
llgkw

Negative Log-Likelihood for the Generalized Kumaraswamy Distribution
llkw

Negative Log-Likelihood of the Kumaraswamy (Kw) Distribution
hsekw

Hessian Matrix of the Negative Log-Likelihood for the EKw Distribution
pkkw

Cumulative Distribution Function (CDF) of the kkw Distribution
hskkw

Hessian Matrix of the Negative Log-Likelihood for the kkw Distribution
llkkw

Negative Log-Likelihood for the kkw Distribution
plotcompare

Compare Goodness-of-Fit Results Across Multiple Models
get_cdf_function

Get the CDF function for a fitted GKw distribution model
pbeta_

CDF of the Beta Distribution (gamma, delta+1 Parameterization)
pgkw

Generalized Kumaraswamy Distribution CDF
logLik.gkwfit

Extract Log-Likelihood from a gkwfit Object
llmc

Negative Log-Likelihood for the McDonald (Mc)/Beta Power Distribution
llbeta

Negative Log-Likelihood for the Beta Distribution (gamma, delta+1 Parameterization)
nrgkw

Enhanced Newton-Raphson Optimization for GKw Family Distributions
gkwfit

Fit Generalized Kumaraswamy Distribution via Maximum Likelihood Estimation using TMB
gkwgof

Comprehensive Goodness-of-Fit Analysis for GKw Family Distributions
%>%

Pipe operator
pkw

Cumulative Distribution Function (CDF) of the Kumaraswamy (Kw) Distribution
pmc

CDF of the McDonald (Mc)/Beta Power Distribution
qkkw

Quantile Function of the Kumaraswamy-Kumaraswamy (kkw) Distribution
qgkw

Generalized Kumaraswamy Distribution Quantile Function
gkwreg

Fit Generalized Kumaraswamy Regression Models
llbkw

Negative Log-Likelihood for Beta-Kumaraswamy (BKw) Distribution
llekw

Negative Log-Likelihood for the Exponentiated Kumaraswamy (EKw) Distribution
qekw

Quantile Function of the Exponentiated Kumaraswamy (EKw) Distribution
logLik.gkwreg

Extract Log-Likelihood from a Generalized Kumaraswamy Regression Model
qbkw

Quantile Function of the Beta-Kumaraswamy (BKw) Distribution
plot.gkwfitall

Plot method for gkwfitall objects
rkw

Random Number Generation for the Kumaraswamy (Kw) Distribution
rgkw

Generalized Kumaraswamy Distribution Random Generation
grekw

Gradient of the Negative Log-Likelihood for the EKw Distribution
plot.gkwfit

Plot Diagnostics for a gkwfit Object
grgkw

Gradient of the Negative Log-Likelihood for the GKw Distribution
pbkw

Cumulative Distribution Function (CDF) of the Beta-Kumaraswamy (BKw) Distribution
print.anova.gkwfit

S3 method for class 'anova.gkwfit'
print.summary.gkwreg

Print Method for Generalized Kumaraswamy Regression Summaries
pekw

Cumulative Distribution Function (CDF) of the EKw Distribution
predict.gkwreg

Predictions from a Fitted Generalized Kumaraswamy Regression Model
print.gkwgof

Print Method for gkwgof Objects
plot.gkwgof

Plot Method for gkwgof Objects
rkkw

Random Number Generation for the kkw Distribution
print.gkwfitall

Print method for gkwfitall objects
print.summary.gkwfitall

Print method for summary.gkwfitall objects
qkw

Quantile Function of the Kumaraswamy (Kw) Distribution
print.gkwfit

Print Method for gkwfit Objects
plot.gkwreg

Diagnostic Plots for Generalized Kumaraswamy Regression Models
print.summary.gkwgof

Print Method for summary.gkwgof Objects
summary.gkwgof

Summary Method for gkwgof Objects
qmc

Quantile Function of the McDonald (Mc)/Beta Power Distribution
qbeta_

Quantile Function of the Beta Distribution (gamma, delta+1 Parameterization)
print.summary.gkwfit

Print Method for summary.gkwfit Objects
rekw

Random Number Generation for the Exponentiated Kumaraswamy (EKw) Distribution
summary.gkwreg

Summary Method for Generalized Kumaraswamy Regression Models
rbeta_

Random Generation for the Beta Distribution (gamma, delta+1 Parameterization)
residuals.gkwreg

Extract Residuals from a Generalized Kumaraswamy Regression Model
summary.gkwfit

Summary Method for gkwfit Objects
rmc

Random Number Generation for the McDonald (Mc)/Beta Power Distribution
summary.gkwfitall

Summary method for gkwfitall objects
rbkw

Random Number Generation for the Beta-Kumaraswamy (BKw) Distribution
vcov.gkwreg

Extract Variance-Covariance Matrix from a Generalized Kumaraswamy Regression Model
vcov.gkwfit

Extract Variance-Covariance Matrix from a gkwfit Object
confint.gkwfit

Compute Confidence Intervals for gkwfit Parameters
create_comparison_plots

Create enhanced comparison plots of all fitted distributions
BIC.gkwfit

Calculate Bayesian Information Criterion (BIC) for gkwfit Objects
anova.gkwfit

Compare Fitted gkwfit Models using Likelihood Ratio Tests
AIC.gkwreg

Akaike's Information Criterion for GKw Regression Models
coef.gkwfit

Extract Model Coefficients from a gkwfit Object
calculate_fit_metrics

Calculate additional fit metrics for all models
coef.gkwreg

Extract Coefficients from a Fitted GKw Regression Model
AIC.gkwfit

Calculate AIC or BIC for gkwfit Objects
BIC.gkwreg

Bayesian Information Criterion for GKw Regression Models