bootGOF

Bootstrap-based goodness-of-fit tests for (linear) models. Assume you have fitted a statistical model, e.g. a classical linear model, a generalized linear model, or a model of the form \(Y = m(\beta^\top X) + \epsilon\). This package lets you perform a rigorous statistical test of whether the chosen model family is correct.

Example

First we generate a data set to which the package can be applied.

set.seed(1)
N <- 100
X1 <- rnorm(N)
X2 <- rnorm(N)
d <- data.frame(
  y = rpois(n = N, lambda = exp(4 + X1 * 2 + X2 * 6)),
  x1 = X1,
  x2 = X2)

Note that both covariates influence the dependent variable \(Y\). Taking only one of the covariates into account leads to a model family that is not correct, and the GOF test should reveal that:

fit <- glm(y ~ x1, data = d, family = poisson())

library(bootGOF)
mt <- GOF_model(
  model = fit,
  data = d,
  nmb_boot_samples = 100,
  simulator_type = "parametric",
  y_name = "y",
  Rn1_statistic = Rn1_KS$new())
mt$get_pvalue()
#> [1] 0
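
A p-value of exactly 0 should be read with the number of resamples in mind. Assuming the p-value is the share of bootstrap statistics that are at least as large as the observed one (a common convention; the package's exact rule is not shown in this README), 100 resamples can only resolve p-values in steps of 0.01, so 0 means "smaller than this run can resolve". A minimal base R sketch with made-up numbers, where boot_pvalue is a hypothetical helper and not part of bootGOF:

```r
# Hypothetical helper (not part of bootGOF): bootstrap p-value as the share of
# resampled statistics that are at least as large as the observed statistic.
boot_pvalue <- function(stat_obs, stat_boot) {
  mean(stat_boot >= stat_obs)
}

# Made-up statistics, for illustration only
stat_boot <- c(0.8, 1.1, 0.9, 1.4, 1.0)

boot_pvalue(stat_obs = 5.0, stat_boot = stat_boot)  # no resample reaches 5.0
#> [1] 0
boot_pvalue(stat_obs = 1.0, stat_boot = stat_boot)  # 1.1, 1.4 and 1.0 qualify
#> [1] 0.6
```

Increasing nmb_boot_samples refines the resolution at the cost of more model refits.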

On the other hand, when the correct model family is assumed, the GOF test should in general not reject it:

fit <- glm(y ~ x1 + x2, data = d, family = poisson())
mt <- GOF_model(
  model = fit,
  data = d,
  nmb_boot_samples = 100,
  simulator_type = "parametric",
  y_name = "y",
  Rn1_statistic = Rn1_KS$new())
mt$get_pvalue()
#> [1] 0.61
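
To make the two calls above more concrete, the following base R sketch shows one way a parametric bootstrap goodness-of-fit test of this kind can be put together. It is not the bootGOF implementation: the helper names rn1_ks and boot_gof_pvalue, the milder simulated data, and the exact form of the statistic and of the p-value are assumptions made for illustration, and the simulation step is hard-wired to a Poisson GLM.

```r
# Sketch only: rn1_ks and boot_gof_pvalue are hypothetical names and this is
# not the bootGOF implementation.

# KS-type statistic of a marked empirical process (assumed form): order the
# residuals by the linear predictor, take scaled partial sums, and return the
# largest absolute value.
rn1_ks <- function(fit, y) {
  res <- y - fitted(fit)
  lin_pred <- predict(fit, type = "link")
  rn1 <- cumsum(res[order(lin_pred)]) / sqrt(length(res))
  max(abs(rn1))
}

# Parametric bootstrap for a Poisson GLM: simulate new responses from the
# fitted model, refit the same formula, and recompute the statistic.
boot_gof_pvalue <- function(fit, data, y_name, n_boot = 100) {
  stat_obs <- rn1_ks(fit, data[[y_name]])
  stat_boot <- replicate(n_boot, {
    d_star <- data
    d_star[[y_name]] <- rpois(nrow(data), lambda = fitted(fit))
    fit_star <- glm(formula(fit), data = d_star, family = family(fit))
    rn1_ks(fit_star, d_star[[y_name]])
  })
  # Share of bootstrap statistics at least as large as the observed one
  mean(stat_boot >= stat_obs)
}

# Illustrative data with smaller coefficients than the example above
set.seed(1)
n <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
d_demo <- data.frame(
  y = rpois(n, lambda = exp(1 + 0.5 * x1 + 0.8 * x2)),
  x1 = x1,
  x2 = x2)

fit_small <- glm(y ~ x1, data = d_demo, family = poisson())
fit_full <- glm(y ~ x1 + x2, data = d_demo, family = poisson())

p_small <- boot_gof_pvalue(fit_small, d_demo, y_name = "y")
p_full <- boot_gof_pvalue(fit_full, d_demo, y_name = "y")
```

Printing p_small and p_full gives the two bootstrap p-values; their exact values depend on the seed and on the number of resamples.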

Installation

You can install it from CRAN

install.packages("bootGOF")

or from GitHub

devtools::install_github("MarselScheer/bootGOF")

sessionInfo

sessionInfo()
#> R version 4.0.0 (2020-04-24)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 20.04 LTS
#> 
#> Matrix products: default
#> BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.8.so
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=C             
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices datasets  utils     methods   base     
#> 
#> other attached packages:
#> [1] bootGOF_0.1.0     badgecreatr_0.2.0
#> 
#> loaded via a namespace (and not attached):
#>  [1] digest_0.6.25   R6_2.4.1        backports_1.1.8 git2r_0.27.1   
#>  [5] magrittr_1.5    evaluate_0.14   rlang_0.4.10    stringi_1.4.6  
#>  [9] renv_0.10.0     checkmate_2.0.0 rmarkdown_2.3   tools_4.0.0    
#> [13] stringr_1.4.0   xfun_0.15       yaml_2.2.1      compiler_4.0.0 
#> [17] htmltools_0.5.0 knitr_1.29

Monthly Downloads

143

Version

0.1.0

License

GPL-3

Maintainer

Marsel Scheer

Last Published

June 24th, 2021

Functions in bootGOF (0.1.0)

GOF_lm_trainer

Implements the "interface" GOF_model_trainer for linear models
GOF_model_info_extractor

R6 Class representing model information
GOF_lm_info_extractor

Implements the "interface" GOF_model_info_extractor for linear models
GOF_model_simulator

R6 Class representing a generator/resampler of the dependent variable
GOF_model_resample

R6 Class representing the resampling scheme for Goodness-of-fit-tests for (linear) models
GOF_model

Convenience function for creating a GOF-test for statistical models
GOF_glm_trainer

Implements the "interface" GOF_model_trainer for generalized linear models
GOF_glm_info_extractor

Implements the "interface" GOF_model_info_extractor for generalized linear models
GOF_lm_sim_param

Implements the "interface" GOF_model_simulator for linear models
rrademacher

Generates Rademacher distributed random variables
Rn1_KS

Kolmogorov-Smirnov-statistic for marked empirical process
GOF_glm_sim_param

Implements the "interface" GOF_model_simulator for generalized linear models
GOF_sim_wild_rademacher

Implements the "interface" GOF_model_simulator in a semi-parametric fashion
Rn1_CvM

Cramer-von-Mises-statistic for marked empirical process
Rn1_statistic

R6 Class representing statistics for marked empirical processes
GOF_model_test

R6 Class representing the Goodness-of-Fit test for (linear) models.
GOF_model_trainer

R6 Class representing a trainer for fitting models
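
Several entries in the list above (rrademacher, GOF_sim_wild_rademacher, Rn1_CvM) refer to the wild bootstrap and to a Cramer-von-Mises-type statistic. The base R sketch below illustrates these ideas for a linear model. The helper names rademacher, wild_response and rn1_cvm are hypothetical, and the form of the statistic is an assumption; none of this is taken from the package source.

```r
# Hypothetical helpers for illustration; not the bootGOF implementation.

# Rademacher variables: -1 or +1, each with probability 1/2
rademacher <- function(n) {
  sample(c(-1, 1), size = n, replace = TRUE)
}

# Wild bootstrap response: fitted value plus the residual with a random sign
wild_response <- function(fit) {
  res <- residuals(fit)
  fitted(fit) + res * rademacher(length(res))
}

# Cramer-von-Mises-type statistic (assumed form): mean of the squared partial
# sums of the residuals, ordered by the fitted values (which equal the linear
# predictor for a linear model)
rn1_cvm <- function(fit, y) {
  res <- y - fitted(fit)
  rn1 <- cumsum(res[order(fitted(fit))]) / sqrt(length(res))
  mean(rn1^2)
}

# Small illustrative data set
set.seed(1)
x <- rnorm(50)
d_lm <- data.frame(x = x, y = 1 + 2 * x + rnorm(50))
fit_lm <- lm(y ~ x, data = d_lm)

y_star <- wild_response(fit_lm)      # one resampled response vector
stat_obs <- rn1_cvm(fit_lm, d_lm$y)  # statistic on the original data
```

Unlike the parametric simulators, this resampling scheme reuses the observed residuals and only flips their signs, so it does not require choosing a distribution for the errors.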