GlmSimulatoR v0.1.0

Creates Ideal Data for Generalized Linear Models

Have you ever struggled to find "good data" for a generalized linear model? Would you like to test how quickly statistics converge to parameters, or learn how picking different link functions affects model performance? This package creates ideal data for both common and novel generalized linear models so your questions can be empirically answered.

Readme

GlmSimulatoR

Often the first problem in understanding the generalized linear model in a practical way is finding good data. Common problems are too few rows, a response variable that does not follow a family in the glm framework, or messy data that needs a lot of work before statistical analysis can begin. This package alleviates all of these problems by letting you create the data you want. With data in hand, you can empirically answer any question you have.

The goal of this package is to strike a balance between mathematical flexibility and simplicity of use. You can control the sample size, link function, number of unrelated variables, and dispersion for continuous distributions. Default values are carefully chosen so data can be generated without thinking about mathematical connections between weights, links, and distributions.
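
To make the controls described above concrete, here is a minimal base R sketch of how such data could be generated by hand. It is not the package's implementation: the helper `make_gaussian_data` and all of its arguments are hypothetical, and the uniform predictors on [1, 2], the intercept of 3, and the fixed identity link are assumptions made only for this illustration.

```r
# Hypothetical helper, NOT part of GlmSimulatoR: simulates Gaussian data
# whose mean is a linear function of the predictors (identity link).
make_gaussian_data <- function(n = 10000, weights = c(1, 2, 3),
                               unrelated = 0, sd = 1, intercept = 3) {
  p <- length(weights)
  # Assumption: predictors are drawn uniformly on [1, 2]
  X <- matrix(runif(n * p, min = 1, max = 2), nrow = n, ncol = p)
  colnames(X) <- paste0("X", seq_len(p))
  # Identity link: the mean equals the linear predictor
  mu <- intercept + as.vector(X %*% weights)
  out <- data.frame(Y = rnorm(n, mean = mu, sd = sd), X)
  # Unrelated columns are generated the same way but never enter the mean
  if (unrelated > 0) {
    U <- matrix(runif(n * unrelated, min = 1, max = 2), nrow = n)
    colnames(U) <- paste0("Unrelated", seq_len(unrelated))
    out <- cbind(out, U)
  }
  out
}

set.seed(1)
sim <- make_gaussian_data(n = 5000, unrelated = 2)
fit <- glm(Y ~ ., data = sim, family = gaussian(link = "identity"))
round(coef(fit), 2)
```

In this sketch the estimates for X1, X2, and X3 should land near the chosen weights, while the estimates for the unrelated columns should land near zero.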

Installation

You can install the released version of GlmSimulatoR from CRAN with:

# Available on CRAN (published 2019-08-12)
install.packages("GlmSimulatoR")

And the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("gmcmacran/GlmSimulatoR")

Example

library(GlmSimulatoR)

#Do glm and lm estimate the same weights? Yes
set.seed(1)
simdata <- simulate_gaussian() #GlmSimulatoR function
linearModel <- lm(Y ~ X1 + X2 + X3, data = simdata)
glmModel <- glm(Y ~ X1 + X2 + X3, data = simdata, family = gaussian(link = "identity"))
summary(linearModel)
#> 
#> Call:
#> lm(formula = Y ~ X1 + X2 + X3, data = simdata)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -3.6961 -0.6711  0.0049  0.6534  3.6232 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  3.06105    0.08961   34.16   <2e-16 ***
#> X1           0.99941    0.03428   29.15   <2e-16 ***
#> X2           1.98930    0.03456   57.56   <2e-16 ***
#> X3           2.98383    0.03471   85.97   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.9976 on 9996 degrees of freedom
#> Multiple R-squared:  0.5377, Adjusted R-squared:  0.5375 
#> F-statistic:  3875 on 3 and 9996 DF,  p-value: < 2.2e-16
summary(glmModel)
#> 
#> Call:
#> glm(formula = Y ~ X1 + X2 + X3, family = gaussian(link = "identity"), 
#>     data = simdata)
#> 
#> Deviance Residuals: 
#>     Min       1Q   Median       3Q      Max  
#> -3.6961  -0.6711   0.0049   0.6534   3.6232  
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  3.06105    0.08961   34.16   <2e-16 ***
#> X1           0.99941    0.03428   29.15   <2e-16 ***
#> X2           1.98930    0.03456   57.56   <2e-16 ***
#> X3           2.98383    0.03471   85.97   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> (Dispersion parameter for gaussian family taken to be 0.9952888)
#> 
#>     Null deviance: 21518.1  on 9999  degrees of freedom
#> Residual deviance:  9948.9  on 9996  degrees of freedom
#> AIC: 28338
#> 
#> Number of Fisher Scoring iterations: 2
rm(linearModel, glmModel, simdata)
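
The comparison above is made by reading the two summaries side by side. It can also be checked programmatically. The sketch below uses only base R and a hand-made stand-in data set, so it runs without the package. The data-generating formula is an assumption chosen to resemble the example, not the package's own code.

```r
set.seed(1)
n <- 1000

# Hypothetical stand-in data; column names mirror the example above
d <- data.frame(X1 = runif(n, 1, 2), X2 = runif(n, 1, 2), X3 = runif(n, 1, 2))
d$Y <- 3 + d$X1 + 2 * d$X2 + 3 * d$X3 + rnorm(n)

lm_fit <- lm(Y ~ X1 + X2 + X3, data = d)
glm_fit <- glm(Y ~ X1 + X2 + X3, data = d, family = gaussian(link = "identity"))

# TRUE when the two sets of estimates agree within all.equal's default tolerance
same_coefs <- isTRUE(all.equal(coef(lm_fit), coef(glm_fit)))
same_coefs
```

Comparing with `all.equal` rather than `identical` allows for small floating-point differences between the two fitting routines.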

See Vignettes For More Examples

Functions in GlmSimulatoR

Name               Description
simulate_gaussian  Create ideal data for a generalized linear model.
%>%                Pipe operator

Vignettes of GlmSimulatoR

Name
dealing_with_right_skewed_data.Rmd
exploring_links_for_the_gaussian_distribution.Rmd
forward_stepwise_search.Rmd
introduction.Rmd

Details

Type: Package
License: GPL-3
Encoding: UTF-8
LazyData: true
RoxygenNote: 6.1.1
VignetteBuilder: knitr
NeedsCompilation: no
Packaged: 2019-08-09 02:01:15 UTC; gmcma
Repository: CRAN
Date/Publication: 2019-08-12 07:20:02 UTC
