Learn R Programming

⚠️There's a newer version (1.0-4) of this package.Take me there.

stratifyR 1.0-3

The goal of stratifyR is to construct stratification boundaries using either the continuous variable in your data or a hypothetical distribution for your variable.

Installation

You can install stratifyR using:

install.packages("stratifyR")

Example

This is a basic example which shows you how to solve a common stratification problem:

library(stratifyR)
#> Loading required package: fitdistrplus
#> Loading required package: MASS
#> Loading required package: survival
#> Loading required package: zipfR
#> Loading required package: actuar
#> 
#> Attaching package: 'actuar'
#> The following objects are masked from 'package:stats':
#> 
#>     sd, var
#> The following object is masked from 'package:grDevices':
#> 
#>     cm
#> Loading required package: triangle
#> Loading required package: mc2d
#> Loading required package: mvtnorm
#> 
#> Attaching package: 'mc2d'
#> The following objects are masked from 'package:base':
#> 
#>     pmax, pmin
## basic example code using data
data(sugarcane)
Production <- sugarcane$Production
hist(Production)
res <- strata.data(Production, h = 2, n=1000)
#> The program is running, it'll take some time!
summary(res)
#> _____________________________________________
#> Optimum Strata Boundaries for h = 2 
#> Data Range: [0.42, 1570.38] with d = 1569.96
#> Best-fit Frequency Distribution:  gamma 
#> Parameter estimate(s):  
#>       shape        rate 
#> 1.520409984 0.009226811 
#> ____________________________________________________
#>  Strata     OSB   Wh       Vh   WhSh   nh    Nh   fh
#>       1  195.18 0.68  2832.16 36.219  458  9456 0.05
#>       2 1570.38 0.32 18071.08 42.939  542  4438 0.12
#>   Total         1.00 20903.24 79.158 1000 13894 0.07
#> ____________________________________________________

## basic example code using distribution
res <- strata.distr(h=2, initval=0.65, dist=68, distr = "gamma",
params = c(shape=3.8, rate=0.55), n=500, N=10000)
#> The program is running, it'll take some time!
summary(res)
#> _____________________________________________
#> Optimum Strata Boundaries for h = 2 
#> Data Range: [0.65, 68.65] with d = 68
#> Best-fit Frequency Distribution:  gamma 
#> Parameter estimate(s):  
#> shape  rate 
#>  3.80  0.55 
#> ____________________________________________________
#>  Strata   OSB   Wh    Vh  WhSh  nh    Nh   fh
#>       1  7.47 0.63  2.61 1.014 247  6279 0.04
#>       2 68.65 0.37  7.83 1.041 253  3721 0.07
#>   Total       1.00 10.44 2.055 500 10000 0.05
#> ____________________________________________________

The functions can be dynamically used to visualize the the strata boundaries, for 2 strata, over the density (or observations) of the “mag” variable from the quakes data (with purrr and ggplot2 packages loaded).

library(stratifyR)
library(ggplot2)
res <- strata.distr(h=2, initval=4, dist=2.4, distr = "lnorm", 
          params = c(meanlog=1.52681032, sdlog=0.08503554), n=300, N=1000)
#> The program is running, it'll take some time!
ggplot(data=quakes, aes(x = mag)) +
      geom_density(fill = "blue", colour = "black", alpha = 0.3) +
      geom_vline(xintercept = res$OSB, linetype = "dotted", color = "red")

Read the stratifyR vignette for a complete documentation and many more examples using 10 different distributions.

Copy Link

Version

Install

install.packages('stratifyR')

Monthly Downloads

281

Version

1.0-3

License

GPL-2

Maintainer

Karuna G. Reddy

Last Published

December 7th, 2021

Functions in stratifyR (1.0-3)

create.mat

To create and store calculated values of the objective function
strata.data

Stratification of Univariate Survey Population Using the Data
realloc

To re-allocate the stratum sample sizes (nh)
summary.strata

This method formats and outputs the final results to the R console
sugarcane

Sugarcane Farming Data in Fiji
strata.distr

Stratification of Univariate Survey Population Using the Distribution
minim.val

To identify the minimum value out of two given sets of values
mode.val

To calculate the modal value of the data
math

Mathematics Marks for First-year University Students
hies

Household Income Expenditure Survey (HIES) in Fiji
data.optim

To implement the Dynamic Programming (DP) solution procedure on the stratification problem presented in the form of a Mathematical Programming Problem (MPP)
erf

To calculate the error for a normal variable
data.alloc

To calculate the stratum sample sizes (nh) for a fixed sample size (n) directly based on the data
distr.optim

To implement the Dynamic Programming (DP) solution procedure on the stratification problem presented in the form of a Mathematical Programming Problem (MPP)
anaemia

Micronutrient data on Anaemia in Fiji
distr.alloc

To calculate the stratum sample sizes (nh) for a fixed sample size (n) based on the hypothetical distribution of the data
data.root

To calculate the objective function values
distr.root

Calculate the objective function values
get.dist

To identify the best-fit distribution of a univariate data