binsmooth (version 0.2.2)

simcounty: Simulate data to mimic county_bins and county_true

Description

Samples from a selection of distributions (Gamma, Lognormal, Weibull, Triangle) to simulate income data in the same format as the American Community Survey datasets county_bins and county_true.
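
As a rough illustration of what sampling incomes from these four families can look like, here is a minimal base-R sketch. It is not the package's implementation: the helper rtriangle, the sample size, and every parameter value are hypothetical choices made only for this example.

```r
set.seed(1)  # fixed seed so the sketch is reproducible
n <- 1000    # hypothetical number of households

# Inverse-CDF sampler for a triangular distribution on [a, b] with mode m
# (base R has no built-in triangular sampler; this helper is hypothetical).
rtriangle <- function(n, a, b, m) {
  u <- runif(n)
  cut_point <- (m - a) / (b - a)  # CDF value at the mode
  ifelse(u < cut_point,
         a + sqrt(u * (b - a) * (m - a)),
         b - sqrt((1 - u) * (b - a) * (b - m)))
}

# All parameter values below are illustrative assumptions.
incomes <- list(
  gamma     = rgamma(n, shape = 2, scale = 30000),
  lognormal = rlnorm(n, meanlog = 10.8, sdlog = 0.7),
  weibull   = rweibull(n, shape = 1.5, scale = 60000),
  triangle  = rtriangle(n, a = 0, b = 250000, m = 40000)
)

# Compare the sample means of the four simulated income vectors.
sapply(incomes, mean)
```

Each element of incomes is a raw (unbinned) income vector; simcounty returns the binned counts and the summary statistics rather than the raw draws.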

Usage

simcounty(numCounties, minPop = 1000, maxPop = 100000,
          bin_minimums = c(0, 10000, 15000, 20000, 25000, 30000, 35000, 40000, 45000,
                           50000, 60000, 75000, 100000, 125000, 150000, 200000))

Arguments

numCounties

The number of counties to simulate

minPop

Minimum population to sample (default = 1000)

maxPop

Maximum population to sample (default = 100000)

bin_minimums

Lower edges (minimums) of the income bins. Defaults to the bin edges used in the Census data.
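
To make the role of bin_minimums concrete, the following base-R sketch bins one simulated income vector using the default edges from the Usage section. It is an assumption-laden illustration, not the function's internals: the Gamma parameters, the column name bin_min, and the convention that each bin_max is one less than the next minimum (with NA for the open-ended top bin) are all choices made for this example.

```r
set.seed(42)
bin_minimums <- c(0, 10000, 15000, 20000, 25000, 30000, 35000, 40000, 45000,
                  50000, 60000, 75000, 100000, 125000, 150000, 200000)

# Hypothetical raw incomes for one county (Gamma parameters are illustrative).
income <- rgamma(5000, shape = 2, scale = 30000)

# findInterval() returns, for each income, the index of the last bin minimum
# that is <= that income, i.e. the bin the household falls into.
bin_index <- findInterval(income, bin_minimums)

# Count households per bin, keeping bins that happen to be empty.
households <- tabulate(bin_index, nbins = length(bin_minimums))

# Assumed convention: upper edge is one less than the next bin's minimum,
# and the top bin is unbounded (NA).
bin_max <- c(bin_minimums[-1] - 1, NA)

binned <- data.frame(bin_min = bin_minimums, bin_max = bin_max,
                     households = households)
binned
```

Passing a shorter or coarser vector as bin_minimums would produce fewer, wider bins in the same way.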

Value

Returns a list of two data frames:

county_bins

Simulated binned income data

county_true

Summary statistics computed from the raw (unbinned) simulated data
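
For intuition about what statistics computed from raw data can mean here, the sketch below derives a mean, median, Gini coefficient, and Theil index from an unbinned sample in base R. Only mean_true appears in the examples on this page; the other column names, the helper functions gini_raw and theil_raw, and the lognormal parameters are hypothetical and may not match the columns that simcounty actually returns.

```r
set.seed(7)
income <- rlnorm(2000, meanlog = 10.8, sdlog = 0.7)  # illustrative parameters

# Gini coefficient from raw data via the sorted-rank formula.
gini_raw <- function(x) {
  x <- sort(x)
  n <- length(x)
  2 * sum(seq_len(n) * x) / (n * sum(x)) - (n + 1) / n
}

# Theil T index from raw data (requires strictly positive values).
theil_raw <- function(x) {
  r <- x / mean(x)
  mean(r * log(r))
}

# Column names other than mean_true are assumptions made for this sketch.
true_stats <- data.frame(mean_true   = mean(income),
                         median_true = median(income),
                         gini_true   = gini_raw(income),
                         theil_true  = theil_raw(income))
true_stats
```

Values like these serve as the benchmark against which estimates recovered from the binned counts can be compared.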

Details

Each county's name indicates which distribution was sampled to simulate it.

References

Paul T. von Hippel, David J. Hunter, McKalie Drown. Better Estimates from Binned Income Data: Interpolated CDFs and Mean-Matching, Sociological Science, November 15, 2017. https://www.sociologicalscience.com/articles-v4-26-641/

See Also

county_bins, county_true

Examples

# NOT RUN {
## Simulate five counties and extract the binned and true data frames
l1 <- simcounty(5)
cb <- l1$county_bins
ct <- l1$county_true
## Fit a spline to one county and a step function to another
sbl <- splinebins(cb$bin_max[cb$fips==103], cb$households[cb$fips==103],
                  ct$mean_true[ct$fips==103])
stl <- stepbins(cb$bin_max[cb$fips==105], cb$households[cb$fips==105],
                ct$mean_true[ct$fips==105])
## Plot the fitted densities
plot(sbl$splinePDF, 0, 300000, n=500)
plot(stl$stepPDF, do.points=FALSE, main=cb$county[cb$fips==105][1])

## Simulate one county and estimate the Gini and Theil indices from binned data
l2 <- simcounty(1)
binedges <- l2$county_bins$bin_max + 0.5 # continuity correction
bincounts <- l2$county_bins$households
splinefit <- splinebins(binedges, bincounts, l2$county_true$mean_true)
gini(splinefit)
theil(splinefit)
l2$county_true
# }