
binsmooth (version 0.2.2)

stats_from_distribution: Estimate various statistics

Description

Estimates the mean, variance, standard deviation, Gini coefficient, and Theil index from a smoothed distribution.

Usage

stats_from_distribution(binFit)

Arguments

binFit

A list as returned by splinebins, stepbins, or rsubbins. (Alternatively, a list containing a PDF of non-negative support, its CDF, and an upper bound for the support of the PDF.)

Value

A vector of five statistics: the estimated mean, variance, standard deviation, Gini coefficient, and Theil index.

Details

The mean and variance are calculated from the CDF. For details on the other statistics, see gini and theil.
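As an illustration of how a mean and variance can be recovered from a CDF alone, the sketch below uses the standard identities for a non-negative variable supported on [0, E]: the mean is the integral of 1 - F(x), and the second moment is the integral of 2x(1 - F(x)). This is not necessarily how binsmooth computes them internally. The uniform CDF and the helper names mean_from_cdf and var_from_cdf are hypothetical, chosen only for this example.

```r
# Hypothetical CDF for illustration: uniform distribution on [0, E].
E <- 10
cdf <- function(x) pmin(pmax(x / E, 0), 1)

# For a non-negative variable supported on [0, E]:
#   E[X]   = integral over [0, E] of (1 - F(x)) dx
#   E[X^2] = integral over [0, E] of 2 * x * (1 - F(x)) dx
mean_from_cdf <- function(cdf, E) {
  integrate(function(x) 1 - cdf(x), lower = 0, upper = E)$value
}

var_from_cdf <- function(cdf, E) {
  mu <- mean_from_cdf(cdf, E)
  second_moment <- integrate(function(x) 2 * x * (1 - cdf(x)),
                             lower = 0, upper = E)$value
  second_moment - mu^2  # Var(X) = E[X^2] - (E[X])^2
}

mu <- mean_from_cdf(cdf, E)
sigma2 <- var_from_cdf(cdf, E)
c(mean = mu, variance = sigma2, SD = sqrt(sigma2))
```

For the uniform distribution on [0, 10], the exact values are a mean of 5 and a variance of 100/12, which gives a quick check on the numerical integration.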

References

Paul T. von Hippel, David J. Hunter, McKalie Drown. Better Estimates from Binned Income Data: Interpolated CDFs and Mean-Matching, Sociological Science, November 15, 2017. https://www.sociologicalscience.com/articles-v4-26-641/

Examples

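# NOT RUN {
library(binsmooth)

# 2005 ACS data from Cook County, Illinois
# Upper edge of each bin; NA marks the unbounded top bin
binedges <- c(10000,15000,20000,25000,30000,35000,40000,45000,
              50000,60000,75000,100000,125000,150000,200000,NA)
# Number of observations in each bin
bincounts <- c(157532,97369,102673,100888,90835,94191,87688,90481,
               79816,153581,195430,240948,155139,94527,92166,103217)
# Third argument: the mean of the distribution, which the fit is matched to
stepfit <- stepbins(binedges, bincounts, 76091)
splinefit <- splinebins(binedges, bincounts, 76091)
stats_from_distribution(stepfit)   # Statistics from the step-function fit
stats_from_distribution(splinefit) # Spline fit: more accurate
# }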
