Methods for "A Fast Ecological Inference Algorithm for the RxC case".

The following library includes a method (run_em) to solve the R×C Ecological Inference problem for the non-parametric case by using the EM algorithm with different approximation methods for the E-Step. The standard deviation of the estimated probabilities can be computed using bootstrapping (bootstrap).
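The bootstrap idea can be sketched in base R without the package. The sketch below assumes that ballot boxes are resampled with replacement and that the estimator is re-run on each resample. The names bootstrap_sd and estimate_ls are hypothetical, and estimate_ls is a plain least-squares stand-in for the EM estimator, not the method implemented by run_em.

```r
set.seed(7)

# Hypothetical synthetic data: 40 ballot boxes, 2 groups, 2 candidates.
n_b <- 40; n_g <- 2; n_c <- 2
W <- matrix(sample(30:80, n_b * n_g, replace = TRUE), n_b, n_g)
P_true <- matrix(c(0.7, 0.3,
                   0.2, 0.8), n_g, n_c, byrow = TRUE)
X <- matrix(0, n_b, n_c)
for (i in seq_len(n_b)) {
  for (k in seq_len(n_g)) {
    X[i, ] <- X[i, ] + rmultinom(1, W[i, k], P_true[k, ])[, 1]
  }
}

# Stand-in estimator (assumption): least-squares solution of W %*% P = X.
# Its entries are not constrained to [0, 1]; it only serves the illustration.
estimate_ls <- function(X, W) solve(crossprod(W), crossprod(W, X))

# Resample ballot boxes with replacement, re-estimate, and take the
# element-wise standard deviation over the resampled estimates.
bootstrap_sd <- function(X, W, estimator, n_boot = 200) {
  n <- nrow(X)
  draws <- replicate(n_boot, {
    idx <- sample.int(n, n, replace = TRUE)
    estimator(X[idx, , drop = FALSE], W[idx, , drop = FALSE])
  })
  apply(draws, c(1, 2), sd)  # draws is a g x c x n_boot array
}

sd_hat <- bootstrap_sd(X, W, estimate_ls)
print(sd_hat)
```

The result is a g × c matrix with one standard deviation per estimated probability.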

It also provides a function that generates synthetic election data (simulate_election) and real election data (chile_election_2021) from the Chilean first-round presidential election of 2021, which get_eim_chile extracts for a given electoral district.

The documentation presents the Ecological Inference problem in an election context: for a set of ballot boxes we observe (i) the votes obtained by each candidate and (ii) the number of voters in each demographic group (defined, for example, by age range or sex). See Thraves, C., Ubilla, P., Hermosilla, D. (2024): "A Fast Ecological Inference Algorithm for the R×C Case".
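As an illustration of this setting, the following base R sketch builds the two observed matrices for a toy election. It assumes that each group's votes in a ballot box follow a Multinomial distribution with group-specific probabilities. The names W, X and P, and all numeric values, are hypothetical and not taken from the package.

```r
set.seed(42)

n_b <- 4  # ballot boxes
n_g <- 3  # demographic groups
n_c <- 2  # candidates

# W[i, k]: number of voters of group k in ballot box i (observed).
W <- matrix(sample(20:60, n_b * n_g, replace = TRUE), nrow = n_b, ncol = n_g)

# P[k, j]: probability that a voter of group k chooses candidate j.
# This is the unobserved quantity that ecological inference tries to recover.
P <- matrix(c(0.8, 0.2,
              0.5, 0.5,
              0.1, 0.9), nrow = n_g, ncol = n_c, byrow = TRUE)

# X[i, j]: votes for candidate j in ballot box i (observed), obtained by
# summing one Multinomial draw per group.
X <- matrix(0L, nrow = n_b, ncol = n_c)
for (i in seq_len(n_b)) {
  for (k in seq_len(n_g)) {
    X[i, ] <- X[i, ] + rmultinom(1, size = W[i, k], prob = P[k, ])[, 1]
  }
}

print(W)
print(X)
```

Only W and X are observed. Each ballot box has as many votes as voters, so the row sums of X and W coincide.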

The methods to compute the conditional probabilities of the E-Step included in this package are the following:

  • Markov Chain Monte Carlo (mcmc): Performs MCMC to sample vote outcomes for each ballot-box consistent with the observed data. This sample is used to estimate the conditional probability of the E-Step.

  • Multivariate Normal PDF (mvn_pdf): Uses the PDF of a Multivariate Normal to approximate the conditional probability.

  • Multivariate Normal CDF (mvn_cdf): Uses the CDF of a Multivariate Normal to approximate the conditional probability.

  • Multinomial (mult): A single Multinomial is used to approximate the sum of Multinomial distributions.

  • Exact (exact): Solves the E-Step exactly using the Total Probability Law, which requires enumerating an exponential number of terms.

On average, the Multinomial method is the most efficient and precise; its precision matches that of the Exact method.
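To make the Multinomial idea concrete, here is a minimal base R sketch of an EM loop. In the E-step, the votes of all voters in a ballot box except one are approximated by a single Multinomial. The update q ∝ x · p / r used below is a simplified reading of that approximation and is an assumption, not code taken from the package. The function names e_step_mult, m_step and em_mult are hypothetical, and the sketch assumes every candidate receives at least one vote overall.

```r
set.seed(123)

# Hypothetical synthetic data: 60 ballot boxes, 2 groups, 3 candidates.
n_b <- 60; n_g <- 2; n_c <- 3
W <- matrix(sample(10:90, n_b * n_g, replace = TRUE), n_b, n_g)
P_true <- matrix(c(0.6, 0.3, 0.1,
                   0.2, 0.3, 0.5), n_g, n_c, byrow = TRUE)
X <- matrix(0, n_b, n_c)
for (i in seq_len(n_b)) {
  for (k in seq_len(n_g)) {
    X[i, ] <- X[i, ] + rmultinom(1, W[i, k], P_true[k, ])[, 1]
  }
}

# E-step: Q[i, k, j] approximates the probability that a voter of group k
# in ballot box i voted for candidate j, given the observed votes X[i, ].
e_step_mult <- function(X, W, P) {
  Q <- array(0, dim = c(nrow(X), ncol(W), ncol(X)))
  for (i in seq_len(nrow(X))) {
    for (k in seq_len(ncol(W))) {
      if (W[i, k] == 0) next
      w_rest <- W[i, ]
      w_rest[k] <- w_rest[k] - 1  # set one voter of group k aside
      # Single Multinomial for the remaining voters: weighted mean of P rows.
      r <- if (sum(w_rest) > 0) as.vector(w_rest %*% P) / sum(w_rest) else rep(1, ncol(X))
      num <- X[i, ] * P[k, ] / r
      Q[i, k, ] <- num / sum(num)
    }
  }
  Q
}

# M-step: voter-weighted average of Q over ballot boxes.
m_step <- function(W, Q) {
  P <- matrix(0, ncol(W), dim(Q)[3])
  for (k in seq_len(ncol(W))) {
    Qk <- matrix(Q[, k, ], nrow = nrow(W))
    P[k, ] <- colSums(W[, k] * Qk) / sum(W[, k])
  }
  P
}

em_mult <- function(X, W, max_iter = 500, tol = 1e-6) {
  # Start every group at the overall vote shares.
  P <- matrix(colSums(X) / sum(X), ncol(W), ncol(X), byrow = TRUE)
  for (iter in seq_len(max_iter)) {
    P_new <- m_step(W, e_step_mult(X, W, P))
    converged <- max(abs(P_new - P)) < tol
    P <- P_new
    if (converged) break
  }
  P
}

P_hat <- em_mult(X, W)
print(round(P_hat, 3))
```

Each row of P_hat is a probability vector over candidates for one group. Comparing it with P_true gives a rough sense of how well the approximation recovers the simulated probabilities.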

The documentation uses the following notation:

  • b: number of ballot boxes.
  • g: number of demographic groups.
  • c: number of candidates.
  • a: number of aggregated macro-groups.
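The relation between g groups and a macro-groups can be sketched as a column aggregation of the b × g group matrix. The example below merges adjacent groups in base R. The helper aggregate_groups and its cuts argument (the last group index of each macro-group) are hypothetical names chosen for illustration and may not match how the package represents aggregations.

```r
# Merge adjacent columns of W into macro-groups.
# cuts: increasing indices giving the last group of each macro-group;
#       the final entry must equal ncol(W).
aggregate_groups <- function(W, cuts) {
  stopifnot(cuts[length(cuts)] == ncol(W), all(diff(cuts) > 0))
  starts <- c(1, head(cuts, -1) + 1)
  sapply(seq_along(cuts), function(j) {
    rowSums(W[, starts[j]:cuts[j], drop = FALSE])
  })
}

# b = 3 ballot boxes, g = 4 groups.
W <- matrix(1:12, nrow = 3)

# a = 2 macro-groups: groups 1-2 and groups 3-4.
W_agg <- aggregate_groups(W, cuts = c(2, 4))
print(W_agg)
```

Aggregation keeps the number of voters per ballot box unchanged, so the row sums of W_agg equal those of W.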

To learn more about fastei, please consult the available vignettes:

browseVignettes("fastei")

Additionally, the full documentation can be browsed on the library website.

Installation

The package can be installed either from source, by running devtools::install_github("DanielHermosilla/ecological-inference-elections"), or from CRAN, by running install.packages("fastei"). Fortran support is required; it is usually shipped with R itself.

  • Monthly Downloads: 258
  • Version: 0.0.0.12
  • License: MIT + file LICENSE
  • Maintainer: Daniel Hermosilla
  • Last Published: January 10th, 2026

Functions in fastei (0.0.0.12)

  • waldtest: Performs a matrix-wise Wald test for two eim objects.
  • save_eim: Saves an eim object to a file.
  • bootstrap: Runs a bootstrap to estimate the standard deviation of the predicted probabilities.
  • get_agg_opt: Runs the EM algorithm over all possible group aggregations, returning the one with the highest likelihood while constraining the standard deviation of the probabilities.
  • run_em: Computes the Expectation-Maximization (EM) algorithm.
  • fastei-package: fastei: Methods for "A Fast Ecological Inference Algorithm for the R×C case".
  • get_agg_proxy: Runs the EM algorithm aggregating adjacent groups, maximizing the variability of macro-group allocation in ballot boxes.
  • eim: S3 object for the Expectation-Maximization algorithm.
  • chile_election_2021: Chilean 2021 first-round presidential election data.
  • simulate_election: Simulates an election.
  • get_eim_chile: Extracts voting and demographic data matrices for a given electoral district in Chile.