logit.EM: Logistic Regression Expectation Maximization

Description

Computes the posterior mode of the logistic regression coefficients by expectation maximization (EM).

Usage

logit.EM(y, X, n=rep(1,length(y)), tol=1e-9, max.iter=100)

Arguments

y

An N dimensional vector; $y_i$ is the average response at $x_i$.

X

An N x P dimensional design matrix; $x_i$ is the ith row.

n

An N dimensional vector; $n_i$ is the number of observations at each $x_i$.

tol

Convergence threshold at which the algorithm stops.

max.iter

Maximum number of iterations.

Value

beta

The posterior mode.

iter

The number of iterations.

Details

Logistic regression is a classification mechanism. Given binary data $\{y_i\}$ and P-dimensional predictor variables $\{x_i\}$, one wants to forecast whether a future data point $y^*$ observed at the predictor $x^*$ will be zero or one. Logistic regression stipulates that the probability of observing a success (1) rather than a failure (0) is governed by

$$ P(y^* = 1 | x^*, \beta) = (1 + \exp(-x^* \beta))^{-1}. $$
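As a minimal sketch of this formula, the success probability can be evaluated directly in R. The helper name `logistic.prob` and the toy inputs are hypothetical and are not part of this function's interface; base R's `plogis` computes the same quantity.

```r
# Hypothetical helper (not part of logit.EM): success probability
# P(y* = 1 | x*, beta) = 1 / (1 + exp(-x* beta)).
logistic.prob <- function(x.star, beta) {
  eta <- sum(x.star * beta)   # linear predictor x* beta
  1 / (1 + exp(-eta))
}

# Toy inputs: with beta = 0 the linear predictor is 0, so the probability is 0.5.
p0 <- logistic.prob(c(1, 0.5), c(0, 0))

# The same value from base R's logistic distribution function.
p1 <- plogis(sum(c(1, 0.5) * c(0, 0)))
```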

Instead of representing data as a collection of binary outcomes, one may record the average response $y_i$ at each unique $x_i$ given a total number of $n_i$ observations at $x_i$. We follow this method of encoding data.
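The encoding described above can be sketched as follows for a single predictor. The toy data and the names `y.bin` and `x.raw` are made up for illustration; only `y`, `X`, and `n` correspond to the arguments of `logit.EM`.

```r
# Made-up binary outcomes observed at two distinct predictor values.
y.bin <- c(1, 0, 0, 0, 1, 1, 0, 1)
x.raw <- c(0, 0, 0, 0, 1, 1, 1, 1)

# Average response and number of observations at each unique predictor value.
# tapply orders the groups by the sorted unique values of x.raw.
y <- as.vector(tapply(y.bin, x.raw, mean))
n <- as.vector(tapply(y.bin, x.raw, length))

# Design matrix with an intercept column, one row per unique predictor value.
X <- cbind(1, sort(unique(x.raw)))

# These would then be passed as: logit.EM(y, X, n)
```

Here `y` holds the proportions 0.25 and 0.75 and `n` holds the counts 4 and 4, so the eight binary rows collapse to two aggregated rows.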

A non-informative prior is used.
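If the non-informative prior is flat (an assumption; its exact form is not stated here), the posterior mode coincides with the maximum likelihood estimate. Under that assumption, a rough sanity check is to compare `beta` against base R's `glm`, which accepts the same average-response encoding through its `weights` argument. The toy data below are made up.

```r
# Made-up aggregated data: average response y at each x, with n observations each.
y <- c(0.25, 0.75)
n <- c(4, 4)
x <- c(0, 1)

# Maximum likelihood fit with base R; for a binomial family, a proportion
# response plus prior weights gives the number of trials per row.
fit <- glm(y ~ x, family = binomial, weights = n)
mle <- unname(coef(fit))

# With two groups and two coefficients the fit is saturated, so the
# intercept is logit(0.25) and the slope is logit(0.75) - logit(0.25).
expected <- c(qlogis(0.25), qlogis(0.75) - qlogis(0.25))
```

Whether `logit.EM` reproduces these values to within `tol` depends on the prior it actually uses, so treat any discrepancy as a prompt to check the references rather than as a bug.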

References

Nicholas G. Polson, James G. Scott, and Jesse Windle. Bayesian inference for logistic models using Polya-Gamma latent variables. http://arxiv.org/abs/1205.0310

Nicholas G. Polson and James G. Scott. Default Bayesian analysis for multi-way tables: a data-augmentation approach. http://arxiv.org/pdf/1109.4180

Examples


# NOT RUN {
## From UCI Machine Learning Repository.
data(spambase);

## A subset of the data.
sbase = spambase[seq(1,nrow(spambase),10),];

X = model.matrix(is.spam ~ word.freq.free + word.freq.1999, data=sbase);
y = sbase$is.spam;

## Run logistic regression.
output = logit.EM(y, X);

# }
