
BayesLogit (version 0.3)

logit.combine: Collapse Data for Binomial Logistic Regression

Description

Collapse data for binomial logistic regression.

Usage

logit.combine(y, X, n=rep(1,length(y)))

Arguments

y
An N dimensional vector; $y_i$ is the average response at $x_i$.
X
An N x P dimensional design matrix; $x_i$ is the ith row.
n
An N dimensional vector; $n_i$ is the number of observations at each $x_i$.

Value

logit.combine returns a list with the following components.

  • y: The new response.
  • X: The new design matrix.
  • n: The number of samples at each revised observation.

Details

Logistic regression is a classification mechanism. Given the binary data $\{y_i\}$ and the P-dimensional predictor variables $\{x_i\}$, one wants to forecast whether a future data point $y^*$ observed at the predictor $x^*$ will be zero or one. Logistic regression stipulates that the probability of observing a success (1) rather than a failure (0) is

$$P(y^* = 1 | x^*, \beta) = (1 + \exp(-x^* \beta))^{-1}.$$
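As a quick illustration of the formula above, the following sketch evaluates the success probability for one predictor. The values of `beta` and `x_star` are made up for the example and do not come from the package.

```r
# Hypothetical coefficients and predictor (intercept plus one covariate).
beta   <- c(-1, 2)
x_star <- c(1, 0.5)

# Linear predictor x* beta, then the inverse logit from the formula above.
eta <- sum(x_star * beta)
p   <- 1 / (1 + exp(-eta))   # 0.5 here, since eta is 0

# Base R's plogis() computes the same inverse logit.
all.equal(p, plogis(eta))
```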

Instead of representing data as a collection of binary outcomes, one may record the average response $y_i$ at each unique $x_i$ given a total number of $n_i$ observations at $x_i$.
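Assuming the binary responses at a given $x_i$ are independent, the two representations carry the same information about $\beta$, because the number of successes $n_i y_i$ then follows a binomial distribution:

$$n_i y_i \mid x_i, \beta \sim \mathrm{Binomial}(n_i, p_i), \quad p_i = (1 + \exp(-x_i \beta))^{-1}.$$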

Thus, when a predictor is repeated, the corresponding responses may be collapsed into a single observation representing multiple trials. This function collapses data in this way.
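The collapsing step can be sketched in base R as follows. This is only an illustration of the idea, not the BayesLogit implementation: the helper name `collapse_binomial` is hypothetical, and it identifies repeated rows by pasting their values together, which assumes that equal rows print identically.

```r
# Hypothetical helper: a sketch of collapsing, not logit.combine itself.
collapse_binomial <- function(y, X, n = rep(1, length(y))) {
  X <- as.matrix(X)
  # Build one string key per row so identical predictor rows share a key.
  key <- apply(X, 1, paste, collapse = "\r")
  # Group index for each row, numbered in order of first appearance.
  group <- match(key, unique(key))
  # Total trials and total successes within each group.
  n_new     <- as.vector(tapply(n, group, sum))
  successes <- as.vector(tapply(n * y, group, sum))
  list(y = successes / n_new,                   # average response per unique row
       X = X[!duplicated(key), , drop = FALSE], # one copy of each unique row
       n = n_new)
}

# Five binary observations at two distinct predictor rows.
X <- cbind(1, c(0, 1, 0, 1, 1))
y <- c(1, 0, 0, 1, 1)
out <- collapse_binomial(y, X)
out
```

Here the two observations at the row (1, 0) become one observation with n = 2 and average response 1/2, and the three at (1, 1) become one with n = 3 and average response 2/3.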

See Also

logit, logit.EM, mlogit

Examples

library(BayesLogit)

## From UCI Machine Learning Repository.
data(spambase);

## A subset of the data.
sbase = spambase[seq(1,nrow(spambase),10),];

X = model.matrix(is.spam ~ word.freq.free + word.freq.1999, data=sbase);
y = sbase$is.spam;

## Actually unnecessary as logit.EM automatically tries to compress.
new.data = logit.combine(y, X)
mode.spam = logit.EM(new.data$y, new.data$X, new.data$n)
mode.spam
