clpm: Constrained Estimation of the Linear Probability Model

Description

clpm is used to fit the linear probability model while ensuring that the predicted probabilities are in the (0,1) interval. The function can also be applied to any variable for which predictions between 0 and 1 are required.
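To see why a constraint can be useful, the following sketch uses base R only (it does not call clpm) to fit an unconstrained linear probability model with lm. The simulated data, the seed, and the covariate range are illustrative assumptions chosen to make the problem visible; they are not taken from the package.

```r
# Base R only: an ordinary least-squares linear probability model.
set.seed(1)                      # arbitrary seed, for reproducibility
n <- 200
x <- runif(n, -1, 2)             # wide covariate range, chosen to provoke the problem
p <- pmin(pmax(x, 0.02), 0.98)   # true probabilities, kept inside (0,1)
y <- rbinom(n, 1, p)             # binary response

ols <- lm(y ~ x)                 # unconstrained fit
phat <- fitted(ols)              # fitted "probabilities"

print(range(phat))                   # here the range extends beyond [0, 1]
print(sum(phat <= 0 | phat >= 1))    # number of fitted values outside (0,1)
```

Fitted values outside (0,1) cannot be interpreted as probabilities, which is the situation clpm is designed to avoid.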

Usage

clpm(formula, data, subset, na.action, weights, contrasts = NULL,
     lambda = NULL, control = clpm.control(), ...)

Value

clpm returns an object of class "clpm".

The functions summary and predict can be used to obtain and print a summary of the fit and to compute model predictions.

An object of class "clpm" is a list containing the following items:

coefficients: a named vector of coefficients.
covar: the estimated variance-covariance matrix.
residuals: the residuals, that is, the response minus the fitted values.
rank: the numeric rank of the fitted linear model.
fitted.values: the fitted values, which represent conditional means or, for a binary response, conditional probabilities.
weights: (only for weighted fits) the specified weights.
df.residuals: the residual degrees of freedom.
obj.function: the value of the minimized loss function.
gradient: the value of the gradient.
convergence: logical. The convergence status.
n.it: the number of iterations.
control: the values from clpm.control.
lambda: the lambda value applied for model estimation.
contrasts: (only where relevant) the contrasts used.
xlevels: (only where relevant) a record of the levels of the factors used in fitting.
call: the matched call.
terms: the terms object used.
model: if requested (the default), the model frame used.
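As a minimal sketch of how the items listed above might be inspected, the code below fits a model on simulated data and reads a few components with `$`. It assumes the clpm package is installed (the guard skips the example otherwise); the simulated data and the seed are illustrative assumptions, and the component names are taken from the list above.

```r
# Runs only if the clpm package is available.
if (requireNamespace("clpm", quietly = TRUE)) {
  set.seed(1)                          # arbitrary seed, for reproducibility
  x <- runif(100)
  y <- rbinom(100, 1, x)
  fit <- clpm::clpm(y ~ x)

  print(fit$coefficients)              # named vector of coefficients
  print(fit$convergence)               # logical convergence status
  print(fit$n.it)                      # number of iterations
  print(fit$lambda)                    # lambda value applied in estimation
  print(range(fit$fitted.values))      # should lie in (0,1) if the constraints hold
}
```

Checking convergence and the range of fitted.values after each fit is a quick way to confirm that the constraints were respected.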

Arguments

formula: a two-sided formula of the form y ~ x1 + x2 + ..., giving a symbolic description of the linear probability model. The response y must be a variable (binary or continuous) whose predictions are required to lie in the (0,1) interval. The model specification is exactly as in lm.
data: an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which clpm is called.
subset: an optional vector specifying a subset of observations to be used in the fitting process.
na.action: a function which indicates what should happen when the data contain NAs. See lm for details.
weights: an optional vector of weights to be used in the fitting process. Should be NULL or a numeric vector. See lm.
contrasts: an optional list. See the contrasts.arg of model.matrix.default.
lambda: a tuning constant that defines how important it is to obtain predictions in the (0,1) interval. If lambda is too small, the constraints may not be respected. On the other hand, if lambda is too large, the objective function might lose its convexity. If no value is supplied, an optimal value will be selected iteratively.
control: see clpm.control.
...: for future arguments.
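The arguments above can be combined as in lm. The following is a hedged sketch, assuming the clpm package is installed: it passes data and subset, lets lambda be selected automatically, and then refits with that same value supplied explicitly. The data frame d, the subset condition, and the seed are hypothetical choices made for illustration.

```r
# Runs only if the clpm package is available.
if (requireNamespace("clpm", quietly = TRUE)) {
  set.seed(1)                                   # arbitrary seed
  d <- data.frame(x = runif(200))               # hypothetical data frame
  d$y <- rbinom(200, 1, d$x)

  # Default: lambda = NULL, so a value is selected iteratively.
  fit <- clpm::clpm(y ~ x, data = d, subset = x > 0.1)
  print(fit$lambda)

  # Supplying lambda explicitly; here the automatically selected value is reused.
  refit <- clpm::clpm(y ~ x, data = d, subset = x > 0.1, lambda = fit$lambda)
  print(range(refit$fitted.values))
}
```

Reusing the selected lambda keeps the example away from the two failure modes described for this argument (constraints not respected when lambda is too small, loss of convexity when it is too large).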

Author

Andrea Beci <andreabeci08@gmail.com>, Paolo Frumento <paolo.frumento@unipi.it>

Details

For more details, see lm.

Examples

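library(clpm)

set.seed(1)                # arbitrary seed, added for reproducibility
x <- runif(100)            # covariate on (0,1)
y <- rbinom(100, 1, x)     # binary response with P(y = 1 | x) = x
fit <- clpm(y ~ x)         # lambda is selected automatically when not supplied

summary(fit)
predict(fit)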
