logitreg: Logistic regression models for assessing analogues/non-analogues

Description

Fits a logistic regression model to each level of groups to model the probability of two samples being analogues, conditional upon the dissimilarity between the two samples.

Usage

logitreg(object, groups, k = 1, ...)
## S3 method for class 'default':
logitreg(object, groups, k = 1, ...)
## S3 method for class 'analog':
logitreg(object, groups, k = 1, ...)
## S3 method for class 'logitreg':
summary(object, p = 0.9, ...)

Arguments

object

For logitreg, a full dissimilarity matrix (default method) or an object of class "analog" (analog method). For summary.logitreg, an object of class "logitreg", the result of a call to logitreg.

groups

factor (or object that can be coerced to one) containing the group membership for each sample in object.

k

numeric; the k closest analogues to use in the model fitting.

p

numeric; the probability at which to predict the dose needed, i.e. the dissimilarity at which two samples are analogues with probability p.

...

arguments passed to other methods.

Value

logitreg returns an object of class "logitreg"; a list whose components are objects returned by glm. See glm for further details on the returned objects.
The components of this list take their names from the levels of groups.
For summary.logitreg an object of class "summary.logitreg", a data frame with summary statistics of the model fits. The components of this data frame are:
In, Out

The number of analogue and non-analogue dissimilarities analysed in each group.

Est.(Dij), Std.Err

Coefficient and its standard error for dissimilarity from the logit model.

Z-value, p-value

Wald statistic and associated p-value for each logit model.

Dij(p=?), Std.Err(Dij)

The dissimilarity at which the posterior probability of two samples being analogues is equal to p, and its standard error. These are computed using dose.p.
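
The Dij(p=?) calculation can be illustrated with a small, hedged sketch. It assumes that dose.p here refers to dose.p() from the MASS package, and it uses simulated data; the names Dij, analogue and mod are illustrative and are not objects created by logitreg. The point estimate is simply the value of D that solves logit(p) = b0 + b1 * D.

```r
library(MASS)  # provides dose.p(); assumed to be the function meant above

set.seed(1)

## simulated dissimilarities and a 0/1 analogue indicator (illustrative only)
Dij <- runif(200, min = 0, max = 2)
analogue <- rbinom(200, size = 1, prob = plogis(3 - 4 * Dij))

## a single logit model of analogue status on dissimilarity
mod <- glm(analogue ~ Dij, family = binomial)

## dissimilarity at which the fitted probability equals p
p <- 0.9
dp <- dose.p(mod, cf = 1:2, p = p)

## the same point estimate by hand: solve logit(p) = b0 + b1 * D for D
b <- coef(mod)
by.hand <- unname((qlogis(p) - b[1]) / b[2])

c(dose.p = as.numeric(dp), by.hand = by.hand)

## standard error of the estimate, as returned by dose.p()
attr(dp, "SE")
```

The two point estimates should agree; the standard error is carried as the "SE" attribute of the dose.p() result.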

Details

Fits a logistic regression model to each level of groups to model the probability of two samples being analogues (i.e. in the same group), conditional upon the dissimilarity between the two samples.

This function can be seen as a way of directly modelling the probability that two sites are analogues, conditional upon dissimilarity. The same question can be addressed less directly using roc and bayesF.
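
As a rough illustration of the idea described above, the following base R sketch fits one logit model for one group. It is not the package's implementation: the data are simulated, Euclidean distance stands in for whatever dissimilarity is actually used, and the names x, grp, d, in.grp and Dij are hypothetical. It also assumes k = 1, i.e. each sample contributes its dissimilarity to the closest member of the group.

```r
## Hedged sketch of a per-group logit model, base R only
set.seed(42)

## two overlapping groups of 20 samples on two variables (simulated)
x <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
           matrix(rnorm(40, mean = 2), ncol = 2))
grp <- factor(rep(c("A", "B"), each = 20))

## full dissimilarity matrix (Euclidean, purely for illustration)
d <- as.matrix(dist(x))
diag(d) <- NA  # a sample should not count as its own closest analogue

## for group "A": dissimilarity from every sample to its closest
## member of group A (the assumed meaning of k = 1)
in.grp <- grp == "A"
Dij <- apply(d[in.grp, , drop = FALSE], 2, min, na.rm = TRUE)

## response: is the sample itself a member of group A?
mod <- glm(in.grp ~ Dij, family = binomial)
coef(mod)
```

Under these assumptions the coefficient on Dij is negative: the further a sample is from its closest member of the group, the lower the modelled probability that it belongs to that group.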

Examples

## load the package and the example data
library(analogue)
data(swapdiat, swappH, rlgh)

## merge training and test set on columns
dat <- join(swapdiat, rlgh, verbose = TRUE)

## extract the merged data sets and convert to proportions
swapdiat <- dat[[1]] / 100
rlgh <- dat[[2]] / 100

## fit an analogue matching (AM) model using the squared chord distance
## measure - need to keep the training set dissimilarities
swap.ana <- analog(swapdiat, rlgh, method = "SQchord",
                   keep.train = TRUE)

## generate a grouping for the SWAP lakes by hierarchical clustering
## of the training set dissimilarities from the AM results
## ("ward" was renamed "ward.D" in R >= 3.1.0)
clust <- hclust(as.dist(swap.ana$train), method = "ward.D")
grps <- cutree(clust, 6)

## fit the logit models to the analog object
swap.lrm <- logitreg(swap.ana, grps)
swap.lrm

## summary statistics
summary(swap.lrm)

## plot the fitted logit curves
plot(swap.lrm, conf.type = "polygon")

## extract fitted posterior probabilities for training samples
## for the individual groups
fit <- fitted(swap.lrm)
head(fit)

## compute posterior probabilities of analogue-ness for the rlgh
## samples. Here we take the dissimilarities between fossil and
## training samples from the `swap.ana` object rather than
## recomputing them
pred <- predict(swap.lrm, newdata = swap.ana$analogs)
head(pred)
