Likedist: Likelihood Distance.

Description

A general model-based measure of case influence on model fit is likelihood distance (Cook, 1977, 1986; Cook & Weisberg, 1982) defined as

$$LD_i=2[L(\hat{\mathbf{\theta}})-L(\hat{\mathbf{\theta}}_{(i)})]$$

where $\hat{\mathbf{\theta}}$ and $\hat{\mathbf{\theta}}_{(i)}$ are the $k \times 1$ vectors of parameter estimates obtained from the original sample and from the sample with case $i$ deleted, respectively, for $i = 1, \ldots, N$. The subscript $(i)$ indicates that the estimate was computed on the sample excluding case $i$. $L(\hat{\mathbf{\theta}})$ and $L(\hat{\mathbf{\theta}}_{(i)})$ are the log-likelihoods evaluated at these two sets of estimates.

Usage

Likedist(model, data, ...)

Value

Returns a vector of $LD_i$.

Arguments

model: A description of the user-specified model using the lavaan model syntax. See lavaan for more information.
data: A data frame containing the observed variables used in the model. If any variables are declared as ordered factors, this function will treat them as ordinal variables.
...: Additional arguments passed to the sem function.

Author

Massimiliano Pastore, Gianmarco Altoe'

Details

The log-likelihoods $L(\hat{\mathbf{\theta}})$ and $L(\hat{\mathbf{\theta}}_{(i)})$ are computed by the function bollen.loglik using formula 4B2 described by Bollen (1989, p. 135).
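
As a rough sketch of what this computation involves, assume the log-likelihood has the usual multivariate normal form for covariance structures. This form is an assumption made for illustration; the exact constant and scaling in Bollen's formula 4B2 are not reproduced here. With $\mathbf{S}$ the sample covariance matrix, $\mathbf{\Sigma}(\mathbf{\theta})$ the model-implied covariance matrix, and $c$ a constant that does not depend on $\mathbf{\theta}$:

$$L(\mathbf{\theta}) = c - \frac{N}{2}\left\{\log|\mathbf{\Sigma}(\mathbf{\theta})| + \mathrm{tr}\left[\mathbf{S}\,\mathbf{\Sigma}^{-1}(\mathbf{\theta})\right]\right\}$$

Under that assumption the constant cancels in the difference, and the likelihood distance reduces to

$$LD_i = N\left\{\log|\mathbf{\Sigma}(\hat{\mathbf{\theta}}_{(i)})| + \mathrm{tr}\left[\mathbf{S}\,\mathbf{\Sigma}^{-1}(\hat{\mathbf{\theta}}_{(i)})\right] - \log|\mathbf{\Sigma}(\hat{\mathbf{\theta}})| - \mathrm{tr}\left[\mathbf{S}\,\mathbf{\Sigma}^{-1}(\hat{\mathbf{\theta}})\right]\right\}$$

Both terms use the same full-sample $\mathbf{S}$ and $N$; only the parameter estimates differ.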

The likelihood distance gives the amount by which the log-likelihood of the full data changes when it is evaluated at the reduced-data estimates instead of the full-data estimates. The important point is that $L(\hat{\mathbf{\theta}}_{(i)})$ is not the log-likelihood obtained by fitting the model to the reduced data set. It is obtained by evaluating the likelihood function based on the full data set (containing all $N$ observations) at the reduced-data estimates (Schabenberger, 2005).
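
The distinction can be illustrated with a small base-R sketch. It is not the package's implementation: the helper `loglik_cov` is hypothetical, the Gaussian log-likelihood form (constant omitted) is an assumption, and a saturated covariance "model" stands in for a fitted SEM so that no model fitting is needed. The point it shows is that the full-sample `S` and `N` are reused while only the estimate of the covariance matrix changes.

```r
# Hypothetical helper: Gaussian covariance-structure log-likelihood,
# up to an additive constant. Assumed form: -(N/2) * (log|Sigma| + tr(S Sigma^-1)).
# The exact scaling used by bollen.loglik may differ.
loglik_cov <- function(N, S, Sigma) {
  log_det <- as.numeric(determinant(Sigma, logarithm = TRUE)$modulus)
  -(N / 2) * (log_det + sum(diag(S %*% solve(Sigma))))
}

set.seed(1)
X <- matrix(rnorm(200), ncol = 4)  # toy data: 50 cases, 4 variables
X[1, ] <- X[1, ] + 6               # turn case 1 into an outlier
N <- nrow(X)
S <- cov(X)                        # full-sample covariance matrix

# Saturated "model": the estimate of Sigma is the sample covariance itself.
L_full <- loglik_cov(N, S, S)

LD <- vapply(seq_len(N), function(i) {
  Sigma_i <- cov(X[-i, , drop = FALSE])  # estimate with case i deleted
  # Full-sample S and N, evaluated at the reduced-data estimate Sigma_i
  2 * (L_full - loglik_cov(N, S, Sigma_i))
}, numeric(1))

head(order(LD, decreasing = TRUE))  # cases with the largest distances
```

Because the full-data estimate maximizes the full-data log-likelihood, every value computed this way is non-negative. With a fitted SEM, the deleted-case covariance matrix in the sketch would be replaced by the model-implied covariance matrix at the reduced-data estimates.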

References

Bollen, K.A. (1989). Structural Equations with Latent Variables. New York, NY: Wiley.

Cook, R.D. (1977). Detection of influential observation in linear regression. Technometrics, 19, 15-18.

Cook, R.D. (1986). Assessment of local influence. Journal of the Royal Statistical Society B, 48, 133-169.

Cook, R.D., Weisberg, S. (1982). Residuals and influence in regression. New York, NY: Chapman & Hall.

Pek, J., MacCallum, R.C. (2011). Sensitivity Analysis in Structural Equation Models: Cases and Their Influence. Multivariate Behavioral Research, 46, 202-228.

Schabenberger, O. (2005). Mixed model influence diagnostics. In SUGI 29, Paper 189-29. SAS Institute Inc., Cary, NC.

Examples

## not run: this example takes several minutes
data("PDII")
model <- "
  F1 =~ y1+y2+y3+y4
"
# fit0 <- sem(model, data = PDII)
# LD <- Likedist(model, data = PDII)
# plot(LD, pch = 19, xlab = "observations", ylab = "Likelihood distances")

## not run: this example takes several minutes
## an example in which the deletion of a case yields a solution
## with negative estimated variances
model <- "
  F1 =~ x1+x2+x3
  F2 =~ y1+y2+y3+y4
  F3 =~ y5+y6+y7+y8
"

# fit0 <- sem(model, data = PDII)
# LD <- Likedist(model, data = PDII)
# plot(LD, pch = 19, xlab = "observations", ylab = "Likelihood distances")
