broom (version 0.4.1)

poLCA_tidiers: Tidiers for poLCA objects

Description

Tidiers for poLCA latent class regression models. Summarize the probabilities of each outcome for each variable within each class with tidy, add predictions to the data with augment, or find the log-likelihood/AIC/BIC with glance.

Usage

"tidy"(x, ...)
"augment"(x, data, ...)
"glance"(x, ...)

Arguments

x
A poLCA object
...
Extra arguments, not used
data
For augment, the original dataset used to fit the latent class model. If not given, uses manifest variables in x$y and, if applicable, covariates in x$x

Value

All tidying methods return a data.frame without rownames, whose structure depends on the method chosen.tidy returns a data frame with one row per variable-class-outcome combination, with columns:
variable
Manifest variable
class
Latent class ID, an integer
outcome
Outcome of manifest variable
estimate
Estimated class-conditional response probability
std.error
Standard error of estimated probability
augment returns a data frame with one row for each original observation, augmented with the following columns:
.class
Predicted class, using modal assignment
.probability
Posterior probability of predicted class
If the data argument is given, those columns are included in the output (only rows for which predictions could be made). Otherwise, the y element of the poLCA object, which contains the manifest variables used to fit the model, are used, along with any covariates, if present, in x.Note that while the probability of all the classes (not just the predicted modal class) can be found in the posterior element, these are not included in the augmented output, since it would result in potentially many additional columns, which augment tends to avoid.glance returns a one-row data frame with the following columns:
logLik
the data's log-likelihood under the model
AIC
the Akaike Information Criterion
BIC
the Bayesian Information Criterion
g.squared
The likelihood ratio/deviance statistic
chi.squared
The Pearson Chi-Square goodness of fit statistic for multiway tables
df
Number of parameters estimated, and therefore degrees of freedom used
df.residual
Number of residual degrees of freedom left

Examples

Run this code

if (require("poLCA", quietly = TRUE)) {
  library(poLCA)
  library(dplyr)
  
  data(values)
  f <- cbind(A, B, C, D)~1
  M1 <- poLCA(f, values, nclass = 2, verbose = FALSE)
  
  M1
  tidy(M1)
  head(augment(M1))
  glance(M1)
  
  library(ggplot2)
  
  ggplot(tidy(M1), aes(factor(class), estimate, fill = factor(outcome))) +
    geom_bar(stat = "identity", width = 1) +
    facet_wrap(~ variable)
  
  set.seed(2016)
  # compare multiple
  mods <- data_frame(nclass = 1:3) %>%
    group_by(nclass) %>%
    do(mod = poLCA(f, values, nclass = .$nclass, verbose = FALSE))
  
  # compare log-likelihood and/or AIC, BIC
  mods %>%
    glance(mod)
  
  ## Three-class model with a single covariate.
  
  data(election)
  f2a <- cbind(MORALG,CARESG,KNOWG,LEADG,DISHONG,INTELG,
               MORALB,CARESB,KNOWB,LEADB,DISHONB,INTELB)~PARTY
  nes2a <- poLCA(f2a, election, nclass = 3, nrep = 5, verbose = FALSE)
  
  td <- tidy(nes2a)
  head(td)
  
  # show 
  
  ggplot(td, aes(outcome, estimate, color = factor(class), group = class)) +
    geom_line() +
    facet_wrap(~ variable, nrow = 2) +
    theme(axis.text.x = element_text(angle = 90, hjust = 1))
  
  au <- augment(nes2a)
  head(au)
  au %>%
    count(.class)
  
  # if the original data is provided, it leads to NAs in new columns
  # for rows that weren't predicted
  au2 <- augment(nes2a, data = election)
  head(au2)
  dim(au2)
}

Run the code above in your browser using DataLab