qdap (version 1.2.0)

automated_readability_index: Readability Measures

Description

automated_readability_index - Apply Automated Readability Index to transcript(s) by zero or more grouping variable(s).

coleman_liau - Apply Coleman-Liau Index to transcript(s) by zero or more grouping variable(s).

SMOG - Apply SMOG Readability to transcript(s) by zero or more grouping variable(s).

flesch_kincaid - Apply Flesch-Kincaid Readability to transcript(s) by zero or more grouping variable(s).

fry - Apply Fry Readability to transcript(s) by zero or more grouping variable(s).

linsear_write - Apply Linsear Write Readability to transcript(s) by zero or more grouping variable(s).

Usage

automated_readability_index(text.var, grouping.var = NULL,
  rm.incomplete = FALSE, ...)

coleman_liau(text.var, grouping.var = NULL, rm.incomplete = FALSE, ...)

SMOG(text.var, grouping.var = NULL, output = "valid",
  rm.incomplete = FALSE, ...)

flesch_kincaid(text.var, grouping.var = NULL, rm.incomplete = FALSE, ...)

fry(text.var, grouping.var = NULL, rm.incomplete = FALSE,
  auto.label = TRUE, grid = FALSE, div.col = "grey85", plot = TRUE, ...)

linsear_write(text.var, grouping.var = NULL, rm.incomplete = FALSE, ...)

Arguments

text.var
The text variable.
grouping.var
The grouping variables. Default NULL generates one output for all text. Also takes a single grouping variable or a list of 1 or more grouping variables.
rm.incomplete
logical. If TRUE removes incomplete sentences from the analysis.
...
Other arguments passed to end_inc.
output
A character string indicating output type. One of "valid" (default and congruent with McLaughlin's intent) or "all".
auto.label
logical. If TRUE labels are added automatically. If FALSE the user places the labels by clicking interactively.
grid
logical. If TRUE a micro grid is displayed, similar to Fry's original depiction, though this may make visualizing more difficult.
div.col
The color of the grade level division lines.
plot
logical. If TRUE a graph is plotted corresponding to Fry's graphic representation.

Value

  • Returns a list of 2 dataframes: (1) Counts and (2) Readability. Counts holds the raw scores used to calculate the readability score and can be accessed via counts. Readability is the dataframe with the selected readability statistic by grouping variable(s) and can be accessed via scores. The fry function returns a graphic representation of readability; its scores element holds the information used for graphing but not a readability score.

Warning

Many of the indices (e.g., Automated Readability Index) are derived from word difficulty (letters per word) and sentence difficulty (words per sentence). If you have not run the sentSplit function on your data the results may not be accurate.
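
The effect described in this warning can be illustrated with a minimal base R sketch. It is not qdap's implementation: ari_rows is a hypothetical helper that applies the published Automated Readability Index formula and assumes that every element of the text vector counts as exactly one sentence, as an unsplit transcript row would.

```r
# Hypothetical helper (not part of qdap): Automated Readability Index where
# each element of `x` is ASSUMED to be one sentence.
# ARI = 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43
ari_rows <- function(x) {
  # Strip punctuation, then split on whitespace to get word tokens
  words <- unlist(strsplit(gsub("[^[:alnum:][:space:]]", "", x), "\\s+"))
  words <- words[nzchar(words)]
  n_sent <- length(x)  # one row = one sentence (the assumption)
  4.71 * sum(nchar(words)) / length(words) +
    0.5 * length(words) / n_sent - 21.43
}

# Same text, stored as one row versus one row per sentence
unsplit_text <- "The cat sat. The dog ran."
split_text   <- c("The cat sat.", "The dog ran.")

ari_rows(unsplit_text)  # words per sentence is inflated (6 instead of 3)
ari_rows(split_text)    # lower score once sentences are separated
```

Under this assumption, the unsplit version scores higher only because two sentences are counted as one, which is the kind of distortion that running sentSplit first is meant to avoid.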

References

Coleman, M., & Liau, T. L. (1975). A computer readability formula designed for machine scoring. Journal of Applied Psychology, Vol. 60, pp. 283-284.

Flesch, R. (1948). A new readability yardstick. Journal of Applied Psychology, Vol. 32(3), pp. 221-233. doi: 10.1037/h0057532.

Gunning, T. G. (2003). Building Literacy in the Content Areas. Boston: Allyn & Bacon.

McLaughlin, G. H. (1969). SMOG Grading: A New Readability Formula. Journal of Reading, Vol. 12(8), pp. 639-646.

Senter, R. J., & Smith, E. A. (1967). Automated readability index. Technical Report AMRLTR-66-220, University of Cincinnati, Cincinnati, Ohio.

Examples

AR1 <- with(rajSPLIT, automated_readability_index(dialogue, list(person, act)))
ltruncdf(AR1,, 15)
scores(AR1)
counts(AR1)
plot(AR1)
plot(counts(AR1))

AR2 <- with(rajSPLIT, automated_readability_index(dialogue, list(sex, fam.aff)))
ltruncdf(AR2,, 15)
scores(AR2)
counts(AR2)
plot(AR2)
plot(counts(AR2))

AR3 <- with(rajSPLIT, automated_readability_index(dialogue, person))
ltruncdf(AR3,, 15)
scores(AR3)
head(counts(AR3))
plot(AR3)
plot(counts(AR3))

CL1 <- with(rajSPLIT, coleman_liau(dialogue, list(person, act)))
ltruncdf(CL1, 20)
head(counts(CL1))
plot(CL1)

CL2 <- with(rajSPLIT, coleman_liau(dialogue, list(sex, fam.aff)))
ltruncdf(CL2)
plot(counts(CL2))

(SM1 <- with(rajSPLIT, SMOG(dialogue, list(person, act))))
plot(counts(SM1))
plot(SM1)

(SM2 <- with(rajSPLIT, SMOG(dialogue, list(sex, fam.aff))))

(FL1 <- with(rajSPLIT, flesch_kincaid(dialogue, list(person, act))))
plot(scores(FL1))
plot(counts(FL1))

(FL2 <- with(rajSPLIT, flesch_kincaid(dialogue, list(sex, fam.aff))))
plot(scores(FL2))
plot(counts(FL2))

FR1 <- with(rajSPLIT, fry(dialogue, list(sex, fam.aff)))
scores(FR1)
plot(scores(FR1))
counts(FR1)
plot(counts(FR1))

FR2 <- with(rajSPLIT, fry(dialogue, person))
scores(FR2)
plot(scores(FR2))
counts(FR2)
plot(counts(FR2))

FR3 <- with(pres_debates2012, fry(dialogue, list(time, person)))
colsplit2df(scores(FR3))
plot(scores(FR3), auto.label = FALSE)
counts(FR3)
plot(counts(FR3))

library(ggplot2)
ggplot(colsplit2df(counts(FR3)), aes(sent.per.100.wrds,
    syllables.per.100.wrds)) +
    geom_point(aes(fill=person), shape=21, size=3) +
    facet_grid(person~time)

LW1 <- with(rajSPLIT, linsear_write(dialogue, list(person, act)))
plot(scores(LW1))
plot(counts(LW1))

LW2 <- with(rajSPLIT, linsear_write(dialogue, list(sex, fam.aff)))
plot(scores(LW2), method="lm")
plot(counts(LW2))
