koRpus (version 0.13-8)

readability: Measure readability

Description

These methods calculate several readability indices.

Usage

readability(txt.file, ...)

# S4 method for kRp.text
readability(
  txt.file,
  hyphen = NULL,
  index = c("ARI", "Bormuth", "Coleman", "Coleman.Liau", "Dale.Chall",
    "Danielson.Bryan", "Dickes.Steiwer", "DRP", "ELF",
    "Farr.Jenkins.Paterson", "Flesch", "Flesch.Kincaid", "FOG", "FORCAST",
    "Fucks", "Gutierrez", "Harris.Jacobson", "Linsear.Write", "LIX", "nWS",
    "RIX", "SMOG", "Spache", "Strain", "Traenkle.Bailer", "TRI", "Tuldava",
    "Wheeler.Smith"),
  parameters = list(),
  word.lists = list(Bormuth = NULL, Dale.Chall = NULL,
    Harris.Jacobson = NULL, Spache = NULL),
  fileEncoding = "UTF-8",
  sentc.tag = "sentc",
  nonword.class = "nonpunct",
  nonword.tag = c(),
  quiet = FALSE,
  keep.input = NULL,
  as.feature = FALSE
)

# S4 method for missing
readability(txt.file, index)

# S4 method for kRp.readability,ANY,ANY,ANY
[(x, i)

# S4 method for kRp.readability
[[(x, i)

Arguments

txt.file

An object of class kRp.text.

...

Additional arguments for the generics.

hyphen

An object of class kRp.hyphen. If NULL, the text will be hyphenated automatically. All syllable handling will be skipped automatically if it's not needed for the selected indices.

index

A character vector, indicating which indices should actually be computed. If set to "all", then all available indices will be tried (meaning all variations of all measures). If set to "fast", a subset of the default values is used that is known to compute fast (currently, this only excludes "FOG"). You can also set it to "validation" to get information on the current status of validation.
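For example, a minimal sketch restricting the analysis to two indices (assuming tokenized.obj is a tokenized kRp.text object like the one created in the Examples below):

# compute only these two indices instead of the full default set
rdb.two <- readability(tokenized.obj, index=c("Flesch.Kincaid", "SMOG"))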

parameters

A list with named magic numbers, defining the relevant parameters for each index. If none are given, the default values are used.
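For example, to use the German parametrization of the Flesch formula described under Details (a sketch, again assuming tokenized.obj as in the Examples below):

# switch the Flesch index to Amstad's German parameters
readability(tokenized.obj, index="Flesch", parameters=list(Flesch="de"))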

word.lists

A named list providing the word lists for indices which need one. If NULL or missing, those indices will be skipped and a warning is given. Actual word lists can be provided as either a vector (or a matrix or data.frame with a single column), or as a file name, in which case the file must contain one word per line. Alternatively, you can directly provide the number of words which are not on the list.
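A sketch of the accepted forms, using a hypothetical list excerpt and file name:

# word list given as a character vector (hypothetical excerpt)
readability(tokenized.obj, index="Dale.Chall",
  word.lists=list(Dale.Chall=c("a", "about", "above")))
# word list given as a file containing one word per line (hypothetical path)
readability(tokenized.obj, index="Dale.Chall",
  word.lists=list(Dale.Chall="dale_chall_3000.txt"))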

fileEncoding

A character string defining the character encoding of the word.lists in case they are provided as files, like "Latin1" or "UTF-8".

sentc.tag

A character vector with POS tags which indicate a sentence ending. The default value "sentc" has special meaning and will cause the result of kRp.POS.tags(lang, tags="sentc", list.tags=TRUE) to be used.

nonword.class

A character vector with word classes which should be ignored for readability analysis. The default value "nonpunct" has special meaning and will cause the result of kRp.POS.tags(lang, tags=c("punct","sentc"), list.classes=TRUE) to be used. Will only be of consequence if hyphen is not set!

nonword.tag

A character vector with POS tags which should be ignored for readability analysis. Will only be of consequence if hyphen is not set!

quiet

Logical. If FALSE, short status messages will be shown. Setting this to TRUE will also suppress all potential warnings regarding the validation status of measures.

keep.input

Logical. If FALSE, neither the object provided by (or generated from) txt.file nor hyphen will be kept in the output object. By default (NULL) they are kept if the input was not already of the needed object class (e.g., kRp.text) or missing, to allow for re-use without the need to tag or hyphenate the text again. If TRUE, they are always kept. In cases where you want smaller object sizes, set this to FALSE to always drop these slots.

as.feature

Logical, whether the output should be just the analysis results or the input object with the results added as a feature. Use corpusReadability to get the results from such an aggregated object.

x

An object of class kRp.readability.

i

Defines the row selector ([) or the name to match ([[).

Value

Depending on as.feature, either an object of class kRp.readability, or an object of class kRp.text with the added feature readability containing it.

Details

In the following formulae, \(W\) stands for the number of words, \(St\) for the number of sentences, \(C\) for the number of characters (usually meaning letters), \(Sy\) for the number of syllables, \(W_{3Sy}\) for the number of words with at least three syllables, \(W_{<3Sy}\) for the number of words with less than three syllables, \(W^{1Sy}\) for words with exactly one syllable, \(W_{2Sy}\) for the number of words with at least two syllables, \(W_{6C}\) for the number of words with at least six letters, \(W_{7C}\) for the number of words with at least seven letters, and \(W_{-WL}\) for the number of words which are not on a certain word list (explained where needed).

"ARI":

Automated Readability Index: $$ARI = 0.5 \times \frac{W}{St} + 4.71 \times \frac{C}{W} - 21.43$$ If parameters is set to ARI="NRI", the revised parameters from the Navy Readability Indexes are used: $$ARI_{NRI} = 0.4 \times \frac{W}{St} + 6 \times \frac{C}{W} - 27.4$$ If parameters is set to ARI="simple", the simplified formula is calculated: $$ARI_{simple} = \frac{W}{St} + 9 \times \frac{C}{W}$$

Wrapper function: ARI
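As a quick plausibility check, the default formula is easy to reproduce by hand; a sketch with made-up counts:

W <- 100; St <- 8; C <- 480   # made-up counts: words, sentences, characters
0.5 * (W / St) + 4.71 * (C / W) - 21.43
# [1] 7.428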

"Bormuth":

Bormuth Mean Cloze & Grade Placement: $$ B_{MC} = 0.886593 - \left( 0.08364 \times \frac{C}{W} \right) + 0.161911 \times \left(\frac{W_{-WL}}{W} \right)^3 $$ $$ - 0.21401 \times \left(\frac{W}{St} \right) + 0.000577 \times \left(\frac{W}{St} \right)^2 $$ $$ - 0.000005 \times \left(\frac{W}{St} \right)^3 $$ Note: This index needs the long Dale-Chall list of 3000 familiar (English) words to compute \(W_{-WL}\). That is, you must have a copy of this word list and provide it via the word.lists=list(Bormuth=<your.list>) parameter! $$ B_{GP} = 4.275 + 12.881 \times B_{MC} - (34.934 \times B_{MC}^2) + (20.388 \times B_{MC}^3) $$ $$ + (26.194 \times C_{CS}) - (2.046 \times C_{CS}^2) - (11.767 \times C_{CS}^3) - (44.285 \times B_{MC} \times C_{CS}) $$ $$ + (97.620 \times (B_{MC} \times C_{CS})^2) - (59.538 \times (B_{MC} \times C_{CS})^3)$$ Where \(C_{CS}\) represents the cloze criterion score (35% by default).

Wrapper function: bormuth

"Coleman":

Coleman's Readability Formulas: $$C_1 = 1.29 \times \left( \frac{100 \times W^{1Sy}}{W} \right) - 38.45$$ $$C_2 = 1.16 \times \left( \frac{100 \times W^{1Sy}}{W} \right) + 1.48 \times \left( \frac{100 \times St}{W} \right) - 37.95$$ $$C_3 = 1.07 \times \left( \frac{100 \times W^{1Sy}}{W} \right) + 1.18 \times \left( \frac{100 \times St}{W} \right) + 0.76 \times \left( \frac{100 \times W_{pron}}{W} \right) - 34.02$$ $$C_4 = 1.04 \times \left( \frac{100 \times W^{1Sy}}{W} \right) + 1.06 \times \left( \frac{100 \times St}{W} \right) \\ + 0.56 \times \left( \frac{100 \times W_{pron}}{W} \right) - 0.36 \times \left( \frac{100 \times W_{prep}}{W} \right) - 26.01$$ Where \(W_{pron}\) is the number of pronouns, and \(W_{prep}\) the number of prepositions.

Wrapper function: coleman

"Coleman.Liau":

First estimates cloze percentage, then calculates grade equivalent: $$CL_{ECP} = 141.8401 - 0.214590 \times \frac{100 \times C}{W} + 1.079812 \times \frac{100 \times St}{W}$$ $$CL_{grade} = -27.4004 \times \frac{CL_{ECP}}{100} + 23.06395$$ The short form is also calculated: $$CL_{short} = 5.88 \times \frac{C}{W} - 29.6 \times \frac{St}{W} - 15.8$$

Wrapper function: coleman.liau

"Dale.Chall":

New Dale-Chall Readability Formula. By default the revised formula (1995) is calculated: $$DC_{new} = 64 - 0.95 \times \frac{100 \times W_{-WL}}{W} - 0.69 \times \frac{W}{St} $$ This will result in a cloze score which is then looked up in a grading table. If parameters is set to Dale.Chall="old", the original formula (1948) is used: $$DC_{old} = 0.1579 \times \frac{100 \times W_{-WL}}{W} + 0.0496 \times \frac{W}{St} + 3.6365 $$ If parameters is set to Dale.Chall="PSK", the revised parameters by Powers-Sumner-Kearl (1958) are used: $$DC_{PSK} = 0.1155 \times \frac{100 \times W_{-WL}}{W} + 0.0596 \times \frac{W}{St} + 3.2672 $$ Note: This index needs the long Dale-Chall list of 3000 familiar (English) words to compute \(W_{-WL}\). That is, you must have a copy of this word list and provide it via the word.lists=list(Dale.Chall=<your.list>) parameter!

Wrapper function: dale.chall

"Danielson.Bryan":

$$DB_1 = \left( 1.0364 \times \frac{C}{Bl} \right) + \left( 0.0194 \times \frac{C}{St} \right) - 0.6059$$ $$DB_2 = 131.059 - \left( 10.364 \times \frac{C}{Bl} \right) - \left( 0.194 \times \frac{C}{St} \right)$$ Where \(Bl\) means blanks between words, which is not really counted in this implementation, but estimated by \(words - 1\). \(C\) is interpreted as literally all characters.

Wrapper function: danielson.bryan

"Dickes.Steiwer":

Dickes-Steiwer Handformel: $$DS = 235.95993 - \left( 73.021 \times \frac{C}{W} \right) - \left(12.56438 \times \frac{W}{St} \right) - \left(50.03293 \times TTR \right)$$ Where \(TTR\) refers to the type-token ratio, which will be calculated case-insensitive by default.

Wrapper function: dickes.steiwer

"DRP":

Degrees of Reading Power. Uses the Bormuth Mean Cloze Score: $$DRP = (1 - B_{MC}) \times 100$$ This formula itself has no parameters. Note: The Bormuth index needs the long Dale-Chall list of 3000 familiar (English) words to compute \(W_{-WL}\). That is, you must have a copy of this word list and provide it via the word.lists=list(Bormuth=<your.list>) parameter!

Wrapper function: DRP

"ELF":

Fang's Easy Listening Formula: $$ELF = \frac{W_{2Sy}}{St}$$

Wrapper function: ELF

"Farr.Jenkins.Paterson":

A simplified version of Flesch Reading Ease: $$FJP = -31.517 - 1.015 \times \frac{W}{St} + 1.599 \times \frac{W^{1Sy}}{W}$$ If parameters is set to Farr.Jenkins.Paterson="PSK", the revised parameters by Powers-Sumner-Kearl (1958) are used: $$FJP_{PSK} = 8.4335 + 0.0923 \times \frac{W}{St} - 0.0648 \times \frac{W^{1Sy}}{W}$$

Wrapper function: farr.jenkins.paterson

"Flesch":

Flesch Reading Ease: $$F_{EN} = 206.835 - 1.015 \times \frac{W}{St} - 84.6 \times \frac{Sy}{W}$$ Certain internationalisations of the parameters are also implemented. They can be used by setting the Flesch parameter to one of the following language abbreviations.

"de" (Amstad's Verst<U+00E4>ndlichkeitsindex): $$F_{DE} = 180 - \frac{W}{St} - 58.5 \times \frac{Sy}{W}$$ "es" (Fernandez-Huerta): $$F_{ES} = 206.835 - 1.02 \times \frac{W}{St} - 60 \times \frac{Sy}{W}$$ "es-s" (Szigriszt): $$F_{ES S} = 206.835 - \frac{W}{St} - 62.3 \times \frac{Sy}{W}$$ "nl" (Douma): $$F_{NL} = 206.835 - 0.93 \times \frac{W}{St} - 77 \times \frac{Sy}{W}$$ "nl-b" (Brouwer Leesindex): $$F_{NL B} = 195 - 2 \times \frac{W}{St} - 67 \times \frac{Sy}{W}$$ "fr" (Kandel-Moles): $$F_{FR} = 209 - 1.15 \times \frac{W}{St} - 68 \times \frac{Sy}{W}$$ If parameters is set to Flesch="PSK", the revised parameters by Powers-Sumner-Kearl (1958) are used to calculate a grade level: $$F_{PSK} = 0.0778 \times \frac{W}{St} + 4.55 \times \frac{Sy}{W} - 2.2029$$

Wrapper function: flesch
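The base formula is likewise easy to verify by hand; a sketch with made-up counts:

W <- 120; St <- 8; Sy <- 180   # made-up counts: words, sentences, syllables
206.835 - 1.015 * (W / St) - 84.6 * (Sy / W)
# [1] 64.71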

"Flesch.Kincaid":

Flesch-Kincaid Grade Level: $$FK = 0.39 \times \frac{W}{St} + 11.8 \times \frac{Sy}{W} - 15.59$$

Wrapper function: flesch.kincaid

"FOG":

Gunning Frequency of Gobbledygook: $$FOG = 0.4 \times \left( \frac{W}{St} + \frac{100 \times W_{3Sy}}{W} \right)$$ If parameters is set to FOG="PSK", the revised parameters by Powers-Sumner-Kearl (1958) are used: $$FOG_{PSK} = 3.0680 + \left( 0.0877 \times \frac{W}{St} \right) + \left(0.0984 \times \frac{100 \times W_{3Sy}}{W} \right)$$ If parameters is set to FOG="NRI", the new FOG count from the Navy Readability Indexes is used: $$FOG_{new} = \frac{\frac{W_{<3Sy} + (3 \times W_{3Sy})}{\frac{100 \times St}{W}} - 3}{2}$$ If the text was POS-tagged accordingly, proper nouns and combinations of only easy words will not be counted as hard words, and the syllables of verbs ending in "-ed", "-es" or "-ing" will be counted without these suffixes.

Due to the need to re-hyphenate combined words after splitting them up, this formula takes considerably longer to compute than most others. It will be omitted if you set index="fast" instead of the default.

Wrapper function: FOG
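The default formula can be reproduced by hand; a sketch with made-up counts:

W <- 100; St <- 5; W3Sy <- 12   # made-up counts; W3Sy: words with 3+ syllables
0.4 * (W / St + 100 * W3Sy / W)
# [1] 12.8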

"FORCAST":

$$FORCAST = 20 - \frac{W^{1Sy} \times \frac{150}{W}}{10}$$ If parameters is set to FORCAST="RGL", the parameters for the precise reading grade level are used (see Klare, 1975, pp. 84--85): $$FORCAST_{RGL} = 20.43 - 0.11 \times W^{1Sy} \times \frac{150}{W}$$

Wrapper function: FORCAST

"Fucks":

Fucks' Stilcharakteristik (Fucks, 1955, as cited in Briest, 1974): $$Fucks = \frac{Sy}{W} \times \frac{W}{St}$$ This simple formula has no parameters.

Wrapper function: fucks

"Gutierrez":

Gutiérrez de Polini's Fórmula de comprensibilidad (Gutiérrez, 1972, as cited in Fernández, 2016) for Spanish: $$Gutierrez = 95.2 - \frac{9.7 \times C}{W} - \frac{0.35 \times W}{St}$$

Wrapper function: gutierrez

"Harris.Jacobson":

Revised Harris-Jacobson Readability Formulas (Harris & Jacobson, 1974): For primary-grade material: $$HJ_1 = 0.094 \times \frac{100 \times W_{-WL}}{W} + 0.168 \times \frac{W}{St} + 0.502$$ For material above third grade: $$HJ_2 = 0.140 \times \frac{100 \times W_{-WL}}{W} + 0.153 \times \frac{W}{St} + 0.560$$ For material below fourth grade: $$HJ_3 = 0.158 \times \frac{W}{St} + 0.055 \times \frac{100 \times W_{6C}}{W} + 0.355$$ For material below fourth grade: $$HJ_4 = 0.070 \times \frac{100 \times W_{-WL}}{W} + 0.125 \times \frac{W}{St} + 0.037 \times \frac{100 \times W_{6C}}{W} + 0.497$$ For material above third grade: $$HJ_5 = 0.118 \times \frac{100 \times W_{-WL}}{W} + 0.134 \times \frac{W}{St} + 0.032 \times \frac{100 \times W_{6C}}{W} + 0.424$$ Note: This index needs the short Harris-Jacobson word list for grades 1 and 2 (English) to compute \(W_{-WL}\). That is, you must have a copy of this word list and provide it via the word.lists=list(Harris.Jacobson=<your.list>) parameter!

Wrapper function: harris.jacobson

"Linsear.Write" (O'Hayre, undated, see Klare, 1975, p. 85):

$$LW_{raw} = \frac{100 - \frac{100 \times W_{<3Sy}}{W} + \left( 3 \times \frac{100 \times W_{3Sy}}{W} \right)}{\frac{100 \times St}{W}} $$ $$LW(LW_{raw} \leq 20) = \frac{LW_{raw} - 2}{2}$$ $$LW(LW_{raw} > 20) = \frac{LW_{raw}}{2}$$

Wrapper function: linsear.write

"LIX"

Björnsson's Läsbarhetsindex. Originally proposed for Swedish texts, calculated by: $$LIX = \frac{W}{St} + \frac{100 \times W_{7C}}{W}$$ Texts with a LIX < 25 are considered very easy, around 40 normal, and > 55 very difficult to read.

Wrapper function: LIX
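A hand computation with made-up counts, for illustration:

W <- 200; St <- 10; W7C <- 50   # made-up counts; W7C: words with 7+ letters
W / St + 100 * W7C / W
# [1] 45, i.e. between "normal" (around 40) and "very difficult" (> 55)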

"nWS":

Neue Wiener Sachtextformeln (Bamberger & Vanecek, 1984): $$nWS_1 = 19.35 \times \frac{W_{3Sy}}{W} + 0.1672 \times \frac{W}{St} + 12.97 \times \frac{W_{6C}}{W} - 3.27 \times \frac{W^{1Sy}}{W} - 0.875$$ $$nWS_2 = 20.07 \times \frac{W_{3Sy}}{W} + 0.1682 \times \frac{W}{St} + 13.73 \times \frac{W_{6C}}{W} - 2.779$$ $$nWS_3 = 29.63 \times \frac{W_{3Sy}}{W} + 0.1905 \times \frac{W}{St} - 1.1144$$ $$nWS_4 = 27.44 \times \frac{W_{3Sy}}{W} + 0.2656 \times \frac{W}{St} - 1.693$$

Wrapper function: nWS

"RIX"

Anderson's Readability Index. A simplified version of LIX: $$RIX = \frac{W_{7C}}{St}$$ Texts with a RIX < 1.8 are considered very easy, around 3.7 normal, and > 7.2 very difficult to read.

Wrapper function: RIX

"SMOG":

Simple Measure of Gobbledygook. By default calculates formula D by McLaughlin (1969): $$SMOG = 1.043 \times \sqrt{W_{3Sy} \times \frac{30}{St}} + 3.1291$$ If parameters is set to SMOG="C", formula C will be calculated: $$SMOG_{C} = 0.9986 \times \sqrt{W_{3Sy} \times \frac{30}{St} + 5} + 2.8795$$ If parameters is set to SMOG="simple", the simplified formula is used: $$SMOG_{simple} = \sqrt{W_{3Sy} \times \frac{30}{St}} + 3$$ If parameters is set to SMOG="de", the formula adapted to German texts ("Qu", Bamberger & Vanecek, 1984, p. 78) is used: $$SMOG_{de} = \sqrt{W_{3Sy} \times \frac{30}{St}} - 2$$

Wrapper function: SMOG
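The default formula D can also be reproduced by hand; a sketch with made-up counts:

W3Sy <- 15; St <- 30   # made-up counts; W3Sy: words with 3+ syllables
1.043 * sqrt(W3Sy * 30 / St) + 3.1291   # approximately 7.17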

"Spache":

Spache Revised Formula (1974): $$Spache = 0.121 \times \frac{W}{St} + 0.082 \times \frac{100 \times W_{-WL}}{W} + 0.659$$ If parameters is set to Spache="old", the original parameters (Spache, 1953) are used: $$Spache_{old} = 0.141 \times \frac{W}{St} + 0.086 \times \frac{100 \times W_{-WL}}{W} + 0.839$$ Note: The revised index needs the revised Spache word list (see Klare, 1975, p. 73), and the old index the short Dale-Chall list of 769 familiar (English) words to compute \(W_{-WL}\). That is, you must have a copy of the respective word list and provide it via the word.lists=list(Spache=<your.list>) parameter!

Wrapper function: spache

"Strain":

Strain Index. This index was proposed in [1]: $$S = Sy \times{} \frac{1}{St / 3} \times{} \frac{1}{10}$$

Wrapper function: strain

"Traenkle.Bailer":

Tränkle-Bailer Formeln. These two formulas were the result of a re-examination of the ones proposed by Dickes-Steiwer. They try to avoid the usage of the type-token ratio, which is dependent on text length (Tränkle & Bailer, 1984): $$TB1 = 224.6814 - \left(79.8304 \times \frac{C}{W} \right) - \left(12.24032 \times \frac{W}{St} \right) - \left(1.292857 \times \frac{100 \times W_{prep}}{W} \right)$$ $$TB2 = 234.1063 - \left(96.11069 \times \frac{C}{W} \right) - \left(2.05444 \times \frac{100 \times W_{prep}}{W} \right) - \left(1.02805 \times \frac{100 \times W_{conj}}{W} \right)$$ Where \(W_{prep}\) refers to the number of prepositions, and \(W_{conj}\) to the number of conjunctions.

Wrapper function: traenkle.bailer

"TRI":

Kuntzsch's Text-Redundanz-Index. Intended mainly for German newspaper comments. $$TRI = \left(0.449 \times W^{1Sy}\right) - \left(2.467 \times Ptn\right) - \left(0.937 \times Frg\right) - 14.417$$ Where \(Ptn\) is the number of punctuation marks and \(Frg\) the number of foreign words.

Wrapper function: TRI

"Tuldava":

Tuldava's Text Difficulty Formula. Supposed to be rather independent of specific languages (Grzybek, 2010). $$TD = \frac{Sy}{W} \times \ln\left( \frac{W}{St} \right)$$

Wrapper function: tuldava

"Wheeler.Smith":

Intended for English texts in primary grades 1--4 (Wheeler & Smith, 1954): $$WS = \frac{W}{St} \times \frac{10 \times W_{2Sy}}{W}$$ If parameters is set to Wheeler.Smith="de", the calculation stays the same, but grade placement is done according to Bamberger & Vanecek (1984), that is, for German texts.

Wrapper function: wheeler.smith

By default, if the text still has to be tagged, the language definition is queried by calling get.kRp.env(lang=TRUE) internally. If txt.file has already been tagged, the language definition of that tagged object is read and used instead. Set force.lang=get.kRp.env(lang=TRUE), or any other valid value, only if you want to forcibly overwrite this default behaviour. See kRp.POS.tags for all supported languages.
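A sketch of pre-setting the language environment so that untagged text can be processed (set.kRp.env and get.kRp.env are part of koRpus; this assumes the koRpus.lang.en package is installed, as in the Examples below):

set.kRp.env(lang="en")   # make "en" the default language definition
get.kRp.env(lang=TRUE)   # this is the value readability() queries internally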

References

Anderson, J. (1981). Analysing the readability of English and non-English texts in the classroom with Lix. In Annual Meeting of the Australian Reading Association, Darwin, Australia.

Anderson, J. (1983). Lix and Rix: Variations on a little-known readability index. Journal of Reading, 26(6), 490--496.

Bamberger, R. & Vanecek, E. (1984). Lesen--Verstehen--Lernen--Schreiben. Wien: Jugend und Volk.

Briest, W. (1974). Kann man Verständlichkeit messen? Zeitschrift für Phonetik, Sprachwissenschaft und Kommunikationsforschung, 27, 543--563.

Coleman, M. & Liau, T.L. (1975). A computer readability formula designed for machine scoring. Journal of Applied Psychology, 60(2), 283--284.

Dickes, P. & Steiwer, L. (1977). Ausarbeitung von Lesbarkeitsformeln für die deutsche Sprache. Zeitschrift für Entwicklungspsychologie und Pädagogische Psychologie, 9(1), 20--28.

DuBay, W.H. (2004). The Principles of Readability. Costa Mesa: Impact Information. WWW: http://www.impact-information.com/impactinfo/readability02.pdf; 22.03.2011.

Farr, J.N., Jenkins, J.J. & Paterson, D.G. (1951). Simplification of Flesch Reading Ease formula. Journal of Applied Psychology, 35(5), 333--337.

Fernández, A. M. (2016, November 30). Fórmula de comprensibilidad de Gutiérrez de Polini. https://legible.es/blog/comprensibilidad-gutierrez-de-polini/

Flesch, R. (1948). A new readability yardstick. Journal of Applied Psychology, 32(3), 221--233.

Grzybek, P. (2010). Text difficulty and the Arens-Altmann law. In Peter Grzybek, Emmerich Kelih, Ján Mačutek (Eds.), Text and Language. Structures -- Functions -- Interrelations. Quantitative Perspectives. Wien: Praesens, 57--70.

Harris, A.J. & Jacobson, M.D. (1974). Revised Harris-Jacobson readability formulas. In 18th Annual Meeting of the College Reading Association, Bethesda.

Klare, G.R. (1975). Assessing readability. Reading Research Quarterly, 10(1), 62--102.

McLaughlin, G.H. (1969). SMOG grading -- A new readability formula. Journal of Reading, 12(8), 639--646.

Powers, R.D., Sumner, W.A., & Kearl, B.E. (1958). A recalculation of four adult readability formulas. Journal of Educational Psychology, 49(2), 99--105.

Smith, E.A. & Senter, R.J. (1967). Automated readability index. AMRL-TR-66-22. Wright-Patterson AFB, Ohio: Aerospace Medical Division.

Spache, G. (1953). A new readability formula for primary-grade reading materials. The Elementary School Journal, 53, 410--413.

Tränkle, U. & Bailer, H. (1984). Kreuzvalidierung und Neuberechnung von Lesbarkeitsformeln für die deutsche Sprache. Zeitschrift für Entwicklungspsychologie und Pädagogische Psychologie, 16(3), 231--244.

Wheeler, L.R. & Smith, E.H. (1954). A practical readability formula for the classroom teacher in the primary grades. Elementary English, 31, 397--399.

[1] https://strainindex.wordpress.com/2007/09/25/hello-world/

Examples

# code is only run when the English language package can be loaded
if(require("koRpus.lang.en", quietly = TRUE)){
  sample_file <- file.path(
    path.package("koRpus"), "examples", "corpus", "Reality_Winner.txt"
  )
  # call readability() on a tokenized text
  tokenized.obj <- tokenize(
    txt=sample_file,
    lang="en"
  )
  # if you call readability() without arguments,
  # you will get its results directly
  rdb.results <- readability(tokenized.obj)

  # there are [ and [[ methods for these objects
  rdb.results[["ARI"]]

  # alternatively, you can also store those results as a
  # feature in the object itself
  tokenized.obj <- readability(
    tokenized.obj,
    as.feature=TRUE
  )
  # results are now part of the object
  hasFeature(tokenized.obj)
  corpusReadability(tokenized.obj)
} else {}
