check_spelling: Spell checking

Description

Spell checking

Usage

check_spelling(filename, pre_release = TRUE, ignore.lines = NULL,
  known.correct = NULL, known.correct.fixed = NULL,
  known.wrong = NULL, ignore_spelling_in = NULL,
  ignore_spelling_in_nth = NULL, bib_files, check_etcs = TRUE,
  dict_lang = "en_GB", rstudio = FALSE, .report_error)

Arguments

filename

Path to a LaTeX file to check.

pre_release

Should the document be assumed to be final? Setting to FALSE permits the use of ignore_spelling_in and permits add_to_dictionary to be present outside the document preamble.

ignore.lines

Integer vector of lines to ignore (due to possibly spurious errors).

known.correct

Character vector of patterns known to be correct (which will never be raised by this function).

known.correct.fixed

Character vector of words known to be correct (which will never be raised by this function).

known.wrong

Character vector of patterns known to be wrong.

ignore_spelling_in

Command whose first mandatory argument will be ignored.

ignore_spelling_in_nth

Named list of arguments to ignore; names are the commands to be ignored, values are the nth argument to be ignored.

bib_files

Bibliography files (containing possible clues to misspellings). If supplied, and this function would otherwise throw an error, the .bib files are read and any author names that match the misspelled words are added to the dictionary.

check_etcs

If TRUE, stop if any variations of etc, ie, and eg are present. (If they are typed literally, they may be formatted inconsistently. Using a macro ensures they appear consistently.)

dict_lang

Passed to hunspell::dictionary.

rstudio

Use the RStudio API?

.report_error

A function to provide context to any errors. If missing, defaults to report2console.
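
As a rough illustration of how several of these arguments combine, the sketch below writes a small LaTeX file to a temporary location and checks it. The file contents, the surname used as a known-correct word, and the macros \code and \twoarg are all made up for the example (their definitions are omitted). Whether the check passes also depends on the installed hunspell dictionary.

```r
# Hypothetical document: \code and \twoarg are illustrative macros only.
tex_file <- tempfile(fileext = ".tex")
writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "The report was written by Parsonage.",
  "Call \\code{readr} first, then \\twoarg{this}{sprintf}.",
  "\\end{document}"
), tex_file)

if (requireNamespace("TeXCheckR", quietly = TRUE)) {
  TeXCheckR::check_spelling(
    tex_file,
    pre_release = FALSE,                        # needed to use ignore_spelling_in
    known.correct.fixed = "Parsonage",          # literal word accepted as correct
    ignore_spelling_in = "code",                # skip first argument of \code
    ignore_spelling_in_nth = list(twoarg = 2L)  # skip second argument of \twoarg
  )
}
```

The named list passed to ignore_spelling_in_nth maps each command name to the position of the argument to skip, so other arguments of the same command are still checked.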

Value

Called primarily for its side effect. If the spell check fails, the function reports the line at which the first error was detected, together with an error message. If the check succeeds, it returns NULL invisibly.
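
A minimal sketch of handling both outcomes in a script follows. It assumes that a failed check surfaces as an ordinary R error condition, which the description above suggests but does not state outright. The file contents are invented for the example.

```r
# Hypothetical document containing one deliberate misspelling.
tex_file <- tempfile(fileext = ".tex")
writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "This sentance has a typo.",
  "\\end{document}"
), tex_file)

if (requireNamespace("TeXCheckR", quietly = TRUE)) {
  outcome <- tryCatch({
    TeXCheckR::check_spelling(tex_file)
    "passed"                                   # reached only if no error is raised
  }, error = function(e) {
    # Assumption: a failed check signals an error condition.
    paste("failed:", conditionMessage(e))
  })
  print(outcome)
}
```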

Details

Extends and enhances hunspell:

You can add directives in the document itself. To add a word foobaz to the dictionary (so its presence does not throw an error), write % add_to_dictionary: foobaz on a single line. The advantage of this method is that you can collaborate on the document without having to keep track of which spelling errors are genuine.
The directive % ignore_spelling_in: mycmd will ignore the spelling of words within the first argument of \mycmd.
The directive ignore_spelling_in_file: <file.tex> will skip the check of <file.tex>, as well as any files within it, if it is pulled into filename via \input or \include. The file name should appear as it does within \input, but with the file extension.
Only the root document need be supplied; any files that are fed via \input or \include are checked (recursively).
A historical advantage was that the contents of certain commands were not checked, because they are not printed and so their spelling does not matter: namely citation and cross-reference commands, and certain optional arguments. Most of these are now parsed correctly by hunspell, though some still need to be supplied (including, naturally, user-supplied macros).
Abbreviations and initialisms which are validly introduced will not throw errors. See extract_valid_abbrevations.
Words preceded by '[sic]' will not throw errors.
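
The in-document directives described above can be sketched as follows. The word qwertyish and the macro \mycmd are invented for the example, and the expectation that both are skipped rests on the descriptions above rather than on a tested run.

```r
# Hypothetical document using in-file directives; \mycmd is not a real macro.
tex_file <- tempfile(fileext = ".tex")
writeLines(c(
  "\\documentclass{article}",
  "% add_to_dictionary: foobaz",
  "% ignore_spelling_in: mycmd",
  "\\begin{document}",
  "A foobaz is discussed here, as is \\mycmd{qwertyish}.",
  "\\end{document}"
), tex_file)

if (requireNamespace("TeXCheckR", quietly = TRUE)) {
  # pre_release = FALSE because, per the argument description,
  # ignore_spelling_in is only permitted in that mode.
  TeXCheckR::check_spelling(tex_file, pre_release = FALSE)
}
```

Keeping the directives in the file itself means collaborators running the check get the same result without passing extra arguments.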

The package comes with a suite of correctly_spelled_words that were not present in hunspell's dictionary.

This function should be quite fast, but slower than hunspell::hunspell (which it invokes). I aim for less than 500 ms on a real-world report of around 100 pages. The function is slower when it needs to consult bib_files, though I recommend adding authors, titles, etc. to the dictionary explicitly, or using citeauthor and friends.

This function is forked from https://github.com/hughparsonage/grattanReporter, which parses reports of the Grattan Institute, Melbourne for errors. See https://github.com/HughParsonage/grattex/blob/master/doc/grattexDocumentation.pdf for the full spec. Some checks that grattanReporter performs have been omitted from this package.

Examples

# NOT RUN {
# URL of the root LaTeX file of a report (fetched over the network)
url_tex <- paste0("https://raw.githubusercontent.com/HughParsonage/",
                  "grattex/e6cab97145d38890e44e83d122e995e3b8936fc6/",
                  "Report.tex")
check_spelling(url_tex)
# }