This function extracts text from a PDF document and returns it as a single
string, as a table of lines, and as a table of words. It uses 'pdftools' to
extract the content of text-based PDF files and 'tesseract' (OCR) to extract
the content of image-based PDF files.
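
The text-layer-first, OCR-as-fallback approach described above can be sketched roughly as follows. This is an illustration only, not the package's actual implementation: the helper name `extract_pdf_text_sketch`, its `language` argument, the returned element names, and the rule "no non-whitespace text means image-based" are all assumptions. It relies on `pdftools::pdf_text()` and `pdftools::pdf_ocr_text()`, the latter of which needs the 'tesseract' package installed.

```r
# Hypothetical helper: a sketch of the idea, not the code behind extractText().
extract_pdf_text_sketch <- function(file, language = "eng") {
  stopifnot(file.exists(file))

  # Try the embedded text layer first; pdf_text() returns one string per page.
  pages <- pdftools::pdf_text(file)
  type <- "text"

  # Assumption: a PDF without any non-whitespace text is image-based.
  if (!any(nzchar(trimws(pages)))) {
    # pdf_ocr_text() renders each page to an image and runs tesseract on it.
    pages <- pdftools::pdf_ocr_text(file, language = language)
    type <- "image"
  }

  # Element names here are illustrative, not the documented return value.
  list(text = paste(pages, collapse = "\n"), type = type)
}
```

Checking the text layer first avoids the cost of rendering and OCR for PDFs that already contain selectable text.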
Usage
extractText(file)
Value
A list containing the extracted text, a data table of the lines, a data table of the words, and the type and language of the document.
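
To make the shape of such a result more concrete, the sketch below builds a line table and a word table from a plain string using base R only. The function name `split_text_sketch`, the column names, and the use of `data.frame` rather than a 'data.table' object are assumptions for illustration; the actual element and column names returned by `extractText()` may differ.

```r
# Hypothetical helper: splits a string into a table of lines and a table of words.
split_text_sketch <- function(text) {
  # One entry per line of the input.
  line_text <- strsplit(text, "\n", fixed = TRUE)[[1]]
  lines <- data.frame(
    line = seq_along(line_text),
    text = line_text,
    stringsAsFactors = FALSE
  )

  # Split each line on runs of whitespace; remember which line each word came from.
  word_list <- strsplit(trimws(line_text), "[[:space:]]+")
  words <- data.frame(
    line = rep(seq_along(word_list), lengths(word_list)),
    word = unlist(word_list),
    stringsAsFactors = FALSE
  )

  list(text = text, lines = lines, words = words)
}

res <- split_text_sketch("First line of text\nSecond line")
str(res, max.level = 1)
```

Keeping the line number in the word table makes it possible to join words back to the line they came from.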