lnt_read

Name or names of LexisNexis TXT file to be converted.

Encoding to be assumed for input files. Defaults to UTF-8
(the LexisNexis standard value).

encoding

A logical flag indicating whether the returned object
will include a third data frame with paragraphs.

extract_paragraphs

A logical flag indicating whether the function should try to
convert the date of each article into Date format. For non-standard dates
provided by LexisNexis, it might be safer to convert dates afterwards (see
<a rd-options="" href="/link/lnt_asDate?package=LexisNexisTools&version=0.2.2" data-mini-rdoc="LexisNexisTools::lnt_asDate">lnt_asDate</a>).

convert_date

Used to indicate the beginning of an article. All
articles should have the same number of beginnings, ends and lengths (the
lengths indicate the last line of the metadata). Use a regular expression such as "\d+ of
\d+ DOCUMENTS$" (which would catch, e.g., the format "2 of 100 DOCUMENTS")
or "auto" to try all common keywords. The keyword search is case-sensitive.

start_keyword
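To see what such a start pattern matches, here is a minimal base R sketch. The sample lines are invented for illustration, and the code only demonstrates the regular expression itself; it is not how lnt_read is implemented internally.

```r
# Hypothetical sample lines, invented for illustration only.
sample_lines <- c(
  "2 of 100 DOCUMENTS",
  "The Guardian",
  "2 of 100 documents",
  "LENGTH: 512 words"
)

# The pattern from the description; backslashes are doubled inside an R string.
start_pattern <- "\\d+ of \\d+ DOCUMENTS$"

# grepl() is case-sensitive by default, so the lower-case variant is not matched.
is_start <- grepl(start_pattern, sample_lines)

# Indices of the lines that would mark the beginning of an article.
start_index <- which(is_start)
print(start_index)
```

Only the first line is flagged, which illustrates why a file that uses a different spelling or capitalisation needs a custom keyword.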

Used to indicate the end of an article. Works the same
way as start_keyword. A common regex would be "^LANGUAGE: ", which matches
"LANGUAGE: " in all caps at the beginning of a line (usually the last line of
an article).

end_keyword

Used to indicate the end of the metadata. Works the
same way as start_keyword and end_keyword. A common regex would be
"^LENGTH: ", which matches "LENGTH: " in all caps at the beginning of a line
(usually the last line of the metadata).

length_keyword
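The effect of the "^" anchor in the end and length patterns can be checked with a short base R sketch. The sample lines are hypothetical, and the snippet illustrates the regular expressions only, not the internals of lnt_read.

```r
# Hypothetical sample lines, invented for illustration only.
sample_lines <- c(
  "LENGTH: 512 words",
  "Article text mentioning LENGTH: in the middle",
  "LANGUAGE: ENGLISH",
  "language: english"
)

# "^" anchors the match to the beginning of the line.
length_hits <- grepl("^LENGTH: ", sample_lines)
end_hits <- grepl("^LANGUAGE: ", sample_lines)

print(which(length_hits))
print(which(end_hits))
```

Because of the anchor and the case-sensitive search, a keyword that appears mid-sentence or in lower case is ignored.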

Lines in which these keywords are found are excluded.
Set to <code>character()</code> if you want to turn off this feature.

exclude_lines
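The idea of excluding lines by keyword can be sketched in base R as follows. The helper `drop_lines()`, the sample lines and the patterns are all hypothetical and chosen for illustration; the package's own implementation and default patterns may differ.

```r
# Hypothetical helper: remove every line that matches one of the patterns.
drop_lines <- function(lines, exclude) {
  # An empty character vector switches the filter off.
  if (length(exclude) == 0) {
    return(lines)
  }
  # Combine all patterns into a single alternation.
  pattern <- paste(exclude, collapse = "|")
  lines[!grepl(pattern, lines)]
}

# Hypothetical sample lines, invented for illustration only.
sample_lines <- c("LOAD-DATE: May 1, 2018", "Body text", "GRAPHIC: Photo")

filtered <- drop_lines(sample_lines, c("^LOAD-DATE: ", "^GRAPHIC: "))
unfiltered <- drop_lines(sample_lines, character())
print(filtered)
```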

A logical flag indicating whether subdirectories are
searched for more TXT files.

recursive
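What a recursive search changes can be illustrated with base R's `list.files()`. This is a sketch under the assumption that file discovery behaves like a pattern-based directory listing; the directory and file names are made up, and lnt_read may locate files differently.

```r
# Build a small, hypothetical directory tree in the session's temp folder.
root <- file.path(tempdir(), "lnt_demo")
dir.create(file.path(root, "sub"), recursive = TRUE, showWarnings = FALSE)
invisible(file.create(file.path(root, "a.TXT"),
                      file.path(root, "sub", "b.TXT")))

# Without recursion only the top-level TXT file is found.
top_only <- list.files(root, pattern = "\\.txt$",
                       ignore.case = TRUE, recursive = FALSE)

# With recursion the file in the subdirectory is found as well.
with_subdirs <- list.files(root, pattern = "\\.txt$",
                           ignore.case = TRUE, recursive = TRUE)

print(top_only)
print(with_subdirs)
```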

A logical flag indicating whether information should be
printed to the screen.

verbose

Additional arguments passed on to <a rd-options="" href="/link/lnt_asDate?package=LexisNexisTools&version=0.2.2" data-mini-rdoc="LexisNexisTools::lnt_asDate">lnt_asDate</a>.

Read a LexisNexis TXT file and convert it to an object of class
<a rd-options="" href="/link/LNToutput?package=LexisNexisTools&version=0.2.2" data-mini-rdoc="LexisNexisTools::LNToutput">LNToutput</a>.

My PhD supervisor once told me that everyone doing newspaper
analysis starts by writing code to read in files from the 'LexisNexis' newspaper
archive (retrieved e.g., from <http://www.nexis.com/> or any of the partner
sites). However, while this is a nice exercise I do recommend, not everyone has
the time. This package takes TXT files downloaded from the newspaper archive of
'LexisNexis', reads them into R and offers functions for further processing.

Johannes Gruber

LexisNexisTools

Working with Files from 'LexisNexis'

lnt_read function

lnt_read: Read in a LexisNexis TXT file
