This helper function automatically retrieves the full data set of speeches made available by the ECB. In addition, it implements a number of pre-processing steps that may be turned on or off as needed.
get_ECB_speeches(
  filter_english = TRUE,
  clean_footnotes = TRUE,
  compute_sentiment = TRUE,
  tokenize_w_POS = FALSE
)
Depending on the arguments, returns either a data.frame or a quanteda::tokens object containing speeches of the ECB.
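A call with explicit arguments might look like the following minimal sketch. It assumes that get_ECB_speeches() is already available in the session (this page does not name the package that provides it); the variable names speech_args and speeches are illustrative only. Because the function downloads data, the call is guarded so that the snippet still runs when the function is not loaded.

```r
# Assumption: get_ECB_speeches() has been made available beforehand,
# e.g. by attaching the package that provides it (not named on this page).
# `speech_args` and `speeches` are illustrative names.
speech_args <- list(
  filter_english    = TRUE,   # keep English speeches only
  clean_footnotes   = TRUE,   # strip footnotes from the texts
  compute_sentiment = FALSE,  # skip the sentiment step
  tokenize_w_POS    = FALSE   # skip tokenization and POS tagging
)

# Only attempt the download when the function actually exists.
if (exists("get_ECB_speeches", mode = "function")) {
  speeches <- do.call(get_ECB_speeches, speech_args)
  # Check which of the two documented return types came back.
  print(class(speeches))
}
```

Inspecting class(speeches) is a simple way to confirm whether a data.frame or a quanteda::tokens object was returned for a given combination of arguments.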
filter_english: if TRUE, attempts to select English speeches only using textcat::textcat().
clean_footnotes: if TRUE, attempts to remove footnotes from the speech texts using regex patterns.
compute_sentiment: if TRUE, computes the sentiment of each speech using sentometrics::compute_sentiment() with the Loughran & McDonald lexicon.
tokenize_w_POS: if TRUE, tokenizes the speeches and applies Part-Of-Speech tagging with spacyr::spacy_parse(). Nouns, adjectives and proper nouns are then extracted from the parsed speeches to form a tokens object.
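The extraction step described for the POS-tagging option can be illustrated with a small base R sketch. This is not the function's actual implementation: the data frame parsed is a hand-built stand-in for the output of spacyr::spacy_parse() (which has further columns), and the names parsed, keep_pos, kept and tokens_by_doc are hypothetical. The tags NOUN, ADJ and PROPN are the universal POS labels for nouns, adjectives and proper nouns.

```r
# Hand-built stand-in for a parsed result; a real spacyr::spacy_parse()
# output contains additional columns (sentence_id, token_id, lemma, ...).
parsed <- data.frame(
  doc_id = c("speech1", "speech1", "speech1", "speech1",
             "speech2", "speech2", "speech2"),
  token  = c("The", "ECB", "raised", "rates",
             "Inflation", "is", "low"),
  pos    = c("DET", "PROPN", "VERB", "NOUN",
             "NOUN", "AUX", "ADJ"),
  stringsAsFactors = FALSE
)

# Keep nouns, adjectives and proper nouns only.
keep_pos <- c("NOUN", "ADJ", "PROPN")
kept <- parsed[parsed$pos %in% keep_pos, ]

# One character vector of retained tokens per speech.
tokens_by_doc <- split(kept$token, kept$doc_id)
print(tokens_by_doc)
```

A named list of character vectors like tokens_by_doc could then be coerced to a tokens object, for instance with quanteda::as.tokens(); whether get_ECB_speeches() proceeds this way internally is an assumption.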