get_everything: Get resources of newsapi.org

Description

get_everything returns articles from large and small news sources and blogs, covering news as well as other regular articles. You can search across multiple sources, restrict results to a language, or use your own keywords. Articles can be sorted by publication date (publishedAt), relevancy, or popularity. To download all results automatically, use get_everything_all(). Make sure an API key is available: either pass it explicitly via api_key or store it with set_api_key(). Valid values for language are listed in the dataset terms_language.

Usage

get_everything(query, sources = NULL, domains = NULL,
  exclude_domains = NULL, from = NULL, to = NULL, language = NULL,
  sort_by = "publishedAt", page = 1, page_size = 100,
  api_key = Sys.getenv("NEWS_API_KEY"))

Arguments

query

Character string that contains the search term for the API's database. The API supports advanced search parameters, see 'Details'. Passing a search term is compulsory.

sources

Character vector with IDs of the news outlets you want to focus on (e.g., c("usa-today", "spiegel-online")).

domains

Character vector with domains that you want to restrict your search to (e.g. c("bbc.com", "nytimes.com")).

exclude_domains

Character vector with domains that you want to exclude from your search. Same format as 'domains'.

from

Character string with start date of your search. Needs to conform to one of the following lubridate order strings: "ymdHMs, ymdHMsz, ymd". See help for lubridate::parse_date_time. If from is not specified, NewsAPI defaults to the oldest available date (depends on your paid/unpaid plan from newsapi.org).

to

Character string that marks the end date of your search. Needs to conform to one of the following lubridate order strings: "ymdHMs, ymdHMsz, ymd". See help for lubridate::parse_date_time. If to is not specified, NewsAPI defaults to the most recent article available.

language

Specifies the language of the articles of your search. Must be in ISO shortcut format (e.g., "de", "en"). See list of all languages using newsanchor::terms_language. Default is all languages.

sort_by

Character string that specifies the sorting variable of your article results. Accepts three options: "publishedAt", "relevancy", "popularity". Default is "publishedAt".

page

Specifies the page number of your results that is returned. Must be numeric. Default is first page. If you want to get all results at once, use get_everything_all from 'newsanchor'.

page_size

The number of articles per page that are returned. Maximum is 100 (also default).

api_key

Character string with the API key you get from newsapi.org. Passing it is compulsory. Alternatively, the key can be provided through the environment (see set_api_key()); by default it is read with Sys.getenv("NEWS_API_KEY").
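
The following sketch shows one way the arguments above might be combined. It assumes the key is stored in the NEWS_API_KEY environment variable, which is the default shown under Usage. key_available() is a hypothetical helper and not part of newsanchor; the query term and the one-week date window are arbitrary illustrations.

```r
# key_available() is a hypothetical helper, not part of newsanchor.
# It returns TRUE when a non-empty key is stored in NEWS_API_KEY.
key_available <- function() {
  nzchar(Sys.getenv("NEWS_API_KEY"))
}

if (key_available() && requireNamespace("newsanchor", quietly = TRUE)) {
  # api_key is not passed here, so the default Sys.getenv("NEWS_API_KEY") is used.
  res <- newsanchor::get_everything(
    query     = "mannheim",
    from      = format(Sys.Date() - 7),  # "ymd" form, one week ago
    to        = format(Sys.Date()),      # "ymd" form, today
    language  = "de",
    sort_by   = "relevancy",
    page      = 1,
    page_size = 50                       # must not exceed 100
  )
} else {
  message("No API key found: set NEWS_API_KEY or see set_api_key().")
}
```

The request is only sent when both the key and the package are present, so the snippet can be sourced safely on a machine that has neither.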

Value

List with two data frames: 1) a data frame with the results (results_df) and 2) a data frame with the metadata (meta_data).
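
As a rough illustration of working with a list of two data frames, the sketch below summarises each element. describe_result() is a hypothetical helper, not part of newsanchor, and the mock object only stands in for a real return value: its element names, column names, and contents are placeholders. Check names() on an actual result before relying on them.

```r
# describe_result() is a hypothetical helper, not part of newsanchor.
# It reports the number of rows and columns of each data frame in a list.
describe_result <- function(res) {
  stopifnot(is.list(res), all(vapply(res, is.data.frame, logical(1))))
  data.frame(
    element = names(res),
    rows    = unname(vapply(res, nrow, integer(1))),
    cols    = unname(vapply(res, ncol, integer(1))),
    row.names = NULL
  )
}

# Mock standing in for a real return value; all names and values are made up.
mock <- list(
  results_df = data.frame(title = c("a", "b"), url = c("u1", "u2")),
  meta_data  = data.frame(total_results = 2L)
)

describe_result(mock)
```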

Details

Advanced search (see also www.newsapi.org): Surround entire phrases with quotes (") for exact matches. Prepend words or phrases that must appear with the "+" symbol (e.g., +bitcoin). Prepend words that must not appear with the "-" symbol (e.g., -bitcoin). You can also use the AND, OR, and NOT keywords, optionally grouped with parentheses (e.g., 'crypto AND (ethereum OR litecoin) NOT bitcoin').
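
The search syntax above can be assembled programmatically. The sketch below uses build_query(), a hypothetical helper that is not part of newsanchor; joining the individual parts with a space is an assumption about how the API accepts combined terms.

```r
# build_query() is a hypothetical helper, not part of newsanchor.
# It assembles a search string from an exact phrase, required terms,
# and excluded terms, following the syntax described above.
build_query <- function(phrase = NULL, must = character(), must_not = character()) {
  parts <- c(
    if (!is.null(phrase)) paste0('"', phrase, '"'),   # exact phrase match
    if (length(must) > 0) paste0("+", must),          # terms that must appear
    if (length(must_not) > 0) paste0("-", must_not)   # terms that must not appear
  )
  paste(parts, collapse = " ")
}

q <- build_query(phrase = "crypto currency", must = "bitcoin", must_not = "litecoin")
cat(q, "\n")
# prints: "crypto currency" +bitcoin -litecoin

# Boolean keywords can be written directly as a string:
boolean_query <- "crypto AND (ethereum OR litecoin) NOT bitcoin"
```

Either string could then be passed as the query argument of get_everything.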

Examples

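# NOT RUN {
# German-language articles that mention "stuttgart"
df <- get_everything(query = "stuttgart", language = "de")
# articles that mention "mannheim", starting from 2 January 2019, 12:00
df <- get_everything(query = "mannheim", from = "2019-01-02 12:00:00")
# }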