rtrek (version 0.2.0)

memory_beta: Memory Beta API

Description

Access Star Trek content from Memory Beta (http://memory-beta.wikia.com).

Usage

memory_beta(endpoint)

Arguments

endpoint

character. See Details.

Value

a data frame

Details

The content returned is always a data frame. The structure changes slightly depending on the nature of the endpoint, but results from different endpoints can be merged easily.

Portals

At the highest level, passing endpoint = "portals" returns a data frame listing the available Memory Beta portals supported by rtrek. A column of relative URLs is also included for reference, but can be ignored. Unlike Memory Alpha, Memory Beta does not technically offer "portals", but for consistency in rtrek, several high-level categories on Memory Beta are treated as portal options. See memory_alpha for comparison.

Memory Beta's site structure is very similar to that of Memory Alpha, so the code that interfaces with each is also very similar. Memory Beta's structure is also simpler and more consistent, leading to fewer (handled and unhandled) content edge cases.

Portal Categories

In all other cases, the endpoint string must begin with one of the valid portal IDs. Passing only the ID returns a data frame with IDs and relative URLs associated with the available categories in the specific portal. Unlike memory_alpha, there are no group or subgroup columns. Memory Beta relies more consistently on a simple hierarchy of categories and articles.

Selecting a specific category within a portal is done by appending the category ID to the portal ID in endpoint, separated by a forward slash. You can append nested subcategory IDs with further forward slashes, provided the subcategories exist.
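The slash-separated form described above can be sketched in a few lines of base R. The helper build_endpoint is hypothetical and not part of rtrek; it only illustrates how a portal ID and nested category IDs combine into one endpoint string. The IDs used are those from the Examples section.

```r
# Hypothetical helper (not part of rtrek): join a portal ID and any
# category/subcategory IDs into a single endpoint string.
build_endpoint <- function(portal, ...) {
  parts <- c(portal, ...)
  paste(parts, collapse = "/")
}

# Portal ID first, then each nested category ID in order.
endpoint <- build_endpoint(
  "characters",
  "Characters by races and cultures",
  "Klingonoids",
  "Klingons"
)
endpoint
```

The resulting string could then be passed to memory_beta(endpoint), assuming each category in the chain exists on Memory Beta.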

Articles

When the endpoint is neither a top-level portal nor one of a portal's categories (or subcategories, if available), it is an article. An article is a terminal node, meaning you cannot nest further. An article is any entry whose URL does not begin with Category:. In this case, the content returned is still a data frame for consistency, but it differs substantially from the results of non-terminal endpoints.
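As a rough illustration of the rule above, the check on the Category: prefix could be written as follows. The helper is_article_url is hypothetical and not part of rtrek, and the relative URL "Category:Klingons" is an assumed example value used only for demonstration.

```r
# Hypothetical helper (not part of rtrek): TRUE when a relative URL points to
# an article, i.e. it does not start with the "Category:" prefix.
is_article_url <- function(url) {
  !startsWith(url, "Category:")
}

# "Category:Klingons" is an assumed example value; the second URL is the
# article path mentioned elsewhere in this help page.
urls <- c("Category:Klingons", "Worf,_son_of_Mogh")
is_article_url(urls)
```

This is only a sketch of the distinction; memory_beta handles it internally, so users do not normally need such a check.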

Memory Beta is not a database containing convenient tables. Articles comprise the bulk of what Memory Beta has to offer. They are not completely unstructured text, but are only loosely structured. memory_beta makes some assumptions about that structure and returns a data frame containing article text and links. What to do with this information, e.g., performing text analyses, is up to the user.

Additional notes

The url column included in results for context uses relative paths to save space. All full URLs share the same prefix. To visit a URL directly, prepend http://memory-beta.wikia.com/wiki/ to the relative path.
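Rebuilding a full URL from a relative path is a single paste0 call. The names mb_base_url and mb_full_url below are hypothetical and not part of rtrek; the prefix is the one given above.

```r
# Hypothetical names (not part of rtrek). The prefix is taken from this page.
mb_base_url <- "http://memory-beta.wikia.com/wiki/"

# Prepend the common prefix to one or more relative URLs.
mb_full_url <- function(relative_url) {
  paste0(mb_base_url, relative_url)
}

mb_full_url("Worf,_son_of_Mogh")
```

Because paste0 is vectorized, the same helper could be applied to an entire url column of a returned data frame.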

Also note that once you know the relative URL for an article, e.g., "Worf,_son_of_Mogh", you do not need to traverse through one of the portals using an endpoint string to retrieve its content. You can instead use mb_article("Worf,_son_of_Mogh").

memory_beta provides an overview of how the content available at Memory Beta is organized and how it can be searched through a variety of hierarchical layouts. In some cases this structure, which can be obtained in table form, is useful as data or metadata in itself. In contrast, mb_article focuses exclusively on retrieving content from known articles.

See Also

mb_article, memory_alpha

Examples

# NOT RUN {
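library(rtrek)
memory_beta("portals") # show available portals
endpoint <- "characters/Characters by races and cultures/Klingonoids/Klingons"
# has_internet() is assumed here to come from the curl package
if (curl::has_internet()) {
  x <- memory_beta(endpoint) # entries in the Klingons category
  x <- x[grep("Worf", x$Klingons), ] # keep rows whose Klingons column matches "Worf"
  x
  memory_beta(paste0(endpoint, "/Worf, son of Mogh")) # return terminal article content
}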
# }
