MazamaCoreUtils (version 0.6.2)

html_getTables: Extract tables from an HTML page

Description

Parse an HTML page and return all <table> elements as a list of data frames.

Usage

html_getTables(url = NULL, header = NA)

html_getTable(url = NULL, header = NA, index = 1)

Value

html_getTables() returns a list of data frames, one for each HTML table.

html_getTable() returns a single data frame containing the requested HTML table.

Arguments

url

URL or local file path of an HTML page.

header

Logical specifying whether the first row should be used as column names. If NA, the first row is used only when it contains <th> elements.

index

Index identifying which table to return. Used only by html_getTable().
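
The effect of header can be illustrated with rvest directly. This is a minimal sketch, not the package's source: it assumes header is passed through to rvest::html_table(), and the inline HTML fragment is a hypothetical table made up for illustration.

```r
library(rvest)

# Hypothetical table with no <th> cells (illustration only).
html <- "<table><tr><td>zone</td><td>offset</td></tr><tr><td>UTC</td><td>0</td></tr></table>"
node <- html_element(read_html(html), "table")

# header = NA: the first row has no <th> cells, so it is kept as data.
auto <- html_table(node, header = NA)

# header = TRUE: the first row is promoted to column names.
forced <- html_table(node, header = TRUE)

nrow(auto)
nrow(forced)
names(forced)
```

With header = NA the table keeps both rows as data; with header = TRUE it keeps one data row and takes its column names from the first row.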

Details

The url argument may be either a remote URL or a local file path. Tables are parsed with rvest::html_table(). To extract a single table, use html_getTable().
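
The workflow described above can be sketched with rvest alone. This is an illustration under stated assumptions, not the implementation of html_getTables(): the temporary file and its two tables are hypothetical, and selecting a table with [[ ]] is only assumed to mirror what the index argument of html_getTable() does.

```r
library(rvest)

# Hypothetical page with two tables, written to a temporary local file.
path <- tempfile(fileext = ".html")
writeLines(c(
  "<html><body>",
  "<table><tr><th>zone</th><th>offset</th></tr>",
  "<tr><td>UTC</td><td>0</td></tr><tr><td>PST</td><td>-8</td></tr></table>",
  "<table><tr><th>id</th></tr><tr><td>1</td></tr></table>",
  "</body></html>"
), path)

# read_html() accepts a local path here; a URL string is passed the same way.
page <- read_html(path)

# Called on a whole document, html_table() returns one tibble per <table>.
tables <- html_table(page)
length(tables)

# Pick a single table by position (assumed analogue of index = 2).
second <- tables[[2]]
names(second)
```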

Examples

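if (FALSE) {
library(MazamaCoreUtils)

# Wikipedia page that contains several HTML tables
url <- "https://en.wikipedia.org/wiki/List_of_tz_database_time_zones"

# Parse every <table> on the page into a list of data frames
tables <- html_getTables(url)
firstTable <- tables[[1]]

# Inspect the first table
head(firstTable)
nrow(firstTable)
}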