googlesheets (version 0.3.0)

gs_read_listfeed: Read data via the "list feed"

Description

Gets data via the "list feed", which assumes populated cells form a neat rectangle. The list feed consumes data row by row. The first row is assumed to hold variable or column names; it can be empty. The second row is assumed to hold the first data row and, if it is empty, no data will be read and you will get an empty data frame.

Usage

gs_read_listfeed(ss, ws = 1, reverse = NULL, orderby = NULL, sq = NULL,
  ..., verbose = TRUE)

Arguments

ss

a registered Google spreadsheet, i.e. a googlesheet object

ws

positive integer or character string specifying index or title, respectively, of the worksheet

reverse

logical, optional. Indicates whether to request reverse row order in the actual API call.

orderby

character, optional. Specifies a column to sort on in the actual API call.

sq

character, optional. Provides a structured query for row filtering in the actual API call.

...

Optional arguments to control data download, parsing, and reshaping; for most purposes, the defaults should be fine. Anything that is not listed here will be silently ignored.

progress

Logical. Whether to display download progress if in an interactive session.

col_types

Seize control of type conversion for variables. Passed straight through to readr::read_csv or readr::type_convert. Follow those links or read the vignette("column-types") for details.

locale, trim_ws, na

Specify locale, the fate of leading or trailing whitespace, or a character vector of strings that should become missing values. Passed straight through to readr::read_csv or readr::type_convert.

comment, skip, n_max

Specify a string used to identify comments, request to skip lines before reading data, or specify the maximum number of data rows to read.

col_names

Either TRUE, FALSE or a character vector of column names. If TRUE, the first row of the data rectangle will be used for names. If FALSE, column names will be X1, X2, etc. If a character vector, it will be used as column names. If the sheet contains column names and you just don't like them, specify skip = 1 so they don't show up in your data.

check.names

Logical. Whether to run column names through make.names with unique = TRUE, just like read.table does. By default, googlesheets implements the readr data ingest philosophy, which leaves column names "as is", with one exception: data frames returned by googlesheets will have a name for each variable, even if we have to create one.

verbose

logical; do you want informative messages?

Value

a data.frame or, if dplyr is loaded, a tbl_df

Column names

For the list feed, and only for the list feed, the Sheets API wants to transform the variable or column names like so: 'The column names are the header values of the worksheet lowercased and with all non-alpha-numeric characters removed. For example, if the cell A1 contains the value "Time 2 Eat!" the column name would be "time2eat".' In googlesheets, we do not let this happen and, instead, use the column names "as is", for consistent output across all gs_read* functions. If you direct gs_read_listfeed to pass query parameters to the actual API call, you must refer to variables using the column names under this API-enforced transformation. For example, to order the data by the column with "Time 2 Eat!" in the header row, you must specify orderby = "time2eat" in the gs_read_listfeed call.

Sorting and filtering via the API

Why on earth would you want to sort and filter via the API instead of in R? Just because you can? It is conceivable there are situations, such as a large spreadsheet, in which it is faster to sort or filter via API. Be sure to refer to variables using the API-transformed column names explained above! It is a known bug that reverse=true alone will NOT, in fact, reverse the row order of the result. In our experience, the reverse query parameter will only have effect in combination with explicit specification of a column to sort on via orderby. The syntax for these queries is apparently undocumented, so keep it simple or bring your spirit of adventure!

Details

The other read functions are generally superior, so use them if you can. However, you may need to use this function if you are dealing with an "old" Google Sheet, which is beyond the reach of gs_read_csv. The list feed also has some ability to sort and filter rows via the API (more below). Consult the Google Sheets API documentation for more details about the list feed.

See Also

Other data consumption functions: gs_read_cellfeed, gs_read_csv, gs_read, gs_reshape_cellfeed, gs_simplify_cellfeed

Examples

Run this code
# NOT RUN {
gap_ss <- gs_gap() # register the Gapminder example sheet
oceania_lf <- gs_read_listfeed(gap_ss, ws = "Oceania")
head(oceania_lf, 3)

## do row ordering and filtering in the API call
oceania_fancy <-
  gs_read_listfeed(gap_ss,
                   ws = "Oceania",
                   reverse = TRUE, orderby = "gdppercap",
                   sq = "lifeexp > 79 or year < 1960")
oceania_fancy

## passing args through to readr::type_convert()
oceania_crazy <-
  gs_read_listfeed(gap_ss,
                   ws = "Oceania",
                   col_names = paste0("z", 1:6), skip = 1,
                   col_types = "ccncnn",
                   na = "1962")
oceania_crazy
# }
# NOT RUN {
# }

Run the code above in your browser using DataCamp Workspace