How to use gargle for auth in a client package

gargle provides common infrastructure for use with Google APIs. This vignette describes one possible design for using gargle to deal with auth, in a client package that provides a high-level wrapper for a specific API.

There are frequent references to (the development version of) googledrive, which is a functioning test bed. The released version of googledrive already presents a very similar interface to users, but using internal functions. The release of gargle means these functions now exist in gargle, where they can be shared across packages, and the redundant versions in googledrive can be removed. The auth approach described here is already used in the (GitHub-only) googlesheets4 package (the successor to googlesheets). Packages like bigrquery and gmailr are slated for retrofitting (these obviously require a thoughtful consideration of backwards compatibility).

Key choices

Getting a token requires several pieces of information and there are stark differences in how much users (need to) know or control about this process. Let's review them, with an eye towards identifying the responsibilities of the package author versus the user.

  • Overall config: OAuth app and API key. Who provides?
  • Token-level properties: Google identity (email) and scopes.
  • Request-level: Who manages tokens and injects them into requests?

User-facing auth

In googledrive, the main user-facing auth function is googledrive::drive_auth(). Here is its definition (at least approximately, remember this is static code):

# googledrive::
drive_auth <- function(email = NULL,
                       path = NULL,
                       scopes = "https://www.googleapis.com/auth/drive",
                       cache = gargle::gargle_oauth_cache(),
                       use_oob = gargle::gargle_oob_default()) {
  cred <- gargle::token_fetch(
    scopes = scopes,
    app = drive_oauth_app(),
    email = email,
    path = path,
    package = "googledrive",
    cache = cache,
    use_oob = use_oob
  )
  if (!inherits(cred, "Token2.0")) {
    # throw an informative error here
  }
  .auth$set_cred(cred)
  .auth$set_auth_active(TRUE)
  invisible()
}

drive_auth() is called automatically upon the first need of a token and that can lead to user interaction, but does not necessarily do so. token_fetch() is described in the vignette How gargle gets tokens. The internal .auth object maintains googledrive's auth state and is explained next.

Auth state

A client package can use an internal object of class gargle::AuthState to hold the auth state. Here's how it is initialized in googledrive:

.auth <- gargle::init_AuthState(
  package = "googledrive",
  app = gargle::tidyverse_app(),         # YOUR PKG SHOULD USE ITS OWN APP!
  api_key = gargle::tidyverse_api_key(), # YOUR PKG SHOULD USE ITS OWN KEY!
  auth_active = TRUE
)
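To make the role of this object concrete, here is a minimal base R sketch of an auth-state holder. It is a toy stand-in and not gargle's implementation: new_toy_auth_state() and .toy_auth are hypothetical names, and the only details taken from this vignette are the fields (package, app, api_key, auth_active, cred) and the two setters that drive_auth() calls.

```r
# Toy stand-in for an auth-state object; NOT gargle's implementation.
# new_toy_auth_state() is a hypothetical helper for illustration only.
new_toy_auth_state <- function(package, app = NULL, api_key = NULL,
                               auth_active = TRUE) {
  state <- new.env(parent = emptyenv())
  state$package <- package
  state$app <- app
  state$api_key <- api_key
  state$auth_active <- auth_active
  state$cred <- NULL  # no credential until auth actually happens

  # Setters mirror the calls made at the end of drive_auth()
  state$set_cred <- function(cred) {
    state$cred <- cred
    invisible(state)
  }
  state$set_auth_active <- function(value) {
    stopifnot(is.logical(value), length(value) == 1)
    state$auth_active <- value
    invisible(state)
  }
  state
}

.toy_auth <- new_toy_auth_state(package = "mypkg", api_key = "not-a-real-key")
.toy_auth$set_cred("pretend-token")
.toy_auth$set_auth_active(TRUE)
.toy_auth$cred
```

Because the state lives in an environment, every function in the package sees the same, mutable auth state without passing it around explicitly.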

OAuth app

Most users should present OAuth user credentials to Google APIs. However, most users can also be spared the fiddly details surrounding this. The OAuth app is one example. The app is a component that most users do not even know about and they are content to use the same app for all work through a client package: probably, the app built into the package.

There is a field in the .auth auth state to hold the OAuth app, which should default to the package's built-in app (which should not be the gargle or tidyverse app). Exported auth helpers, like drive_oauth_app() and drive_auth_config(), retrieve and modify the current app for the minority of users who need that level of control.

library(googledrive)

google_app <- httr::oauth_app(
  "acme-corp",
  key = "123456789.apps.googleusercontent.com",
  secret = "abcdefghijklmnopqrstuvwxyz"
)
drive_auth_config(app = google_app)

drive_oauth_app()
#> <oauth_app> acme-corp
#>   key:    123456789.apps.googleusercontent.com
#>   secret: <hidden>

API key

Some Google APIs can be used in an unauthenticated state, if and only if requests include an API key. For example, this is a great way to read a Google Sheet that is world-readable or readable by "anyone with a link" from a Shiny app, thereby designing away the need to manage credentials on the server.

If you are wrapping an API with this feature, a default API key can be stored in the api_key field of the .auth auth state. As with the app, packages should obtain their own API key and not borrow the gargle or tidyverse key. And again, exported auth helpers, like drive_api_key() and drive_auth_config(), make the current key inspectable and configurable, for the minority of users who need that level of control.

library(googledrive)

drive_auth_config(api_key = "123456789")
drive_api_key()
#> "123456789"

A good rule of thumb is that it's OK to build in an API key if all the key does is allow users to do things via API that they would also be able to do in the browser, even without being logged in. In this case, Google uses the key to impose quotas and rate limits. If the API key has other implications, for example if it is used for billing purposes, then clearly the client package cannot include a key and should, instead, support the user's provision of a key.

Email or Google identity

In contrast to the OAuth app and API key, every user must express which identity they wish to present to the API. This is a familiar concept and users expect to specify this. Since users may have more than one Google account, it's quite likely that they will want to switch between accounts, even within a single R session, or that they might want to explicitly declare the identity to be used in a specific script or app.

That explains why drive_auth() has the optional email argument that lets users proactively specify their identity. drive_auth() is usually called indirectly upon first need, but a user can also call it proactively in order to specify their target email:

# googledrive::
drive_auth(email = "janedoe_work@gmail.com")

If email is not given, gargle also checks for an option named "gargle_oauth_email". The email is used to look up tokens in the cache and, if no suitable token is found, it is used to pre-configure the OAuth chooser in the browser. Read more in the help for gargle::gargle_oauth_email().
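The precedence just described can be sketched in base R. resolve_email() is a hypothetical helper written for this illustration, not a gargle function; the only detail taken from gargle is the option name "gargle_oauth_email".

```r
# Hypothetical helper showing the precedence described above:
# an explicit `email` argument wins, then the "gargle_oauth_email" option.
resolve_email <- function(email = NULL) {
  if (!is.null(email)) {
    return(email)
  }
  getOption("gargle_oauth_email")  # NULL if the option is unset
}

old <- options(gargle_oauth_email = "janedoe_work@gmail.com")
resolve_email()                              # falls back to the option
resolve_email("janedoe_personal@gmail.com")  # explicit argument wins
options(old)                                 # restore the previous setting
```

Setting the option once, for example in a script header, lets the rest of the code stay free of hard-wired identities.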

Scopes

Most users have no concept of scopes. They just know they want to work with, e.g., Google Drive or Google Sheets. A client package can usually pick sensible default scopes that support what most users want to do.

Here's a reminder of the signature of googledrive::drive_auth():

# googledrive::
drive_auth <- function(email = NULL,
                       path = NULL,
                       scopes = "https://www.googleapis.com/auth/drive",
                       cache = gargle::gargle_oauth_cache(),
                       use_oob = gargle::gargle_oob_default()) { ... }

googledrive ships with a default scope, but a motivated user could call drive_auth() pre-emptively at the start of the session and request different scopes. For example, if they intend to only read data and want to guard against inadvertent file modification, they might opt for the drive.readonly scope.

# googledrive::
drive_auth(scopes = "https://www.googleapis.com/auth/drive.readonly")

OAuth cache and Out-of-band auth

These are two aspects of OAuth where most users are content to go along with sensible default behaviour. For those who want to exert control, that can be done in direct calls to drive_auth() or by configuring an option. Read the help for gargle::gargle_oauth_cache() and gargle::gargle_oob_default() for more about these options.
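As a rough illustration of the option-based route, the sketch below sets session-wide defaults with base R's options(). It assumes the options are named after the two gargle functions mentioned above ("gargle_oauth_cache" and "gargle_oob_default") and that FALSE and TRUE carry the meanings given in the comments; check the gargle help before relying on either assumption.

```r
# Session-wide defaults, e.g. at the top of a script or in .Rprofile.
# Option names and value meanings are assumptions; see the gargle help.
old <- options(
  gargle_oauth_cache = FALSE,  # assumed: do not cache tokens on disk
  gargle_oob_default = TRUE    # assumed: prefer the out-of-band flow
)

current_oob <- getOption("gargle_oob_default")
current_oob

# The per-call equivalent uses the arguments in drive_auth()'s signature:
# drive_auth(cache = FALSE, use_oob = TRUE)

options(old)  # restore the previous settings
```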

Overview of mechanics

Here's a concrete outline of how one could set up a client package to get its auth functionality from gargle.

  1. Add gargle to your package's Imports.
  2. Create a file R/YOURPKG_auth.R.
  3. Create an internal gargle::AuthState object to hold auth state. R/YOURPKG_auth.R is a good place to do this.
  4. Define standard functions for the auth interface between gargle and your package; do this in R/YOURPKG_auth.R. Example: tidyverse/googledrive/R/drive_auth.R.
  5. Use gargle's roxygen helpers to create the docs for your auth functions. This relieves you from writing docs and you inherit standard wording. See tidyverse/googledrive/R/drive_auth.R for a demonstration.
  6. Use the functions YOURPKG_api_key() and YOURPKG_token() (defined in the standard auth interface) to insert an API key or token in your package's requests.

Getting that first token

I focus on early use, by the naive user, with the OAuth flow. When the user first calls a high-level googledrive function such as drive_find(), a Drive request is ultimately generated with a call to googledrive::request_generate(). Here is its definition, at least approximately:

# googledrive::
request_generate <- function(endpoint = character(),
                             params = list(),
                             key = NULL,
                             token = drive_token()) {
  ept <- .endpoints[[endpoint]]
  if (is.null(ept)) {
    stop_glue("\nEndpoint not recognized:\n * {endpoint}")
  }

  ## modifications specific to googledrive package
  params$key <- key %||% params$key %||% drive_api_key()
  if (!is.null(ept$parameters$supportsTeamDrives)) {
    params$supportsTeamDrives <- TRUE
  }

  req <- gargle::request_develop(endpoint = ept, params = params)
  gargle::request_build(
    path = req$path,
    method = req$method,
    params = req$params,
    body = req$body,
    token = token
  )
}

googledrive::request_generate() is a thin wrapper around gargle::request_develop() and gargle::request_build() that only implements details specific to googledrive, before delegating to more general functions in gargle. The vignette Request Helper Functions documents these gargle functions.
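One of the googledrive-specific details is the order in which an API key is chosen. The sketch below isolates that one line in base R. pick_key() and default_api_key() are hypothetical stand-ins (the latter plays the role of drive_api_key()), and %||% is defined locally so the example is self-contained.

```r
# Null-default operator, defined locally so the sketch is self-contained
`%||%` <- function(x, y) if (is.null(x)) y else x

# Hypothetical stand-in for the package's built-in key
default_api_key <- function() "package-default-key"

# Same precedence as in request_generate(): explicit `key` argument,
# then a key already present in `params`, then the package default
pick_key <- function(params = list(), key = NULL) {
  key %||% params$key %||% default_api_key()
}

pick_key()
pick_key(params = list(key = "from-params"))
pick_key(params = list(key = "from-params"), key = "explicit")
```

The chain reads left to right, so the most specific source of a key always wins.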

googledrive::request_generate() gets a token with drive_token(), which is defined like so:

# googledrive::
drive_token <- function() {
  if (isFALSE(.auth$auth_active)) {
    return(NULL)
  }
  if (!have_token()) {
    drive_auth()
  }
  httr::config(token = .auth$cred)
}

where have_token() is an internal helper defined as:

# googledrive:::
have_token <- function() {
  inherits(.auth$cred, "Token2.0")
}

By default, auth is active, and, for a fresh start, we won't have a token stashed in .auth yet. So this will result in a call to drive_auth() to obtain a credential, which is then cached in .auth$cred for the remainder of the session. All subsequent calls to drive_token() will just spit back this token.
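The fetch-once-then-reuse behaviour can be simulated without any Google credentials. Everything in this sketch is a toy: .toy_auth, toy_auth(), toy_have_token() and toy_token() are hypothetical stand-ins for .auth, drive_auth(), have_token() and drive_token(), and the "credential" is just a classed list.

```r
# Toy auth state; a real package would use its gargle-backed .auth object
.toy_auth <- new.env(parent = emptyenv())
.toy_auth$auth_active <- TRUE
.toy_auth$cred <- NULL
.toy_auth$n_fetches <- 0L  # counts how often the expensive step runs

# Hypothetical stand-in for drive_auth(): pretend to fetch a credential
toy_auth <- function() {
  .toy_auth$n_fetches <- .toy_auth$n_fetches + 1L
  .toy_auth$cred <- structure(list(id = "pretend"), class = "toy_token")
  invisible()
}

toy_have_token <- function() inherits(.toy_auth$cred, "toy_token")

# Same shape as drive_token(): NULL when deauthorized, fetch on first need
toy_token <- function() {
  if (isFALSE(.toy_auth$auth_active)) {
    return(NULL)
  }
  if (!toy_have_token()) {
    toy_auth()
  }
  .toy_auth$cred
}

first <- toy_token()
second <- toy_token()
.toy_auth$n_fetches  # the fetch ran once; the second call reused the credential
```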

Above, we discussed scenarios where an advanced user might call drive_auth() proactively, with non-default arguments, possibly even loading a service account token or using alternative flows, like Application Default Credentials or a Google Compute Engine flow. Any token loaded in that way is stashed in .auth$cred and will be returned by subsequent calls to drive_token().

Auth interface

The exported functions like drive_auth(), drive_token(), etc. constitute the auth interface between googledrive and gargle and are centralized in tidyverse/googledrive/R/drive_auth.R. That is a good template for how to use gargle to manage auth in a client package. In addition, the docs for these gargle-backed functions are generated automatically from standard information maintained in the gargle package.

  • drive_token() retrieves the current credential, in a form that is ready for inclusion in HTTP requests. If auth_active is TRUE and cred is NULL, drive_auth() is called to obtain a credential. If auth_active is FALSE, NULL is returned; client packages should be designed to fall back to including an API key in affected HTTP requests.
  • drive_auth() ensures we are dealing with an authenticated user and have a credential on hand with which to place authorized requests. Sets auth_active to TRUE. Can be called directly, but drive_token() will call it as needed.
  • drive_deauth() sets auth_active to FALSE.
  • drive_oauth_app() returns .auth$app.
  • drive_api_key() returns .auth$api_key.
  • drive_auth_config() can be used to query and set auth config. This is how an advanced user would enter their own OAuth app and API key into auth config, in order to affect all subsequent requests.

De-activating auth

drive_deauth() can be used at any time to enter a de-authorized state, during which requests are sent out with an API key and no token. This is a great way to eliminate any friction re: auth if there's no need for it, i.e. if all requests are for resources that are world readable or available to anyone who knows how to ask for it, such as files shared via "Anyone with the link". The de-authorized state is especially useful in non-interactive settings or where user interaction is indirect, such as via Shiny.
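From the request's point of view, the de-authorized state looks roughly like this. toy_request() is a hypothetical helper and the state is a plain list rather than a real auth-state object; the sketch only illustrates that the key is still sent while the token is withheld.

```r
# Hypothetical helper: assemble the auth-related parts of a request.
# The key always travels with the request; the token only when auth is active.
toy_request <- function(state) {
  token <- if (isFALSE(state$auth_active)) NULL else state$cred
  list(query = list(key = state$api_key), token = token)
}

state <- list(
  auth_active = TRUE,
  api_key = "not-a-real-key",
  cred = "pretend-token"
)
authed <- toy_request(state)
authed$token                # the stored credential

state$auth_active <- FALSE  # the effect of a deauth function
deauthed <- toy_request(state)
deauthed$query$key          # key is still present
is.null(deauthed$token)     # but no token is attached
```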

BYOAK = Bring Your Own App and Key

Advanced users can use their own OAuth app and API key. drive_auth_config() lives in R/drive_auth.R and provides the ability to see or modify the current app and api_key. Recall that drive_oauth_app() and drive_api_key() also exist for targeted, read-only access.

Changing identities (and more)

One reason for a user to call drive_auth() directly and proactively is to switch from one Google identity to another or to make sure they are presenting themselves with a specific identity. drive_auth() accepts an email argument, which is honored when gargle determines if there is already a suitable token on hand. Here is a sketch of how a user could switch identities during a session, possibly non-interactive:

library(googledrive)

drive_auth(email = "janedoe_work@gmail.com")
# do stuff with Google Drive here, with Jane Doe's "work" account

drive_auth(email = "janedoe_personal@gmail.com")
# do other stuff with Google Drive here, with Jane Doe's "personal" account

drive_auth(path = "/path/to/a/service-account.json")
# do other stuff with Google Drive here, using a service account