paws.analytics (version 0.5.0)

glue: AWS Glue

Description

Glue

Defines the public endpoint for the Glue service.

Usage

glue(config = list(), credentials = list(), endpoint = NULL, region = NULL)

Value

A client for the service. You can call the service's operations using syntax like svc$operation(...), where svc is the name you've assigned to the client. The available operations are listed in the Operations section.

Arguments

config

Optional configuration of credentials, endpoint, and/or region.

  • credentials:

    • creds:

      • access_key_id: AWS access key ID

      • secret_access_key: AWS secret access key

      • session_token: AWS temporary session token

    • profile: The name of a profile to use. If not given, then the default profile is used.

    • anonymous: Set anonymous credentials.

  • endpoint: The complete URL to use for the constructed client.

  • region: The AWS Region used in instantiating the client.

  • close_connection: Immediately close all HTTP connections.

  • timeout: The time in seconds until a timeout exception is thrown when attempting to make a connection. The default is 60 seconds.

  • s3_force_path_style: Set this to TRUE to force the request to use path-style addressing, i.e. http://s3.amazonaws.com/BUCKET/KEY.

  • sts_regional_endpoint: Set the STS regional endpoint resolver to regional or legacy. See https://docs.aws.amazon.com/sdkref/latest/guide/feature-sts-regionalized-endpoints.html

credentials

Optional credentials shorthand for the config parameter.

  • creds:

    • access_key_id: AWS access key ID

    • secret_access_key: AWS secret access key

    • session_token: AWS temporary session token

  • profile: The name of a profile to use. If not given, then the default profile is used.

  • anonymous: Set anonymous credentials.

endpoint

Optional shorthand for complete URL to use for the constructed client.

region

Optional shorthand for AWS Region used in instantiating the client.

Service syntax

svc <- glue(
  config = list(
    credentials = list(
      creds = list(
        access_key_id = "string",
        secret_access_key = "string",
        session_token = "string"
      ),
      profile = "string",
      anonymous = "logical"
    ),
    endpoint = "string",
    region = "string",
    close_connection = "logical",
    timeout = "numeric",
    s3_force_path_style = "logical",
    sts_regional_endpoint = "string"
  ),
  credentials = list(
    creds = list(
      access_key_id = "string",
      secret_access_key = "string",
      session_token = "string"
    ),
    profile = "string",
    anonymous = "logical"
  ),
  endpoint = "string",
  region = "string"
)
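The syntax above shows argument types rather than values. As a minimal sketch of a concrete call, the block below builds a config list and passes it to glue(). The profile name "analytics", the region "us-east-1", and the 30-second timeout are placeholder assumptions, not defaults of the package. The client construction is wrapped in if (FALSE) because it needs the paws.analytics package and valid AWS credentials.

```r
# Placeholder values: substitute your own profile name and region.
cfg <- list(
  credentials = list(profile = "analytics"),
  region = "us-east-1",
  timeout = 30
)

if (FALSE) {
  # Full form: everything goes through `config`.
  svc <- paws.analytics::glue(config = cfg)

  # Shorthand form: credentials and region passed as top-level arguments.
  svc <- paws.analytics::glue(
    credentials = list(profile = "analytics"),
    region = "us-east-1"
  )
}
```

The shorthand arguments cover only credentials, endpoint, and region; options such as timeout still have to be supplied through config.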

Operations

batch_create_partition: Creates one or more partitions in a batch operation
batch_delete_connection: Deletes a list of connection definitions from the Data Catalog
batch_delete_partition: Deletes one or more partitions in a batch operation
batch_delete_table: Deletes multiple tables at once
batch_delete_table_version: Deletes a specified batch of versions of a table
batch_get_blueprints: Retrieves information about a list of blueprints
batch_get_crawlers: Returns a list of resource metadata for a given list of crawler names
batch_get_custom_entity_types: Retrieves the details for the custom patterns specified by a list of names
batch_get_data_quality_result: Retrieves a list of data quality results for the specified result IDs
batch_get_dev_endpoints: Returns a list of resource metadata for a given list of development endpoint names
batch_get_jobs: Returns a list of resource metadata for a given list of job names
batch_get_partition: Retrieves partitions in a batch request
batch_get_table_optimizer: Returns the configuration for the specified table optimizers
batch_get_triggers: Returns a list of resource metadata for a given list of trigger names
batch_get_workflows: Returns a list of resource metadata for a given list of workflow names
batch_stop_job_run: Stops one or more job runs for a specified job definition
batch_update_partition: Updates one or more partitions in a batch operation
cancel_data_quality_rule_recommendation_run: Cancels the specified recommendation run that was being used to generate rules
cancel_data_quality_ruleset_evaluation_run: Cancels a run where a ruleset is being evaluated against a data source
cancel_ml_task_run: Cancels (stops) a task run
cancel_statement: Cancels the statement
check_schema_version_validity: Validates the supplied schema
create_blueprint: Registers a blueprint with Glue
create_classifier: Creates a classifier in the user's account
create_connection: Creates a connection definition in the Data Catalog
create_crawler: Creates a new crawler with specified targets, role, configuration, and optional schedule
create_custom_entity_type: Creates a custom pattern that is used to detect sensitive data across the columns and rows of your structured data
create_database: Creates a new database in a Data Catalog
create_data_quality_ruleset: Creates a data quality ruleset with DQDL rules applied to a specified Glue table
create_dev_endpoint: Creates a new development endpoint
create_job: Creates a new job definition
create_ml_transform: Creates a Glue machine learning transform
create_partition: Creates a new partition
create_partition_index: Creates a specified partition index in an existing table
create_registry: Creates a new registry which may be used to hold a collection of schemas
create_schema: Creates a new schema set and registers the schema definition
create_script: Transforms a directed acyclic graph (DAG) into code
create_security_configuration: Creates a new security configuration
create_session: Creates a new session
create_table: Creates a new table definition in the Data Catalog
create_table_optimizer: Creates a new table optimizer for a specific function
create_trigger: Creates a new trigger
create_user_defined_function: Creates a new function definition in the Data Catalog
create_workflow: Creates a new workflow
delete_blueprint: Deletes an existing blueprint
delete_classifier: Removes a classifier from the Data Catalog
delete_column_statistics_for_partition: Deletes the partition column statistics of a column
delete_column_statistics_for_table: Deletes table statistics of columns
delete_connection: Deletes a connection from the Data Catalog
delete_crawler: Removes a specified crawler from the Glue Data Catalog, unless the crawler state is RUNNING
delete_custom_entity_type: Deletes a custom pattern by specifying its name
delete_database: Removes a specified database from a Data Catalog
delete_data_quality_ruleset: Deletes a data quality ruleset
delete_dev_endpoint: Deletes a specified development endpoint
delete_job: Deletes a specified job definition
delete_ml_transform: Deletes a Glue machine learning transform
delete_partition: Deletes a specified partition
delete_partition_index: Deletes a specified partition index from an existing table
delete_registry: Deletes the entire registry including schema and all of its versions
delete_resource_policy: Deletes a specified policy
delete_schema: Deletes the entire schema set, including the schema set and all of its versions
delete_schema_versions: Removes versions from the specified schema
delete_security_configuration: Deletes a specified security configuration
delete_session: Deletes the session
delete_table: Removes a table definition from the Data Catalog
delete_table_optimizer: Deletes an optimizer and all associated metadata for a table
delete_table_version: Deletes a specified version of a table
delete_trigger: Deletes a specified trigger
delete_user_defined_function: Deletes an existing function definition from the Data Catalog
delete_workflow: Deletes a workflow
get_blueprint: Retrieves the details of a blueprint
get_blueprint_run: Retrieves the details of a blueprint run
get_blueprint_runs: Retrieves the details of blueprint runs for a specified blueprint
get_catalog_import_status: Retrieves the status of a migration operation
get_classifier: Retrieves a classifier by name
get_classifiers: Lists all classifier objects in the Data Catalog
get_column_statistics_for_partition: Retrieves partition statistics of columns
get_column_statistics_for_table: Retrieves table statistics of columns
get_column_statistics_task_run: Gets the associated metadata/information for a task run, given a task run ID
get_column_statistics_task_runs: Retrieves information about all runs associated with the specified table
get_connection: Retrieves a connection definition from the Data Catalog
get_connections: Retrieves a list of connection definitions from the Data Catalog
get_crawler: Retrieves metadata for a specified crawler
get_crawler_metrics: Retrieves metrics about specified crawlers
get_crawlers: Retrieves metadata for all crawlers defined in the customer account
get_custom_entity_type: Retrieves the details of a custom pattern by specifying its name
get_database: Retrieves the definition of a specified database
get_databases: Retrieves all databases defined in a given Data Catalog
get_data_catalog_encryption_settings: Retrieves the security configuration for a specified catalog
get_dataflow_graph: Transforms a Python script into a directed acyclic graph (DAG)
get_data_quality_result: Retrieves the result of a data quality rule evaluation
get_data_quality_rule_recommendation_run: Gets the specified recommendation run that was used to generate rules
get_data_quality_ruleset: Returns an existing ruleset by identifier or name
get_data_quality_ruleset_evaluation_run: Retrieves a specific run where a ruleset is evaluated against a data source
get_dev_endpoint: Retrieves information about a specified development endpoint
get_dev_endpoints: Retrieves all the development endpoints in this Amazon Web Services account
get_job: Retrieves an existing job definition
get_job_bookmark: Returns information on a job bookmark entry
get_job_run: Retrieves the metadata for a given job run
get_job_runs: Retrieves metadata for all runs of a given job definition
get_jobs: Retrieves all current job definitions
get_mapping: Creates mappings
get_ml_task_run: Gets details for a specific task run on a machine learning transform
get_ml_task_runs: Gets a list of runs for a machine learning transform
get_ml_transform: Gets a Glue machine learning transform artifact and all its corresponding metadata
get_ml_transforms: Gets a sortable, filterable list of existing Glue machine learning transforms
get_partition: Retrieves information about a specified partition
get_partition_indexes: Retrieves the partition indexes associated with a table
get_partitions: Retrieves information about the partitions in a table
get_plan: Gets code to perform a specified mapping
get_registry: Describes the specified registry in detail
get_resource_policies: Retrieves the resource policies set on individual resources by Resource Access Manager during cross-account permission grants
get_resource_policy: Retrieves a specified resource policy
get_schema: Describes the specified schema in detail
get_schema_by_definition: Retrieves a schema by the SchemaDefinition
get_schema_version: Gets the specified schema by its unique ID assigned when a version of the schema is created or registered
get_schema_versions_diff: Fetches the schema version difference in the specified difference type between two stored schema versions in the Schema Registry
get_security_configuration: Retrieves a specified security configuration
get_security_configurations: Retrieves a list of all security configurations
get_session: Retrieves the session
get_statement: Retrieves the statement
get_table: Retrieves the Table definition in a Data Catalog for a specified table
get_table_optimizer: Returns the configuration of all optimizers associated with a specified table
get_tables: Retrieves the definitions of some or all of the tables in a given Database
get_table_version: Retrieves a specified version of a table
get_table_versions: Retrieves a list of strings that identify available versions of a specified table
get_tags: Retrieves a list of tags associated with a resource
get_trigger: Retrieves the definition of a trigger
get_triggers: Gets all the triggers associated with a job
get_unfiltered_partition_metadata: Retrieves partition metadata from the Data Catalog that contains unfiltered metadata
get_unfiltered_partitions_metadata: Retrieves partition metadata from the Data Catalog that contains unfiltered metadata
get_unfiltered_table_metadata: Retrieves table metadata from the Data Catalog that contains unfiltered metadata
get_user_defined_function: Retrieves a specified function definition from the Data Catalog
get_user_defined_functions: Retrieves multiple function definitions from the Data Catalog
get_workflow: Retrieves resource metadata for a workflow
get_workflow_run: Retrieves the metadata for a given workflow run
get_workflow_run_properties: Retrieves the workflow run properties which were set during the run
get_workflow_runs: Retrieves metadata for all runs of a given workflow
import_catalog_to_glue: Imports an existing Amazon Athena Data Catalog to Glue
list_blueprints: Lists all the blueprint names in an account
list_column_statistics_task_runs: Lists all task runs for a particular account
list_crawlers: Retrieves the names of all crawler resources in this Amazon Web Services account, or the resources with the specified tag
list_crawls: Returns all the crawls of a specified crawler
list_custom_entity_types: Lists all the custom patterns that have been created
list_data_quality_results: Returns all data quality execution results for your account
list_data_quality_rule_recommendation_runs: Lists the recommendation runs meeting the filter criteria
list_data_quality_ruleset_evaluation_runs: Lists all the runs meeting the filter criteria, where a ruleset is evaluated against a data source
list_data_quality_rulesets: Returns a paginated list of rulesets for the specified list of Glue tables
list_dev_endpoints: Retrieves the names of all DevEndpoint resources in this Amazon Web Services account, or the resources with the specified tag
list_jobs: Retrieves the names of all job resources in this Amazon Web Services account, or the resources with the specified tag
list_ml_transforms: Retrieves a sortable, filterable list of existing Glue machine learning transforms in this Amazon Web Services account, or the resources with the specified tag
list_registries: Returns a list of registries that you have created, with minimal registry information
list_schemas: Returns a list of schemas with minimal details
list_schema_versions: Returns a list of schema versions that you have created, with minimal information
list_sessions: Retrieves a list of sessions
list_statements: Lists statements for the session
list_table_optimizer_runs: Lists the history of previous optimizer runs for a specific table
list_triggers: Retrieves the names of all trigger resources in this Amazon Web Services account, or the resources with the specified tag
list_workflows: Lists names of workflows created in the account
put_data_catalog_encryption_settings: Sets the security configuration for a specified catalog
put_resource_policy: Sets the Data Catalog resource policy for access control
put_schema_version_metadata: Puts the metadata key value pair for a specified schema version ID
put_workflow_run_properties: Puts the specified workflow run properties for the given workflow run
query_schema_version_metadata: Queries for the schema version metadata information
register_schema_version: Adds a new version to the existing schema
remove_schema_version_metadata: Removes a key value pair from the schema version metadata for the specified schema version ID
reset_job_bookmark: Resets a bookmark entry
resume_workflow_run: Restarts selected nodes of a previous partially completed workflow run and resumes the workflow run
run_statement: Executes the statement
search_tables: Searches a set of tables based on properties in the table metadata as well as on the parent database
start_blueprint_run: Starts a new run of the specified blueprint
start_column_statistics_task_run: Starts a column statistics task run, for a specified table and columns
start_crawler: Starts a crawl using the specified crawler, regardless of what is scheduled
start_crawler_schedule: Changes the schedule state of the specified crawler to SCHEDULED, unless the crawler is already running or the schedule state is already SCHEDULED
start_data_quality_rule_recommendation_run: Starts a recommendation run that is used to generate rules when you don't know what rules to write
start_data_quality_ruleset_evaluation_run: Once you have a ruleset definition (either recommended or your own), you call this operation to evaluate the ruleset against a data source (Glue table)
start_export_labels_task_run: Begins an asynchronous task to export all labeled data for a particular transform
start_import_labels_task_run: Enables you to provide additional labels (examples of truth) to be used to teach the machine learning transform and improve its quality
start_job_run: Starts a job run using a job definition
start_ml_evaluation_task_run: Starts a task to estimate the quality of the transform
start_ml_labeling_set_generation_task_run: Starts the active learning workflow for your machine learning transform to improve the transform's quality by generating label sets and adding labels
start_trigger: Starts an existing trigger
start_workflow_run: Starts a new run of the specified workflow
stop_column_statistics_task_run: Stops a task run for the specified table
stop_crawler: If the specified crawler is running, stops the crawl
stop_crawler_schedule: Sets the schedule state of the specified crawler to NOT_SCHEDULED, but does not stop the crawler if it is already running
stop_session: Stops the session
stop_trigger: Stops a specified trigger
stop_workflow_run: Stops the execution of the specified workflow run
tag_resource: Adds tags to a resource
untag_resource: Removes tags from a resource
update_blueprint: Updates a registered blueprint
update_classifier: Modifies an existing classifier (a GrokClassifier, an XMLClassifier, a JsonClassifier, or a CsvClassifier, depending on which field is present)
update_column_statistics_for_partition: Creates or updates partition statistics of columns
update_column_statistics_for_table: Creates or updates table statistics of columns
update_connection: Updates a connection definition in the Data Catalog
update_crawler: Updates a crawler
update_crawler_schedule: Updates the schedule of a crawler using a cron expression
update_database: Updates an existing database definition in a Data Catalog
update_data_quality_ruleset: Updates the specified data quality ruleset
update_dev_endpoint: Updates a specified development endpoint
update_job: Updates an existing job definition
update_job_from_source_control: Synchronizes a job from the source control repository
update_ml_transform: Updates an existing machine learning transform
update_partition: Updates a partition
update_registry: Updates an existing registry which is used to hold a collection of schemas
update_schema: Updates the description, compatibility setting, or version checkpoint for a schema set
update_source_control_from_job: Synchronizes a job to the source control repository
update_table: Updates a metadata table in the Data Catalog
update_table_optimizer: Updates the configuration for an existing table optimizer
update_trigger: Updates a trigger definition
update_user_defined_function: Updates an existing function definition in the Data Catalog
update_workflow: Updates an existing workflow
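
Many of the get_* and list_* operations return results one page at a time. The sketch below shows one way to collect every page, assuming the operation takes a NextToken argument and returns a NextToken field next to its items (get_databases, for example, returns its items in DatabaseList). collect_pages and fake_fetch are hypothetical names introduced for illustration and are not part of paws.analytics; fake_fetch stands in for the service so the loop can be tried without credentials.

```r
# Hypothetical helper: call `fetch_page(token)` until no NextToken is returned,
# concatenating the list found under `items_field` on each page.
collect_pages <- function(fetch_page, items_field) {
  items <- list()
  token <- NULL
  repeat {
    page <- fetch_page(token)
    items <- c(items, page[[items_field]])
    token <- page$NextToken
    # Stop when the token is absent, zero-length, or an empty string.
    if (is.null(token) || length(token) == 0 || !nzchar(token)) break
  }
  items
}

# Stand-in for the service: two canned pages shaped like a paged response.
fake_pages <- list(
  list(DatabaseList = list(list(Name = "a"), list(Name = "b")), NextToken = "t1"),
  list(DatabaseList = list(list(Name = "c")), NextToken = character(0))
)
fake_fetch <- function(token) {
  if (is.null(token)) fake_pages[[1]] else fake_pages[[2]]
}

dbs <- collect_pages(fake_fetch, "DatabaseList")
db_names <- vapply(dbs, function(d) d$Name, character(1))

if (FALSE) {
  # Against the real service (requires credentials):
  svc <- paws.analytics::glue()
  dbs <- collect_pages(
    function(token) svc$get_databases(NextToken = token),
    "DatabaseList"
  )
}
```

Passing the fetch step as a function keeps the loop independent of any single operation, so the same helper can be reused for other paged calls by changing the closure and the items field.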

Examples

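if (FALSE) {
  # Not run: calling the service requires valid AWS credentials.
  svc <- glue()
  # `Foo = 123` is a placeholder from the generated documentation,
  # not a real parameter of batch_create_partition.
  svc$batch_create_partition(
    Foo = 123
  )
}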
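
As a more concrete sketch of the same call, the block below builds the partition definitions first and then passes them to batch_create_partition. The argument names (DatabaseName, TableName, PartitionInputList, and the Values and StorageDescriptor fields) follow the AWS Glue BatchCreatePartition API and should be confirmed against the operation's own help page. The database, table, bucket, and the partition_input helper are hypothetical, and a real StorageDescriptor will usually need more than a Location.

```r
# Hypothetical helper: build one partition definition from its key values
# and the S3 location that holds its data.
partition_input <- function(values, location) {
  list(
    Values = as.list(values),
    StorageDescriptor = list(Location = location)
  )
}

# Placeholder partitions for an assumed table partitioned by year and month.
partitions <- list(
  partition_input(c("2024", "01"), "s3://example-bucket/sales/year=2024/month=01/"),
  partition_input(c("2024", "02"), "s3://example-bucket/sales/year=2024/month=02/")
)

if (FALSE) {
  # Not run: requires credentials and an existing database and table.
  svc <- paws.analytics::glue()
  resp <- svc$batch_create_partition(
    DatabaseName = "example_db",
    TableName = "sales",
    PartitionInputList = partitions
  )
  # Partitions that could not be created are reported in resp$Errors.
  length(resp$Errors)
}
```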
