Job Task
job_task(
  task_key,
  description = NULL,
  depends_on = c(),
  existing_cluster_id = NULL,
  new_cluster = NULL,
  job_cluster_key = NULL,
  task,
  libraries = NULL,
  email_notifications = NULL,
  timeout_seconds = NULL,
  max_retries = 0,
  min_retry_interval_millis = 0,
  retry_on_timeout = FALSE
)
task_key: A unique name for the task. This field is used to refer to this task from other tasks. It is required and must be unique within its parent job. On db_jobs_update() or db_jobs_reset(), this field is used to reference the tasks to be updated or reset. The maximum length is 100 characters.
description: An optional description for this task. The maximum length is 4096 bytes.
depends_on: Vector of task_key values specifying the dependency graph of the task. All task_key values listed in this field must complete successfully before this task executes. This field is required when a job consists of more than one task. See the dependency-graph sketch under Examples below.
existing_cluster_id: ID of an existing cluster that is used for all runs of this task.
new_cluster: Instance of new_cluster().
job_cluster_key: The task is executed reusing the cluster specified in db_jobs_create() via the job_clusters parameter.
task: One of notebook_task(), spark_jar_task(), spark_python_task(), spark_submit_task(), pipeline_task(), python_wheel_task().
libraries: Instance of libraries().
email_notifications: Instance of email_notifications().
timeout_seconds: An optional timeout applied to each run of this task. The default behavior is to have no timeout.
max_retries: An optional maximum number of times to retry an unsuccessful run. A run is considered unsuccessful if it completes with the FAILED result_state or the INTERNAL_ERROR life_cycle_state. The value -1 means retry indefinitely and the value 0 means never retry. The default behavior is to never retry.
min_retry_interval_millis: Optional minimal interval in milliseconds between the start of the failed run and the subsequent retry run. The default behavior is that unsuccessful runs are immediately retried.
retry_on_timeout: Optional policy to specify whether to retry a task when it times out. The default behavior is to not retry on timeout.
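
Examples

A minimal sketch of a single task definition, assuming notebook_task() and new_cluster() accept the arguments shown; check each helper's own documentation for exact signatures, and treat the paths and cluster settings here as placeholders:

ingest <- job_task(
  task_key = "ingest",
  description = "Load raw data",
  # Hypothetical cluster spec; new_cluster()'s required fields may differ.
  new_cluster = new_cluster(
    num_workers = 2,
    spark_version = "13.3.x-scala2.12",
    node_type_id = "m5.xlarge"
  ),
  task = notebook_task(notebook_path = "/Repos/etl/ingest"),
  timeout_seconds = 3600,
  # Retry twice, waiting at least one minute between attempts,
  # and also retry when a run hits the timeout above.
  max_retries = 2,
  min_retry_interval_millis = 60000,
  retry_on_timeout = TRUE
)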
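
A sketch of a two-task dependency graph in which both tasks reuse a shared job cluster. The job_clusters wiring belongs to db_jobs_create() and is only shown schematically in the trailing comment; its exact argument shape may differ:

extract <- job_task(
  task_key = "extract",
  job_cluster_key = "shared_cluster",
  task = notebook_task(notebook_path = "/jobs/extract")
)

transform <- job_task(
  task_key = "transform",
  # Runs only after "extract" completes successfully.
  depends_on = c("extract"),
  job_cluster_key = "shared_cluster",
  task = notebook_task(notebook_path = "/jobs/transform")
)

# Schematic only: pass both tasks, plus a job cluster definition keyed
# "shared_cluster", to db_jobs_create(); see that function's documentation
# for the exact tasks/job_clusters format.
# db_jobs_create(name = "etl", tasks = ..., job_clusters = ...)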