Job Task
job_task(
  task_key,
  description = NULL,
  depends_on = c(),
  existing_cluster_id = NULL,
  new_cluster = NULL,
  job_cluster_key = NULL,
  task,
  libraries = NULL,
  email_notifications = NULL,
  timeout_seconds = NULL,
  max_retries = 0,
  min_retry_interval_millis = 0,
  retry_on_timeout = FALSE
)
task_key: A unique name for the task. This field is used to refer to this task from other tasks. It is required and must be unique within its parent job. On db_jobs_update() or db_jobs_reset(), this field is used to reference the tasks to be updated or reset. The maximum length is 100 characters.
description: An optional description for this task. The maximum length is 4096 bytes.
depends_on: Vector of task_key values specifying the dependency graph of the task. All task_key values listed in this field must complete successfully before this task executes. This field is required when a job consists of more than one task. See the dependency-graph sketch under Examples below.
existing_cluster_id: ID of an existing cluster that is used for all runs of this task.
new_cluster: Instance of new_cluster().
job_cluster_key: The task is executed reusing the cluster specified in db_jobs_create() via the job_clusters parameter.
task: One of notebook_task(), spark_jar_task(), spark_python_task(), spark_submit_task(), pipeline_task(), python_wheel_task().
libraries: Instance of libraries().
email_notifications: Instance of email_notifications().
timeout_seconds: An optional timeout applied to each run of this task. The default behavior is to have no timeout.
max_retries: An optional maximum number of times to retry an unsuccessful run. A run is considered unsuccessful if it completes with the FAILED result_state or the INTERNAL_ERROR life_cycle_state. The value -1 means retry indefinitely and the value 0 means never retry. The default behavior is to never retry.
min_retry_interval_millis: Optional minimal interval in milliseconds between the start of the failed run and the subsequent retry run. The default behavior is that unsuccessful runs are immediately retried.
retry_on_timeout: Optional policy to specify whether to retry a task when it times out. The default behavior is to not retry on timeout.
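
Examples

A minimal sketch of a single task definition, assuming notebook_task() and new_cluster() accept the arguments shown; check each helper's own documentation for exact signatures, and treat the paths and cluster settings here as placeholders:

ingest <- job_task(
  task_key = "ingest",
  description = "Load raw data",
  # Hypothetical cluster spec; new_cluster()'s required fields may differ.
  new_cluster = new_cluster(
    num_workers = 2,
    spark_version = "13.3.x-scala2.12",
    node_type_id = "m5.xlarge"
  ),
  task = notebook_task(notebook_path = "/Repos/etl/ingest"),
  timeout_seconds = 3600,
  # Retry twice, waiting at least one minute between attempts,
  # and also retry when a run hits the timeout above.
  max_retries = 2,
  min_retry_interval_millis = 60000,
  retry_on_timeout = TRUE
)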
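
A sketch of a two-task dependency graph in which both tasks reuse a shared job cluster. The job_clusters wiring belongs to db_jobs_create() and is only shown schematically in the trailing comment; its exact argument shape may differ:

extract <- job_task(
  task_key = "extract",
  job_cluster_key = "shared_cluster",
  task = notebook_task(notebook_path = "/jobs/extract")
)

transform <- job_task(
  task_key = "transform",
  # Runs only after "extract" completes successfully.
  depends_on = c("extract"),
  job_cluster_key = "shared_cluster",
  task = notebook_task(notebook_path = "/jobs/transform")
)

# Schematic only: pass both tasks, plus a job cluster definition keyed
# "shared_cluster", to db_jobs_create(); see that function's documentation
# for the exact tasks/job_clusters format.
# db_jobs_create(name = "etl", tasks = ..., job_clusters = ...)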