This documentation is for an older version (1.4.7) of Dagster.

Executors#

Executors are responsible for executing steps within a job run. Once a run has launched and the process for the run (the run worker) is allocated and started, the executor assumes responsibility for execution.

Executors range from single-process serial executors to executors that manage per-step computational resources through a sophisticated control plane.


Relevant APIs#

| Name | Description |
| --- | --- |
| @executor | The decorator used to define executors. Defines an ExecutorDefinition. |
| ExecutorDefinition | An executor definition. |

Specifying executors#

Directly on jobs#

Every job has an executor. The default executor is the multi_or_in_process_executor, which by default executes each step in its own process. This executor can be configured to execute each step within the same process.
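As a minimal sketch of that configuration, the run config below selects the in-process mode of the default executor. The `in_process` key under `execution.config` follows the default executor's config schema. The variable name `in_process_run_config` is illustrative and not part of Dagster.

```python
# Run config that switches the default multi_or_in_process_executor
# so that every step runs in the run worker's own process.
# NOTE: the variable name is illustrative; only the nested keys matter.
in_process_run_config = {
    "execution": {
        "config": {
            "in_process": {},  # empty mapping: no extra in-process options set
        }
    }
}
```

A dictionary like this could be passed as the `config` argument of `@job` or supplied as run config when the run is launched.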

An executor can be specified directly on a job by supplying an ExecutorDefinition to the executor_def parameter of @job or GraphDefinition.to_job:

from dagster import graph, job, multiprocess_executor


# Providing an executor using the job decorator
@job(executor_def=multiprocess_executor)
def the_job():
    ...


@graph
def the_graph():
    ...


# Providing an executor using graph_def.to_job(...)
other_job = the_graph.to_job(executor_def=multiprocess_executor)

For a code location#

To specify a default executor for all jobs and assets provided to a code location, supply the executor argument to the Definitions object.

If a job explicitly specifies an executor, then that executor will be used. Otherwise, jobs that don't specify an executor will use the default provided to the code location:

from dagster import Definitions, asset, define_asset_job, job, multiprocess_executor


@asset
def the_asset():
    pass


asset_job = define_asset_job("the_job", selection="*")


@job
def op_job():
    ...


# op_job and asset_job will both use the multiprocess_executor,
# since neither defines its own executor.

defs = Definitions(
    assets=[the_asset], jobs=[asset_job, op_job], executor=multiprocess_executor
)

Note: Executing a job via JobDefinition.execute_in_process overrides the job's executor and uses in_process_executor instead.


Example executors#

| Name | Description |
| --- | --- |
| in_process_executor | Executes the execution plan serially within the run worker itself. |
| multiprocess_executor | Executes each step within its own spawned process. Has a configurable level of parallelism. |
| dask_executor | Executes each step within a Dask task. |
| celery_executor | Executes each step within a Celery task. |
| docker_executor | Executes each step within a Docker container. |
| k8s_job_executor | Executes each step within an ephemeral Kubernetes pod. |
| celery_k8s_job_executor | Executes each step within an ephemeral Kubernetes pod, using Celery as a control plane for prioritization and queuing. |
| celery_docker_executor | Executes each step within a Docker container, using Celery as a control plane for prioritization and queuing. |
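As an illustration of the configurable parallelism of `multiprocess_executor`, the sketch below builds run config that sets `max_concurrent`. The helper `multiprocess_run_config` is hypothetical and not part of Dagster. The shape shown assumes the job sets `executor_def=multiprocess_executor` directly; with the default executor, the same setting would sit under an additional `multiprocess` key.

```python
def multiprocess_run_config(max_concurrent: int) -> dict:
    """Hypothetical helper: build run config that sets max_concurrent."""
    # Assumes the job was defined with executor_def=multiprocess_executor,
    # so executor options sit directly under execution.config.
    return {"execution": {"config": {"max_concurrent": max_concurrent}}}


# Allow at most four step processes to run at the same time.
run_config = multiprocess_run_config(4)
```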

Custom executors#

The executor system is pluggable, meaning it's possible to write your own executor to target a different execution substrate. Note that writing custom executors is not currently well documented, and the internal APIs remain in flux.