If your instance is using the K8sRunLauncher, you can apply custom Kubernetes configuration to every run launched by Dagster by setting the k8sRunLauncher.runK8sConfig dictionary in the Helm chart.
k8sRunLauncher.runK8sConfig is a dictionary with the following keys: containerConfig, podTemplateSpecMetadata, podSpecConfig, jobSpecConfig, and jobMetadata.
The value for each of these keys is a dictionary with the YAML configuration for the underlying Kubernetes object. The Kubernetes object fields can be configured using either snake case (for example, volume_mounts) or camel case (volumeMounts). For example:
runLauncher:
  type: K8sRunLauncher
  config:
    k8sRunLauncher:
      runK8sConfig:
        containerConfig: # raw config for the pod's main container
          resources:
            limits:
              cpu: 100m
              memory: 128Mi
        podTemplateSpecMetadata: # raw config for the pod's metadata
          annotations:
            mykey: myvalue
        podSpecConfig: # raw config for the spec of the launched pod
          nodeSelector:
            disktype: ssd
        jobSpecConfig: # raw config for the kubernetes job's spec
          ttlSecondsAfterFinished: 7200
        jobMetadata: # raw config for the kubernetes job's metadata
          annotations:
            mykey: myvalue
If your Dagster job is configured with the k8s_job_executor that runs each step in its own pod, configuration that you set in runK8sConfig will also be propagated to the pods that are created for each step, unless that step's configuration is overridden using one of the methods below.
If your instance is using the K8sRunLauncher or CeleryK8sRunLauncher, you can use the dagster-k8s/config tag on a Dagster job to pass custom configuration to the Kubernetes Jobs and Pods created by Dagster for that job.
dagster-k8s/config is a dictionary with the following keys: container_config, pod_template_spec_metadata, pod_spec_config, job_spec_config, and job_metadata.
The value for each of these keys is a dictionary with the YAML configuration for the underlying Kubernetes object. The Kubernetes object fields can be configured using either snake case (for example, volume_mounts) or camel case (volumeMounts). For example:
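As a minimal sketch, a job-level tag dictionary might look like the following. The resource values and the name JOB_K8S_TAGS are hypothetical, and the snake-case key names are assumed to mirror the Helm chart keys shown earlier.

```python
# Hypothetical values; only the overall shape of the dictionary matters here.
JOB_K8S_TAGS = {
    "dagster-k8s/config": {
        # Raw config for the pod's main container.
        "container_config": {
            "resources": {
                "requests": {"cpu": "250m", "memory": "64Mi"},
                "limits": {"cpu": "500m", "memory": "2560Mi"},
            },
        },
        # Raw config for the pod's metadata.
        "pod_template_spec_metadata": {
            "annotations": {"mykey": "myvalue"},
        },
    }
}

# With Dagster installed, the dictionary is attached to a job through its tags:
#
#   from dagster import job
#
#   @job(tags=JOB_K8S_TAGS)
#   def my_job():
#       ...
```

Keeping the dictionary in a named constant makes it easy to reuse the same configuration across several jobs.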
Other run launchers will ignore the dagster-k8s/config tag.
If your Dagster job is configured with the k8s_job_executor that runs each step in its own pod, configuration that you set on a job using the dagster-k8s/config tag will not be propagated to any of those step pods.
If your Dagster job is configured with the k8s_job_executor that runs each step in its own pod, you can use the step_k8s_config field on the executor to control the Kubernetes configuration for every step pod.
step_k8s_config is a dictionary with the following keys: container_config, pod_template_spec_metadata, pod_spec_config, job_spec_config, and job_metadata.
The value for each of these keys is a dictionary with the YAML configuration for the underlying Kubernetes object. The Kubernetes object fields can be configured using either snake case (for example, volume_mounts) or camel case (volumeMounts). For example:
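A sketch of an executor configuration that sets step_k8s_config is shown below. The resource requests and the name STEP_EXECUTOR_CONFIG are hypothetical; the commented usage assumes the dagster and dagster-k8s packages are installed.

```python
# Hypothetical values; every step pod launched by the executor would
# receive these container resource requests.
STEP_EXECUTOR_CONFIG = {
    "step_k8s_config": {
        "container_config": {
            "resources": {
                "requests": {"cpu": "200m", "memory": "32Mi"},
            },
        },
    }
}

# With Dagster installed, the config is applied to the executor and the
# configured executor is then set on the job:
#
#   from dagster import job
#   from dagster_k8s import k8s_job_executor
#
#   my_k8s_executor = k8s_job_executor.configured(STEP_EXECUTOR_CONFIG)
#
#   @job(executor_def=my_k8s_executor)
#   def my_job():
#       ...
```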
Kubernetes configuration on individual steps in a run
If your Dagster job is configured with the k8s_job_executor or celery_k8s_job_executor that run each step in its own Kubernetes pod, you can use the dagster-k8s/config tag on a Dagster op to control the Kubernetes configuration for that specific op.
As above when used on jobs, dagster-k8s/config is a dictionary with the following keys: container_config, pod_template_spec_metadata, pod_spec_config, job_spec_config, and job_metadata.
The value for each of these keys is a dictionary with the YAML configuration for the underlying Kubernetes object. The Kubernetes object fields can be configured using either snake case (for example, volume_mounts) or camel case (volumeMounts). For example:
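As a sketch, an op that needs more memory than the rest of the job could carry its own tag. The values and the name OP_K8S_TAGS are hypothetical.

```python
# Hypothetical values; only the step pod for the tagged op is affected.
OP_K8S_TAGS = {
    "dagster-k8s/config": {
        "container_config": {
            "resources": {
                "requests": {"cpu": "200m", "memory": "32Mi"},
                "limits": {"cpu": "1", "memory": "4Gi"},
            },
        },
    }
}

# With Dagster installed, the dictionary is attached to the op through its tags:
#
#   from dagster import op
#
#   @op(tags=OP_K8S_TAGS)
#   def memory_hungry_op():
#       ...
```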
If a Kubernetes configuration dictionary (like container_config) is specified at both the instance level in the Helm chart and in a specific Dagster job or op, the dictionaries will be shallowly merged. The more specific configuration takes precedence if the same key is set in both dictionaries.
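The shallow-merge rule can be illustrated in plain Python. This is only a sketch of the semantics described above, not Dagster's implementation; the function name shallow_merge and all values are hypothetical.

```python
def shallow_merge(base: dict, override: dict) -> dict:
    """Merge top-level keys only; on a conflict the override value wins."""
    return {**base, **override}


# Instance-level configuration (hypothetical values).
helm_pod_spec = {
    "node_selector": {"disktype": "ssd"},
    "dns_policy": "ClusterFirst",
}

# More specific, job-level configuration (hypothetical values).
job_pod_spec = {
    "node_selector": {"region": "east"},
}

merged = shallow_merge(helm_pod_spec, job_pod_spec)
print(merged)
```

Because the merge is shallow, the job-level node_selector replaces the instance-level one entirely rather than being combined with it: the disktype entry is dropped, while dns_policy, which the job does not set, is kept.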
Consider the following example:
In the Helm chart, k8sRunLauncher.runK8sConfig.podSpecConfig is set to:
Supplying .Values.postgresql.postgresqlPassword will create a Kubernetes Secret with key postgresql-password, containing the encoded password. This secret is used to supply the Dagster infrastructure with an environment variable that's used when creating the storages for the Dagster instance.
If you use a secrets manager like Vault, it may be convenient to manage this Secret outside of the Dagster Helm chart. In this case, the generation of this Secret within the chart should be disabled, and .Values.global.postgresqlSecretName should be set to the name of the externally managed Secret.
You will likely want to bind a ServiceAccount to a properly scoped Role so that it has permission to launch Jobs and create other Kubernetes resources.
You will likely want to use Secrets to manage sensitive information such as database logins.
Separately deploying Dagster infrastructure and user code
It may be desirable to manage two Helm releases for your Dagster deployment: one release for the Dagster infrastructure, which consists of the Dagster webserver and the Dagster daemon, and another release for your User Code, which contains the definitions of your pipelines written in Dagster. This way, changes to User Code can be decoupled from upgrades to core Dagster infrastructure.
$ helm search repo dagster
NAME                              CHART VERSION  APP VERSION  DESCRIPTION
dagster/dagster                   0.11.0         0.11.0       Dagster is a system for building modern data ap...
dagster/dagster-user-deployments  0.11.0         0.11.0       A Helm subchart to deploy Dagster User Code dep...
To manage these separate deployments, we first need to isolate Dagster infrastructure to its own deployment. This can be done by disabling the subchart that deploys the User Code in the dagster chart. This will prevent the dagster chart from creating the services and deployments related to User Code, as these will be managed in a separate release.
dagster-user-deployments:
  enableSubchart: false
Next, the workspace for the webserver must be configured with the future hosts and ports of the services exposing access to the User Code.
Finally, the dagster-user-deployments subchart can now be managed in its own release. The list of possible overrides for the subchart can be found in its values.yaml.
If you use a Kubernetes distribution that supports the TTL Controller, Completed and Failed Jobs (and their associated Pods) will be deleted after 1 day. The TTL value can be modified in your job tags:
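A minimal sketch of such a tag follows. The helper ttl_tags and the two-hour value are hypothetical; the ttl_seconds_after_finished key is assumed to be the snake-case form of the ttlSecondsAfterFinished Job spec field.

```python
def ttl_tags(seconds: int) -> dict:
    """Build job tags that set the Kubernetes Job TTL (hypothetical helper)."""
    if seconds < 0:
        raise ValueError("TTL must be non-negative")
    return {
        "dagster-k8s/config": {
            "job_spec_config": {"ttl_seconds_after_finished": seconds},
        }
    }


# Delete finished Jobs after two hours instead of the default.
TWO_HOUR_TTL_TAGS = ttl_tags(2 * 60 * 60)

# With Dagster installed, the tags are attached to a job:
#
#   from dagster import job
#
#   @job(tags=TWO_HOUR_TTL_TAGS)
#   def my_job():
#       ...
```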