This documentation is for an older version (1.4.7) of Dagster.

Job-Level Versioning and Memoization (Deprecated)

Dagster provides functionality for job-level code versioning and for memoizing previous op outputs based on that versioning.

This functionality is deprecated in favor of asset versioning.

Versioning

class dagster.VersionStrategy[source]

Abstract class for defining a strategy to version ops and resources.

When subclassing, get_op_version must be implemented, and get_resource_version can be optionally implemented.

get_op_version should accept an OpVersionContext, and get_resource_version should accept a ResourceVersionContext. From its context, each method synthesizes a unique string called a version, which is tagged to the outputs of that op in the job. Providing a VersionStrategy instance to a job enables memoization on that job, so that only steps whose outputs do not have an up-to-date version will run.

abstract get_op_version(context)[source]

Computes a version for an op.

Parameters:

context (OpVersionContext) – The context for computing the version.

Returns:

The version for the op.

Return type:

str

get_resource_version(context)[source]

Computes a version for a resource.

Parameters:

context (ResourceVersionContext) – The context for computing the version.

Returns:

The version for the resource. If None, the resource will not be memoized.

Return type:

Optional[str]
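The rule that only steps without an up-to-date version will run can be illustrated with a minimal sketch. It does not import Dagster: steps_to_run and both dictionaries are hypothetical names used only for illustration, and the comparison is deliberately simplified. Dagster's actual resolution may take more into account than a single version string per step.

```python
from typing import Dict, List


def steps_to_run(
    current_versions: Dict[str, str], stored_versions: Dict[str, str]
) -> List[str]:
    """Return the steps whose stored output version is missing or stale.

    Hypothetical helper: it models the memoization decision only,
    not how Dagster computes or stores versions.
    """
    return [
        step
        for step, version in current_versions.items()
        if stored_versions.get(step) != version
    ]


# Versions computed for the current code and config (illustrative values).
current = {"extract": "v2", "transform": "v1", "load": "v1"}
# Versions attached to outputs that already exist in storage.
stored = {"extract": "v1", "transform": "v1"}

to_run = steps_to_run(current, stored)
print(to_run)  # ['extract', 'load']
```

Here extract is re-run because its version changed, load is run because it has no stored output, and transform is skipped.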

class dagster.SourceHashVersionStrategy[source]

VersionStrategy that checks for changes to the source code of ops and resources.

Only checks for changes within the immediate body of the op/resource’s decorated function (or compute function, if the op/resource was constructed directly from a definition).

get_op_version(context)[source]

Computes a version for an op by hashing its source code.

Parameters:

context (OpVersionContext) – The context for computing the version.

Returns:

The version for the op.

Return type:

str

get_resource_version(context)[source]

Computes a version for a resource by hashing its source code.

Parameters:

context (ResourceVersionContext) – The context for computing the version.

Returns:

The version for the resource. If None, the resource will not be memoized.

Return type:

Optional[str]
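The idea of a source-hash version, and its limitation to the immediate function body, can be sketched as follows. This is not Dagster's implementation: source_hash is a hypothetical helper, SHA-1 is an arbitrary choice of digest, and the source text is passed in as plain strings so the sketch runs anywhere. For a function defined in a file, inspect.getsource from the standard library is one way to obtain such text.

```python
import hashlib


def source_hash(source: str) -> str:
    """Hash a function's source text into a version string (hypothetical helper)."""
    return hashlib.sha1(source.encode("utf-8")).hexdigest()


# Two revisions of the same op body: the version changes with the body.
op_v1 = "def add_one(x):\n    return x + 1\n"
op_v2 = "def add_one(x):\n    return x + 2\n"
body_changed = source_hash(op_v1) != source_hash(op_v2)

# An op that delegates to a helper: only the op's own text is hashed,
# so editing the helper leaves the op's version unchanged.
op_with_helper = "def my_op():\n    return helper() + 1\n"
helper_v1 = "def helper():\n    return 1\n"
helper_v2 = "def helper():\n    return 2\n"
version_before_helper_edit = source_hash(op_with_helper)
version_after_helper_edit = source_hash(op_with_helper)
helper_edit_detected = version_before_helper_edit != version_after_helper_edit

print(body_changed)          # True
print(helper_edit_detected)  # False
```

The second case shows why a change in code called from an op does not invalidate memoized outputs under a strategy that hashes only the op's own body.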

class dagster.OpVersionContext(op_def, op_config)[source]

Provides execution-time information for computing the version for an op.

op_def

The definition of the op to compute a version for.

Type:

OpDefinition

op_config

The parsed config to be passed to the op during execution.

Type:

Any
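As a rough illustration of how the two attributes might feed a version, the sketch below combines a hand-maintained code version with a hash of the op config. It does not import Dagster: StubOpDef and StubOpVersionContext are stand-ins that mirror only the op_def and op_config attributes described above, and CODE_VERSIONS, the name attribute, and the version format are assumptions made for this example.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class StubOpDef:
    """Stand-in for an op definition; only a name is modeled (assumption)."""
    name: str


@dataclass(frozen=True)
class StubOpVersionContext:
    """Stand-in mirroring the op_def and op_config attributes."""
    op_def: StubOpDef
    op_config: Any


# Hypothetical, hand-maintained code versions keyed by op name.
CODE_VERSIONS = {"train_model": "3"}


def get_op_version(context: StubOpVersionContext) -> str:
    """Build a version from the op name, its code version, and its config."""
    # sort_keys makes the hash independent of dictionary key order.
    config_json = json.dumps(context.op_config, sort_keys=True, default=str)
    config_hash = hashlib.sha1(config_json.encode("utf-8")).hexdigest()[:8]
    code_version = CODE_VERSIONS.get(context.op_def.name, "0")
    return f"{context.op_def.name}:{code_version}:{config_hash}"


op_def = StubOpDef("train_model")
version_a = get_op_version(StubOpVersionContext(op_def, {"lr": 0.1, "epochs": 5}))
version_b = get_op_version(StubOpVersionContext(op_def, {"epochs": 5, "lr": 0.1}))
version_c = get_op_version(StubOpVersionContext(op_def, {"lr": 0.1, "epochs": 6}))

print(version_a == version_b)  # True: same config, different key order
print(version_a == version_c)  # False: config changed
```

Including the config in the version means that a run with different config is not served a memoized output produced under the old config.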

class dagster.ResourceVersionContext(resource_def, resource_config)[source]

Provides execution-time information for computing the version for a resource.

resource_def

The definition of the resource whose version will be computed.

Type:

ResourceDefinition

resource_config

The parsed config to be passed to the resource during execution.

Type:

Any
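The optional return value of a resource version can be sketched in the same spirit. The stand-in below mirrors only the resource_def and resource_config attributes; the rule that a resource without config returns None (and is therefore not memoized) is a hypothetical policy chosen for this example, not Dagster behavior.

```python
import hashlib
import json
from typing import Any, NamedTuple, Optional


class StubResourceVersionContext(NamedTuple):
    """Stand-in mirroring the resource_def and resource_config attributes."""
    resource_def: Any
    resource_config: Any


def get_resource_version(context: StubResourceVersionContext) -> Optional[str]:
    """Return a config-derived version, or None to opt out of memoization."""
    if context.resource_config is None:
        # Hypothetical policy: no config means no meaningful version.
        return None
    payload = json.dumps(context.resource_config, sort_keys=True, default=str)
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


unversioned = get_resource_version(StubResourceVersionContext(object(), None))
versioned = get_resource_version(
    StubResourceVersionContext(object(), {"bucket": "example-bucket"})
)

print(unversioned)        # None
print(len(versioned))     # 40
```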

Memoization

class dagster.MemoizableIOManager[source]

Base class for IO managers that work with memoized execution. Users should implement the load_input and handle_output methods described in the IOManager API, as well as the has_output method, which returns a boolean indicating whether a data object can be found.

abstract has_output(context)[source]

User-defined method that returns whether data exists for the given context.

Parameters:

context (OutputContext) – The context of the step performing this check.

Returns:

True if there is data present that matches the provided context. False otherwise.

Return type:

bool

See also: dagster.IOManager.
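One way to think about has_output is as an existence check keyed by the output's version. The sketch below is a standalone stand-in, not a MemoizableIOManager subclass: VersionedPickleStore and StubOutputContext are hypothetical, and the step_key, name, and version attributes on the stub are assumptions about what the context exposes. The store writes each output to a path that includes the version, so a changed version is reported as missing.

```python
import os
import pickle
import tempfile
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class StubOutputContext:
    """Stand-in for an output context (attribute names are assumptions)."""
    step_key: str
    name: str
    version: str


class VersionedPickleStore:
    """Stores each output under <base_dir>/<step_key>/<name>/<version>."""

    def __init__(self, base_dir: str) -> None:
        self.base_dir = base_dir

    def _path(self, context: StubOutputContext) -> str:
        return os.path.join(
            self.base_dir, context.step_key, context.name, context.version
        )

    def handle_output(self, context: StubOutputContext, obj: Any) -> None:
        path = self._path(context)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            pickle.dump(obj, f)

    def has_output(self, context: StubOutputContext) -> bool:
        # True only if data was stored for this exact version.
        return os.path.exists(self._path(context))

    def load_output(self, context: StubOutputContext) -> Any:
        with open(self._path(context), "rb") as f:
            return pickle.load(f)


store = VersionedPickleStore(tempfile.mkdtemp())
ctx_v1 = StubOutputContext("transform", "result", "abc123")
ctx_v2 = StubOutputContext("transform", "result", "def456")

store.handle_output(ctx_v1, [1, 2, 3])
print(store.has_output(ctx_v1))  # True
print(store.has_output(ctx_v2))  # False
```

Because the version is part of the storage path, outputs written under an older version remain on disk but no longer satisfy the check for the new version.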

dagster.MEMOIZED_RUN_TAG

Provide this tag to a run to toggle memoization on or off. {MEMOIZED_RUN_TAG: "true"} toggles memoization on, while {MEMOIZED_RUN_TAG: "false"} toggles memoization off.
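The tag dictionaries described above can be built with a small helper. In this sketch the tag key is a placeholder string so the code runs without Dagster installed; in real code the key would be the imported dagster.MEMOIZED_RUN_TAG constant. The helper name memoization_tags is hypothetical.

```python
from typing import Dict

# Placeholder value for illustration only; real code would import
# MEMOIZED_RUN_TAG from dagster instead of defining it here.
MEMOIZED_RUN_TAG = "memoized-run-tag-placeholder"


def memoization_tags(enabled: bool) -> Dict[str, str]:
    """Return run tags that toggle memoization on or off."""
    return {MEMOIZED_RUN_TAG: "true" if enabled else "false"}


tags_on = memoization_tags(True)
tags_off = memoization_tags(False)
print(tags_on[MEMOIZED_RUN_TAG])   # true
print(tags_off[MEMOIZED_RUN_TAG])  # false
```

Note that the tag values are the strings "true" and "false", not Python booleans.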