Users shall be able to update existing jobs #3871
Labels: th/production-readiness (Get ready for production workloads), type/epic (Type: A higher level set of issues)
The Problem
With the introduction of long-running jobs, users shall be able to update the specs of active jobs. The orchestrator shall deploy updated jobs in place if possible, or otherwise select new compute nodes. Today, users must cancel previous jobs and submit new ones whenever they want to update their jobs, which is inefficient, introduces gaps in execution, and discards the versioning and history of previous job instances.
Updates include, but are not limited to:
Requirements
More info can be found here
Open Questions:
`bacalhau get` and `bacalhau describe` will always return the results and status of the latest job version. Users will have to pass a new `--version <int>` flag to describe or download results of previous versions.
`bacalhau stop`, on the other hand, will stop all active executions, including those from previous versions. Users will have to pass `--version <int>` to stop a specific version.
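
To make the proposed defaults easier to discuss, here is a minimal sketch of the version-selection semantics described above. It is not Bacalhau code: `Execution`, `resolve_read_version`, and `executions_to_stop` are hypothetical names, and the assumption that versions are numbered contiguously from 1 is ours, not something this issue states.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Execution:
    """Hypothetical record of one execution of a job."""
    job_version: int  # job version this execution belongs to
    active: bool      # True while the execution is still running


def resolve_read_version(latest_version: int, requested: Optional[int] = None) -> int:
    """Version that get/describe would target: the latest unless one is requested.

    Assumes versions are numbered 1..latest_version (an illustrative assumption).
    """
    if requested is None:
        return latest_version
    if not 1 <= requested <= latest_version:
        raise ValueError(f"version {requested} does not exist")
    return requested


def executions_to_stop(executions: List[Execution],
                       requested: Optional[int] = None) -> List[Execution]:
    """Executions that stop would target.

    Without a requested version, every active execution is selected,
    including those from previous versions. With one, only active
    executions of that version are selected.
    """
    return [
        e for e in executions
        if e.active and (requested is None or e.job_version == requested)
    ]
```

Under these assumptions, `resolve_read_version(3)` selects version 3 while `resolve_read_version(3, 1)` selects version 1, and `executions_to_stop(execs)` selects active executions across all versions. The asymmetry between the read commands (latest by default) and stop (all versions by default) is the point the open question raises.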