abundant-judge-84756
11/21/2024, 11:14 AM
overwrite_cache=True doesn't actually run new tasks and overwrite the cache as expected - instead Flyte tries to recover previous outputs and typically hits an OutputsNotFound error due to some mismatch between the previous and new run.
We're using FlyteRemote to programmatically create these executions as a way of relaunching previously failed executions - we do this instead of clicking 'relaunch' on the Flyte UI so that we can customise the execution name as well as relaunch executions in bulk.
Has anyone else had this issue, and is there perhaps something we're missing about our relaunching setup? I'll add a code snippet to the thread...
abundant-judge-84756
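11/21/2024, 11:15 AM
# Imports assumed here; the original snippet omitted them.
from flyteidl.admin.execution_pb2 import ExecutionCreateRequest, ExecutionSpec

# `remote` is a configured FlyteRemote; `domain`, `execution_name` and
# `execution_retry_name` are defined by the caller.
# Fetch the failed execution and reuse its inputs, labels and launch plan.
execution = remote.fetch_execution(domain=domain, name=execution_name)
execution_id = execution.id
inputs = remote.client.get_execution_data(execution_id).full_inputs.to_flyte_idl()
labels = execution.spec.labels.to_flyte_idl()
launch_plan = remote.fetch_launch_plan(domain=domain, name=execution.spec.launch_plan.name)

# Build the spec for the new execution, copying the original metadata as-is.
execution_spec = ExecutionSpec(
    launch_plan=launch_plan.id.to_flyte_idl(),
    metadata=execution.spec.metadata.to_flyte_idl(),
    labels=labels,
    overwrite_cache=True,
)

# Create the relaunched execution under a custom name.
remote.client.raw.create_execution(
    create_execution_request=ExecutionCreateRequest(
        project=execution_id.project,
        domain=execution_id.domain,
        name=execution_retry_name,
        spec=execution_spec,
        inputs=inputs,
    )
)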
high-park-82026
abundant-judge-84756
11/22/2024, 8:34 AM
high-park-82026
recover_execution
)...
cc @acceptable-policeman-57188 @high-accountant-32689
acceptable-policeman-57188
remote.client.raw.recover_execution instead: https://github.com/flyteorg/flytekit/blob/master/flytekit/clients/raw.py#L368
high-park-82026
abundant-judge-84756
11/27/2024, 8:09 AM
metadata line, and we haven't seen any unexpected recoveries since then 🤞 It's still not 100% clear if there was another reason why we might want to re-include the original execution metadata, or if we're safe to leave this out.
high-park-82026
metadata field it copies that mode (RecoveryMode) to new executions. You should probably build the metadata field yourself and not set the mode or the recovery execution ID.
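A rough sketch of the "build the metadata field yourself" advice, in case it helps anyone finding this thread later. It assumes the metadata has first been turned into a plain dict keyed by proto field names, and that mode and reference_execution are the two ExecutionMetadata fields that carry the recovery mode and the recovered execution's ID. The helper name fresh_metadata_fields and the RECOVERY_FIELDS constant are made up for illustration and are not part of flytekit.

```python
from typing import Any, Dict

# Assumption: these are the ExecutionMetadata fields that link a new execution
# to an old one ("mode" holds the execution mode, "reference_execution" holds
# the ID of the execution being recovered).
RECOVERY_FIELDS = ("mode", "reference_execution")


def fresh_metadata_fields(original: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the metadata fields with recovery-related entries dropped.

    The input dict is left unmodified.
    """
    return {key: value for key, value in original.items() if key not in RECOVERY_FIELDS}


# Hypothetical usage with protobuf's JSON helpers (not executed here):
#
#   from google.protobuf.json_format import MessageToDict, ParseDict
#   from flyteidl.admin.execution_pb2 import ExecutionMetadata
#
#   fields = MessageToDict(
#       execution.spec.metadata.to_flyte_idl(),
#       preserving_proto_field_name=True,
#   )
#   metadata = ParseDict(fresh_metadata_fields(fields), ExecutionMetadata())
#
# Leaving "mode" unset means the new message falls back to the enum's default
# value rather than inheriting the mode of the original execution.
```

The resulting metadata object would then be passed as metadata= to ExecutionSpec in place of the copied original. Whether any of the other copied fields matter for your setup is something to check against your own executions.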