Hi there! Has anyone encountered issues with <ignore-retry-cause=true>? We set it in platform config...
s
Hi there! Has anyone encountered issues with ignore-retry-cause=true? We set it in platform config, running flyte-core 1.12.0, and our user errors don't result in retries when retries are set in the task decorator.
More info: Settings in flyte-propeller-config
core.yaml:
----
propeller:
  node-config:
    ignore-retry-cause: true
    interruptible-failure-threshold: 1
Task decorator:
@task(
    cache=False,
    interruptible=True,
    retries=3,
)
Flytekit version: 1.12.3
Expected behavior: Flyte retries 3 times even on a user error.
Actual behavior: Flyte fails on the 1st attempt (user error).
c
It's not super clear but I believe user retries are cooperative and require the task code to use raise FlyteRecoverableException: https://docs.flyte.org/en/latest/user_guide/flyte_fundamentals/optimizing_tasks.html#configuring-retries
Otherwise task code errors are considered terminal. ignore-retry-cause is more about user vs system failures and how they count towards a single retry budget.
s
Thanks @clean-glass-36808! You're right - I misread https://github.com/flyteorg/flyte/pull/5128 and thought it was doing the reverse: changing the behavior so that any RuntimeError, not just FlyteRecoverableException, would trigger a user error retry. We'll change our exception to FlyteRecoverableException. Appreciate your help!
c
thanks, @clean-glass-36808!