# ask-the-community
g
Hi all, I’m seeing retries (3 and counting) on a workflow that’s throwing `RuntimeError`. According to the docs, I thought this would be considered non-recoverable and thus not retried? I’ll put the stack trace in the thread.
Traceback (most recent call last):

      File "/app/pip_pypi__flytekit/flytekit/exceptions/scopes.py", line 203, in user_entry_point
        return wrapped(*args, **kwargs)
      File "/app/atomwise/...", line 266, in run_block
        block.validate_output(cmpds, result)
      File "/app/atomwise/...", line 119, in validate_output
        raise RuntimeError(

Message:

    <blah>

User error.
k
seems like a bug. mind creating an issue here? [flyte-bug]
g
d
@Kevin Su are you sure this is a bug? Propeller handles retryable failures by designating them as either `USER` or `SYSTEM` errors, where `SYSTEM` errors are retryable. Do we know how the `RuntimeError` is handled here? My intuition is that Flyte is more conservative about labeling a failure as non-retryable and wouldn't map all Python `RuntimeError` instances to `USER` errors, which would make them non-retryable.
k
if a task throws a `RuntimeError`, flytekit will convert it to `FlyteScopedUserException` (User error) here. it’s retryable right now. cc @Yee
d
So after diving through the code, I was wrong: the `USER` or `SYSTEM` identifier only has to do with the number of times a failure is retried (which can be configured to mean one never retries). In this scenario it seems that flytekit will designate the specified error as retryable unless the user specifically throws a non-recoverable exception. I imagine there have been many discussions about what the preferred behavior is here. Regardless, from the flytekit side, if you just raise a non-recoverable error it should have the intended behavior.
TL;DR @Greg Friedland, in the issue it seems that you're manually throwing a runtime exception. Is there any reason you could not just manually designate a Flyte non-recoverable exception instead?
k
yup, flytekit marks `RuntimeError` as retryable. To work around it, you could catch that error and throw a Flyte non-recoverable exception. Just discussed with Yee: all the non-Flyte errors raised from user code should be non-recoverable. we’ll fix it on the flytekit side.
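A minimal sketch of the catch-and-rethrow workaround suggested above, in plain Python so it runs without flytekit. `NonRecoverableError` is a hypothetical stand-in for whichever Flyte exception type is treated as non-recoverable (the thread does not name one), and `run_with_retries` is a toy loop that only imitates the retry behavior described here; it is not Propeller's or flytekit's actual logic.

```python
class NonRecoverableError(Exception):
    """Hypothetical stand-in for a Flyte non-recoverable exception type."""


def validate_output(result):
    # Mirrors the user code in the traceback: a plain RuntimeError on bad output.
    if result is None:
        raise RuntimeError("output failed validation")
    return result


def run_block(result):
    # The workaround: catch the generic error and re-raise it as the
    # non-recoverable type so the failure is not retried.
    try:
        return validate_output(result)
    except RuntimeError as err:
        raise NonRecoverableError(str(err)) from err


def run_with_retries(fn, arg, retries):
    """Toy retry loop (an assumption, not Flyte code).

    Retries any exception except NonRecoverableError and returns a
    (status, attempts) pair.
    """
    attempts = 0
    while attempts <= retries:
        attempts += 1
        try:
            fn(arg)
            return "succeeded", attempts
        except NonRecoverableError:
            return "failed (non-recoverable)", attempts
        except Exception:
            continue  # treated as retryable
    return "failed (retries exhausted)", attempts


print(run_with_retries(validate_output, None, 3))  # plain RuntimeError is retried
print(run_with_retries(run_block, None, 3))        # wrapped error stops at once
```

In this toy model the unwrapped `RuntimeError` uses up the initial attempt plus all three retries, while the wrapped version fails on the first attempt. Whether a real Flyte deployment behaves the same way depends on the exception type that is substituted for the stand-in.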
g
I tried throwing `FlyteUserException` and got the same behavior. Is there another exception type you suggest?
y
you are trying to get things to not re-run? i’m not able to repro this actually.
from flytekit import task


@task(retries=10)
def say_hello_error(a: int) -> str:
    if a == 1:
        print("Raising Error")
        raise RuntimeError("my runtime error")

    return "hi"
this task runs once for me.
g
You’re right, I wasn’t able to reproduce it with a plain task. When I nest dynamic tasks, however, I do see it. I added a minimal repro to the bug ticket.