hallowed-wall-36631
07/29/2024, 10:23 PM
FlyteUserException raised in ContainerTask back to Flyte workflow. I have no problem propagating FlyteUserException from decorated tasks (@task) though. Details in 🧵
hallowed-wall-36631
07/29/2024, 10:24 PM
(@task):
• Flyte-workflow ↔️ Python-Task
◦ raising FlyteUserException in Python-Task correctly propagates the exception to the workflow.
Raw-Containers (ContainerTask):
• Flyte-workflow ↔️ Raw-Container (ContainerTask)
◦ raising FlyteUserException in Raw-Container does not propagate the exception to the workflow.
◦ the logs do have exceptions raised though.
In ContainerTask, it only results in a non-zero exit code returned by the container. The nuance of the Python exception FlyteUserException is lost between the container and the task running the container, which is not surprising, as Docker containers are not expected to propagate Python exceptions. To get around this limitation, we were thinking of inspecting the container logs and then raising the same FlyteUserException again, to be caught by the workflow. For that, we overrode ContainerTask.execute() [link] based on the documentation. But what we observed is that the overridden execute method is not invoked at all.
Questions:
• For our use case of passing FlyteUserException from a ContainerTask to a workflow, do you think overriding execute is the way to go? Or is there a better approach?
• If overriding execute is a good approach, why is our code not working? The sample code below captures what we are attempting.
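import docker
from flytekit import ContainerTask
from flytekit.exceptions.base import FlyteException
from flytekit.models.literals import LiteralMap

class CustomContainerTask(ContainerTask):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def execute(self, **kwargs) -> LiteralMap:
        client = docker.from_env()
        # With the default detach=False, run() returns the container output as bytes.
        logs = client.containers.run(self.image, self.command, stderr=True).decode()
        if "FlyteUserException" in logs:
            raise FlyteException("Exception in execute()")
        return logs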
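The log-inspection idea described above can be tried outside Flyte with only the standard library. The sketch below is an illustration, not flytekit code: UserError is a hypothetical stand-in for FlyteUserException, run_and_reraise is a made-up helper name, and a child Python process plays the role of the container. It shows that the parent only sees an exit code and output streams, and has to scan stderr to rebuild an exception.

```python
import subprocess
import sys


class UserError(Exception):
    """Hypothetical stand-in for FlyteUserException."""


def run_and_reraise(cmd, marker="FlyteUserException"):
    # Run the command like an opaque container: only the exit code and
    # the captured streams come back to the caller.
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode != 0 and marker in proc.stderr:
        # The last stderr line of a Python traceback names the exception.
        last_line = proc.stderr.strip().splitlines()[-1]
        raise UserError(last_line)
    # Any other non-zero exit surfaces as subprocess.CalledProcessError.
    proc.check_returncode()
    return proc.stdout


# Child program that fails with an exception whose name contains the marker.
CHILD = (
    "class FlyteUserException(Exception): pass\n"
    "raise FlyteUserException('bad input')"
)

if __name__ == "__main__":
    try:
        run_and_reraise([sys.executable, "-c", CHILD])
    except UserError as err:
        print("recovered:", err)
```

Matching on a substring of the logs is fragile (any log line mentioning the marker would trigger it), which is one reason a structured error file is the more robust design.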
high-accountant-32689
07/30/2024, 7:23 PM
@task?
hallowed-wall-36631
07/30/2024, 7:32 PM
high-accountant-32689
07/31/2024, 8:37 PM
FlyteUserException and its use inside flytekit. In order to run user code, flytekit follows an implicit protocol under the covers. If you read this code in entrypoint.py, you'll notice that we encode the outputs of a task into a LiteralMap protobuf message and write it to a specific location. This is also where exceptions are handled: we write a special protobuf file called error.pb and encode the exception in it. Unfortunately, the current implementation of raw containers (which are handled by flytecopilot) doesn't follow that protocol to handle errors. This would be a wonderful extension to raw containers and I'd be more than happy to collaborate on a PR.
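To make the shape of that protocol concrete, here is a minimal sketch, assuming JSON files in place of the real protobuf messages. The names run_with_protocol, read_outcome, outputs.json and error.json are hypothetical and chosen for illustration; they are not flytekit's actual functions or file names (the source only names error.pb).

```python
import json
import traceback
from pathlib import Path


def run_with_protocol(task_fn, inputs, output_dir):
    """Run task_fn and record either its outputs or its error in output_dir."""
    out = Path(output_dir)
    try:
        result = task_fn(**inputs)
    except Exception as exc:
        # Failure path: encode the exception in a well-known file
        # (JSON stand-in for error.pb).
        error_doc = {
            "kind": type(exc).__name__,
            "message": str(exc),
            "traceback": traceback.format_exc(),
        }
        (out / "error.json").write_text(json.dumps(error_doc))
        return False
    # Success path: encode the outputs as one map
    # (JSON stand-in for the LiteralMap message).
    (out / "outputs.json").write_text(json.dumps({"o0": result}))
    return True


def read_outcome(output_dir):
    """Reader side: check for the error file before reading outputs."""
    out = Path(output_dir)
    error_file = out / "error.json"
    if error_file.exists():
        return "error", json.loads(error_file.read_text())
    return "ok", json.loads((out / "outputs.json").read_text())
```

The point of the design is that the exception type and message survive as data, so the reader does not have to infer anything from an exit code.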
That said, have you explored Shell Tasks? Error handling in those is not great either, but they might unblock your use case more quickly.
hallowed-wall-36631
07/31/2024, 9:58 PM
hallowed-wall-36631
07/31/2024, 10:02 PM
LiteralMap - isn’t that what's already done in the ContainerTask method execute?
high-accountant-32689
08/01/2024, 2:03 PM
execute in ContainerTask is only invoked during local executions (I'll make a note to clarify that in either the docstring or in the actual method definition).
The way the literal map is composed in the case of remote invocations of raw containers is handled in https://github.com/flyteorg/flyte/blob/025296a61105bdb8f7932a7f15af8cd0aefc4a5e/flytecopilot/data/upload.go#L117-L191. Notice how we go from local files written by the container to a single LiteralMap object that's written to the meta bucket.
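As a rough illustration of that step, the sketch below folds one file per declared output into a single map. It is a Python approximation written for this thread, not a port of upload.go: collect_outputs is a hypothetical name, values are read as plain text rather than typed literals, and raising on a missing file is an assumption of the sketch.

```python
from pathlib import Path


def collect_outputs(output_dir, expected_names):
    """Fold one-file-per-output into a single map after the container exits."""
    out = Path(output_dir)
    collected = {}
    for name in expected_names:
        path = out / name
        if not path.is_file():
            # Assumption of this sketch: a missing declared output is an error.
            raise FileNotFoundError(f"container did not write output '{name}'")
        # Plain text stands in for the typed literal the real sidecar builds.
        collected[name] = path.read_text().strip()
    return collected
```

Note that nothing in this flow looks for an error file, which matches the gap described earlier in the thread: a failing container has no channel for a structured exception.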
In other words, the protocol followed by raw containers at runtime does not involve flytekit at all. We assume that the user code is going to write (protobuf) files to the output dir and the flytecopilot sidecar turns those into a single output LiteralMap after the container finishes.
high-accountant-32689
08/01/2024, 2:05 PM
hallowed-wall-36631
08/06/2024, 7:58 PM