proud-glass-36655
12/03/2024, 6:37 PM
map_task failed out with a very long error message (it hit a SQL error on a large insert), but the containing map_task (Array Node on the UI) and the overarching workflow were just stuck in a Running state until I manually went in and terminated the workflow.
I'm guessing the cause of this issue was the size of the error message, but is there something I should be looking for in the logs to confirm this? Should Flyte be handling this sort of situation better (assuming it is the issue that I'm guessing)?
proud-glass-36655
12/03/2024, 6:38 PM
flytekit==1.13.5
average-finland-92144
12/04/2024, 10:21 AM
proud-glass-36655
12/04/2024, 2:45 PM
[0]: [3/3] currentAttempt done. Last Error: SYSTEM::resource not found, name [[project-name]-production/f139292485c6b2d39000-fhg3lf3i-0-n0-3]. reason: pods "f139292485c6b2d39000-fhg3lf3i-0-n0-3" not found
which is because the underlying task failed out but Flyte wasn't aware of it, I'm guessing
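One way to test the guess about error-message size would be to cap the message length before the exception leaves the task, then see whether the map_task still hangs. The following is only a sketch under that assumption: `MAX_ERROR_CHARS`, `truncate_message`, and `cap_error_size` are hypothetical names, not part of flytekit, and the 2000-character limit is arbitrary rather than a documented Flyte threshold.

```python
import functools

MAX_ERROR_CHARS = 2000  # hypothetical cap, not a Flyte setting


def truncate_message(message: str, limit: int = MAX_ERROR_CHARS) -> str:
    """Keep the head and tail of a long message and drop the middle.

    Assumes limit is larger than the truncation marker itself.
    """
    if len(message) <= limit:
        return message
    marker = "\n... [truncated] ...\n"
    keep = max(limit - len(marker), 0)
    head = keep // 2 + keep % 2
    tail = keep // 2
    return message[:head] + marker + (message[-tail:] if tail else "")


def cap_error_size(fn):
    """Re-raise any exception from fn as a RuntimeError with a bounded message."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except Exception as exc:
            short = truncate_message(f"{type(exc).__name__}: {exc}")
            # "from None" drops the chained exception so the full-length
            # message is not carried along in the traceback.
            raise RuntimeError(short) from None

    return wrapper
```

In a Flyte task the decorator would presumably sit directly under `@task`, wrapping the function that does the SQL insert. If the workflow then fails cleanly instead of sticking in Running, that would support the message-size theory; if it still hangs, the cause is likely elsewhere.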