Gerry Meixiong
03/10/2023, 10:57 PM
Workflow execution not found in flyteadmin.
which I don’t fully understand.
Ketan (kumare3)
03/10/2023, 11:07 PM
Viljem Skornik
03/10/2023, 11:08 PM
[0.752ms] [rows:1] SELECT * FROM "executions" WHERE "executions"."execution_project" = 'flyte-dev' AND "executions"."execution_domain" = 'production' AND "executions"."execution_name" = 're123' LIMIT 1
[7.721ms] [rows:0] SELECT * FROM "executions" WHERE "executions"."execution_project" = 'flyte-dev' AND "executions"."execution_domain" = 'production' AND "executions"."execution_name" = 're123' LIMIT 1
Failed to find existing execution with id [project:"flyte-dev" domain:"production" name:"re123" ] with err: missing entity of type execution with identifier project:"flyte-dev" domain:"production" name:"re123"
Failed to record task event [task_id:<resource_type:TASK project:"flyte-dev" domain:"production"....
Then we end up with KillTask invoked and Deletion Triggered for re123.
But then, 4 minutes later, it’s back:
[0.752ms] [rows:1] SELECT * FROM "executions" WHERE "executions"."execution_project" = 'flyte-dev' AND "executions"."execution_domain" = 'production' AND "executions"."execution_name" = 're123' LIMIT
Ketan (kumare3)
03/10/2023, 11:24 PM
Kevin Su
03/10/2023, 11:30 PM
Gerry Meixiong
03/10/2023, 11:32 PM
Some node execution failed, auto-abort.
and the workflow also states Workflow execution not found in flyteadmin.
Viljem Skornik
03/10/2023, 11:39 PM
Ketan (kumare3)
03/10/2023, 11:40 PM
Viljem Skornik
03/10/2023, 11:42 PM
Ketan (kumare3)
03/11/2023, 12:20 AM
Yee
03/11/2023, 12:23 AM
The node execution launched an execution but it does not exist
Ketan (kumare3)
03/11/2023, 12:23 AM
Yee
03/11/2023, 12:23 AM
Gerry Meixiong
03/11/2023, 12:28 AM
/go/src/github.com/flyteorg/flyteadmin/pkg/repositories/gormimpl/task_execution_repo.go:55 record not found
in the admin logs
Viljem Skornik
03/11/2023, 1:02 AM
David Espejo (he/him)
03/13/2023, 4:05 PM
Viljem Skornik
03/13/2023, 7:10 PM
Attempt 01
aborted
Some node execution failed, auto-abort.
Can’t find anything useful. Kube logs show the pod getting terminated, nothing out of the ordinary, except that the pod is then gone rather than left in a Successful state.
propeller logs show nothing other than handling Abort event…