Ruksana Kabealo
01/18/2023, 11:40 PM
Yee
DynamicJobSpec object but basically it looks just like a workflow
s3 ls that folder where the futures file is supposed to be?
Ketan (kumare3)
Ruksana Kabealo
01/19/2023, 4:23 AM
Ketan (kumare3)
Ruksana Kabealo
01/19/2023, 4:45 AM
Ketan (kumare3)
Ruksana Kabealo
01/19/2023, 1:32 PM
Kevin Su
01/19/2023, 6:59 PM
Yee
logger:
  show-source: true
  level: 6
Ruksana Kabealo
01/20/2023, 10:45 PM
Yee
Dan Rammer (hamersaw)
01/23/2023, 5:12 PM
(aq4dj4xctvd84df9cvqm) the error message in the logs is:
{
  "json": {
    "exec_id": "aq4dj4xctvd84df9cvqm",
    "node": "n5/dn0",
    "ns": "delaieine-development",
    "res_ver": "266883",
    "routine": "worker-3",
    "wf": "delaieine:development:flyte.workflows.auto_train.pipeline"
  },
  "level": "error",
  "msg": "handling parent node failed with error: InvalidArgument: Invalid fields for event message, caused by [rpc error: code = InvalidArgument desc = missing project]",
  "ts": "2023-01-19T01:26:46Z"
}
This shows that propeller is failing to send a message to admin because of a 'missing project'. There may be some kind of version mismatch between propeller and admin - do you know what versions you're running?
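To pull entries like the one above out of a larger propeller log, something along these lines may help. This is only a minimal sketch: it assumes the logs are emitted as one JSON object per line with the same fields as the entries quoted in this thread, and the names `SAMPLE_LOG` and `errors_for_execution` are made up for illustration (they are not part of Flyte).

```python
import json

# Two entries copied from this thread. The one-JSON-object-per-line layout
# is an assumption about how the log file is written.
SAMPLE_LOG = "\n".join([
    json.dumps({
        "json": {
            "exec_id": "aq4dj4xctvd84df9cvqm",
            "node": "n5/dn0",
            "ns": "delaieine-development",
            "res_ver": "266883",
            "routine": "worker-3",
            "wf": "delaieine:development:flyte.workflows.auto_train.pipeline",
        },
        "level": "error",
        "msg": "handling parent node failed with error: InvalidArgument: "
               "Invalid fields for event message, caused by [rpc error: "
               "code = InvalidArgument desc = missing project]",
        "ts": "2023-01-19T01:26:46Z",
    }),
    json.dumps({
        "json": {
            "exec_id": "a8h2qqfdkxtqzhg49g22",
            "node": "n5",
            "ns": "delaieine-development",
            "res_ver": "271443",
            "routine": "worker-1",
            "wf": "delaieine:development:flyte.workflows.auto_train.pipeline",
        },
        "level": "warning",
        "msg": "Failed to read futures file. Error: path:<s3://my-s3-bucket/"
               "metadata/propeller/delaieine-development-a8h2qqfdkxtqzhg49g22"
               "/n5/data/0/futures.pb>: not found",
        "ts": "2023-01-19T01:50:13Z",
    }),
])


def errors_for_execution(lines, exec_id):
    """Yield (timestamp, node, message) for error-level entries of one execution."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log entry
        if not isinstance(entry, dict):
            continue
        meta = entry.get("json", {})
        if entry.get("level") == "error" and meta.get("exec_id") == exec_id:
            yield entry.get("ts"), meta.get("node"), entry.get("msg")


for ts, node, msg in errors_for_execution(SAMPLE_LOG.splitlines(), "aq4dj4xctvd84df9cvqm"):
    print(ts, node, msg)
```

Filtering on both `level` and `exec_id` keeps the "missing project" error for the depicted execution separate from the warnings that belong to other executions.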
(2) The "Failed to read futures file" errors are printed out for other workflows (i.e. not the one depicted). It looks like Flyte is trying to abort the workflow but is failing to do so, e.g.:
{
  "json": {
    "exec_id": "a8h2qqfdkxtqzhg49g22",
    "node": "n5",
    "ns": "delaieine-development",
    "res_ver": "271443",
    "routine": "worker-1",
    "wf": "delaieine:development:flyte.workflows.auto_train.pipeline"
  },
  "level": "warning",
  "msg": "Failed to read futures file. Error: path:<s3://my-s3-bucket/metadata/propeller/delaieine-development-a8h2qqfdkxtqzhg49g22/n5/data/0/futures.pb>: not found",
  "ts": "2023-01-19T01:50:13Z"
}
followed by:
{
  "json": {
    "exec_id": "a8h2qqfdkxtqzhg49g22",
    "ns": "delaieine-development",
    "res_ver": "271443",
    "routine": "worker-1",
    "wf": "delaieine:development:flyte.workflows.auto_train.pipeline"
  },
  "level": "error",
  "msg": "Failed to propagate Abort for workflow:project:\"delaieine\" domain:\"development\" name:\"a8h2qqfdkxtqzhg49g22\" . Error: []",
  "ts": "2023-01-19T01:50:13Z"
}
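To check by hand whether the futures file from the warning above exists, the S3 path can be pulled out of the message and turned into a folder listing. The sketch below is only illustrative: it assumes the `path:<s3://...>` layout seen in the warning quoted in this thread, and `futures_location` is a made-up helper name, not a Flyte function. It prints an `aws s3 ls` command rather than calling S3 itself.

```python
import re
from urllib.parse import urlparse

# Warning message copied from the log entry quoted in this thread.
WARNING_MSG = (
    "Failed to read futures file. Error: path:<s3://my-s3-bucket/metadata/"
    "propeller/delaieine-development-a8h2qqfdkxtqzhg49g22/n5/data/0/"
    "futures.pb>: not found"
)


def futures_location(msg):
    """Return (bucket, key, folder_uri) for the S3 path in msg, or None if absent."""
    match = re.search(r"path:<(s3://[^>]+)>", msg)
    if match is None:
        return None
    uri = match.group(1)
    parsed = urlparse(uri)
    bucket = parsed.netloc              # e.g. "my-s3-bucket"
    key = parsed.path.lstrip("/")       # object key inside the bucket
    folder = uri.rsplit("/", 1)[0] + "/"  # parent "folder" of the object
    return bucket, key, folder


location = futures_location(WARNING_MSG)
if location is not None:
    bucket, key, folder = location
    # Listing the parent folder shows whether futures.pb was ever written there.
    print(f"aws s3 ls {folder}")
```

If the listing shows other files for the node but no futures.pb, that would point toward Flyte looking for a file that was never written rather than one that was deleted.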
Somehow the futures.pb file is missing. So either (1) it was generated and then deleted, corrupted, etc., or (2) Flyte is looking for the file when it shouldn't be - this may be related to the event issue above.
Ruksana Kabealo
01/24/2023, 5:09 PM
Dan Rammer (hamersaw)
03/03/2023, 9:49 AM
Ruksana Kabealo
03/06/2023, 3:18 PM
Dan Rammer (hamersaw)
03/06/2023, 9:42 PM