ambitious-australia-27749
01/18/2023, 11:40 PM
ambitious-australia-27749
01/19/2023, 1:23 AM
thankful-minister-83577
thankful-minister-83577
thankful-minister-83577
thankful-minister-83577
DynamicJobSpec object but basically it looks just like a workflow
thankful-minister-83577
thankful-minister-83577
thankful-minister-83577
s3 ls that folder where the futures file is supposed to be?
freezing-airport-6809
ambitious-australia-27749
01/19/2023, 4:23 AM
ambitious-australia-27749
01/19/2023, 4:27 AM
freezing-airport-6809
freezing-airport-6809
ambitious-australia-27749
01/19/2023, 4:45 AM
freezing-airport-6809
ambitious-australia-27749
01/19/2023, 1:32 PM
glamorous-carpet-83516
01/19/2023, 6:59 PM
thankful-minister-83577
logger:
  show-source: true
  level: 6
thankful-minister-83577
thankful-minister-83577
thankful-minister-83577
ambitious-australia-27749
01/20/2023, 10:45 PM
ambitious-australia-27749
01/20/2023, 10:45 PM
thankful-minister-83577
hallowed-mouse-14616
01/23/2023, 5:12 PM
(1) For the depicted workflow (aq4dj4xctvd84df9cvqm) the error message in the logs is:
{
  "json": {
    "exec_id": "aq4dj4xctvd84df9cvqm",
    "node": "n5/dn0",
    "ns": "delaieine-development",
    "res_ver": "266883",
    "routine": "worker-3",
    "wf": "delaieine:development:flyte.workflows.auto_train.pipeline"
  },
  "level": "error",
  "msg": "handling parent node failed with error: InvalidArgument: Invalid fields for event message, caused by [rpc error: code = InvalidArgument desc = missing project]",
  "ts": "2023-01-19T01:26:46Z"
}
This shows that propeller is failing to send a message to admin because of a 'missing project'. There may be some kind of version mismatch between propeller and admin - do you know what versions you're running?
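To see which executions hit this event error, the structured log lines can be filtered on their msg field. This is only a minimal sketch: it assumes the propeller logs are available as one JSON object per line with the fields shown above, and the helper name executions_with_error is hypothetical, not part of Flyte.

```python
import json
from collections import defaultdict


def executions_with_error(lines, needle="missing project"):
    """Map exec_id -> set of nodes whose error-level msg contains `needle`."""
    hits = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log entry
        if not isinstance(entry, dict):
            continue
        if entry.get("level") != "error" or needle not in entry.get("msg", ""):
            continue
        ctx = entry.get("json", {})
        hits[ctx.get("exec_id", "<unknown>")].add(ctx.get("node", "<none>"))
    return dict(hits)


# Two entries in the shape of the log lines quoted in this thread.
sample = [
    json.dumps({
        "json": {"exec_id": "aq4dj4xctvd84df9cvqm", "node": "n5/dn0"},
        "level": "error",
        "msg": "handling parent node failed with error: InvalidArgument: "
               "Invalid fields for event message, caused by [rpc error: "
               "code = InvalidArgument desc = missing project]",
    }),
    json.dumps({
        "json": {"exec_id": "a8h2qqfdkxtqzhg49g22", "node": "n5"},
        "level": "warning",
        "msg": "Failed to read futures file.",
    }),
]

matches = executions_with_error(sample)
print(matches)
```

If only some executions show up here, that would point at something specific to those workflows rather than a cluster-wide version mismatch.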
(2) The "Failed to read futures file" errors are printed for other workflows (i.e. not the one depicted). It looks like Flyte is trying to abort the workflow but the abort itself is failing. For example:
{
  "json": {
    "exec_id": "a8h2qqfdkxtqzhg49g22",
    "node": "n5",
    "ns": "delaieine-development",
    "res_ver": "271443",
    "routine": "worker-1",
    "wf": "delaieine:development:flyte.workflows.auto_train.pipeline"
  },
  "level": "warning",
  "msg": "Failed to read futures file. Error: path:<s3://my-s3-bucket/metadata/propeller/delaieine-development-a8h2qqfdkxtqzhg49g22/n5/data/0/futures.pb>: not found",
  "ts": "2023-01-19T01:50:13Z"
}
followed by:
{
  "json": {
    "exec_id": "a8h2qqfdkxtqzhg49g22",
    "ns": "delaieine-development",
    "res_ver": "271443",
    "routine": "worker-1",
    "wf": "delaieine:development:flyte.workflows.auto_train.pipeline"
  },
  "level": "error",
  "msg": "Failed to propagate Abort for workflow:project:\"delaieine\" domain:\"development\" name:\"a8h2qqfdkxtqzhg49g22\" . Error: []",
  "ts": "2023-01-19T01:50:13Z"
}
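To check the bucket for the file named in the warning, the S3 location can be pulled out of the log message and split into bucket and key. This is a rough sketch: the regular expression assumes the `path:<...>` shape seen in the warning above, and futures_location is a hypothetical helper, not a Flyte API.

```python
import re
from urllib.parse import urlparse

# Assumption: the warning embeds the location as path:<s3://bucket/key>.
PATH_RE = re.compile(r"path:<(s3://[^>]+)>")


def futures_location(msg):
    """Return (bucket, key, parent_prefix) from a warning msg, or None."""
    match = PATH_RE.search(msg)
    if match is None:
        return None
    url = urlparse(match.group(1))
    key = url.path.lstrip("/")
    parent = key.rpartition("/")[0]
    prefix = parent + "/" if parent else ""
    return url.netloc, key, prefix


msg = (
    "Failed to read futures file. Error: path:<s3://my-s3-bucket/metadata/"
    "propeller/delaieine-development-a8h2qqfdkxtqzhg49g22/n5/data/0/"
    "futures.pb>: not found"
)
bucket, key, prefix = futures_location(msg)

# Print the listing command for the folder that should hold futures.pb.
print(f"aws s3 ls s3://{bucket}/{prefix}")
```

Listing that prefix shows whether anything was ever written for node n5, which helps tell a deleted file apart from one that was never generated.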
Somehow the futures.pb file is missing. So either (1) it was generated and then deleted, corrupted, etc., or (2) Flyte is looking for the file when it shouldn't be. The second case may be related to the event issue above.
ambitious-australia-27749
01/24/2023, 5:09 PM
ambitious-australia-27749
01/24/2023, 5:14 PM
ambitious-australia-27749
03/01/2023, 3:11 PM
hallowed-mouse-14616
03/03/2023, 9:49 AM
ambitious-australia-27749
03/06/2023, 3:18 PM
hallowed-mouse-14616
03/06/2023, 9:42 PM