proud-answer-87162
10/12/2023, 9:39 PM
pyflyte
the task pod completes successfully and i see metadata written to my blob store; the n0 dir has both input and output and a dir is created for end-node which contains inputs.pb. however, the execution never finishes. it remains in the RUNNING state and the end-node task/event lists as QUEUED when i look at execution details:
└── start-node - SUCCEEDED - 2023-10-12 21:27:35.186157884 +0000 UTC - 2023-10-12 21:27:35.186197485 +0000 UTC
└── n0 - SUCCEEDED - 2023-10-12 21:27:35.199313006 +0000 UTC - 2023-10-12 21:27:40.65606235 +0000 UTC
│ ├── Attempt :0
│ │ ├── Task - SUCCEEDED - 2023-10-12 21:27:35.262872565 +0000 UTC - 2023-10-12 21:27:40.644027555 +0000 UTC
│ │ ├── Task Type - python-task
│ │ ├── Reason - [ContainersNotReady|ContainerCreating]: containers with unready status: [f81959f07bf634b93b9c-n0-0]|
│ │ ├── Metadata
│ │ │ ├── Generated Name : f81959f07bf634b93b9c-n0-0
│ │ │ ├── Plugin Identifier : container
│ │ │ ├── External Resources
│ │ │ ├── Resource Pool Info
│ │ ├── Logs :
│ ├── Outputs :
│ └── o0: test-response
└── end-node - QUEUED - 2023-10-12 21:27:40.69112181 +0000 UTC - 2023-10-12 21:27:40.712353931 +0000 UTC
i don't see any error in the task pod but i do see this error in the flyte pod:
E1012 21:27:40.723260 7 workers.go:102] error syncing 'flyte-az-development/f81959f07bf634b93b9c': Operation cannot be fulfilled on flyteworkflows.flyte.lyft.com "f81959f07bf634b93b9c": the object has been modified; please apply your changes to the latest version and try again
i suspect i have something misconfigured but can't figure out what that is. i have tried both with an azure workload identity and raw creds and see the same behavior.
any ideas?
proud-answer-87162
10/12/2023, 9:41 PM
QUEUED state and never runs
proud-answer-87162
10/12/2023, 9:42 PM
Workflow execution not found in flyteadmin. as an error in the task pod when i try to re-run the workflow. I don't know if that is related or just a transient error
proud-answer-87162
10/13/2023, 2:35 PM
flyte-binary and an image i built using the Makefile build_native_flyte. the custom artifact doesn't have a code change but references a custom stow build
tall-lock-23197
thankful-minister-83577
thankful-minister-83577
thankful-minister-83577
proud-answer-87162
10/17/2023, 11:41 AM
E1012 above and stuff like:
2023/10/13 16:00:21 /go/pkg/mod/gorm.io/gorm@v1.24.1-0.20221019064659-5dd2bb482755/finisher_api.go:509
[0.722ms] [rows:1] SELECT count(*) FROM pg_indexes WHERE tablename = 'artifacts' AND indexname = 'artifacts_dataset_uuid_idx' AND schemaname = CURRENT_SCHEMA()
...
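When the pod log mixes gorm SQL traces with propeller output like this, it can help to filter before reading. A minimal sketch in Python, assuming only the three line shapes seen in this thread (klog-style `E1012 ...` lines, gorm trace lines, and JSON lines with a `level` field). `interesting_lines` is a hypothetical helper, not part of Flyte, and the sample lines are shortened versions of ones posted here:

```python
import json
import re

# klog-style prefix: severity letter (I/W/E/F), MMDD, then a timestamp.
KLOG_RE = re.compile(r"^([IWEF])\d{4} \d{2}:\d{2}:\d{2}\.\d+ ")


def interesting_lines(lines, levels=("warning", "error")):
    """Keep klog W/E/F lines and JSON lines at the given levels.

    Everything else (gorm SQL traces, debug/info output) is dropped.
    """
    kept = []
    for line in lines:
        line = line.strip()
        match = KLOG_RE.match(line)
        if match:
            if match.group(1) in ("W", "E", "F"):
                kept.append(line)
            continue
        if line.startswith("{"):
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # truncated or non-JSON line
            if record.get("level") in levels:
                kept.append(line)
    return kept


# Shortened sample lines (assumption: real lines carry more fields).
sample = [
    "2023/10/13 16:00:21 /go/pkg/mod/gorm.io/gorm@v1.24.1-0.20221019064659-5dd2bb482755/finisher_api.go:509",
    "[0.722ms] [rows:1] SELECT count(*) FROM pg_indexes WHERE tablename = 'artifacts'",
    "E1012 21:27:40.723260 7 workers.go:102] error syncing 'flyte-az-development/f81959f07bf634b93b9c'",
    '{"json":{"src":"controller.go:206"},"level":"info","msg":"Deletion triggered for f42f996b920d8459b9fe","ts":"2023-10-18T20:22:10Z"}',
    '{"json":{"src":"passthrough.go:40"},"level":"warning","msg":"Workflow not found in cache.","ts":"2023-10-18T20:22:10Z"}',
]
for line in interesting_lines(sample):
    print(line)
```

Piping the real pod log through something like this leaves only the klog error lines and the JSON warnings and errors, which is usually a shorter read than the full debug output.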
proud-answer-87162
10/17/2023, 11:41 AM
proud-answer-87162
10/17/2023, 5:21 PM
# logging Specify configuration for logs emitted by Flyte
logging:
  # level Set the log level
  level: 1
and this in my .flyte/config.yaml file:
logger:
  show-source: true
  ### I have tried a variety of different levels
  level: 20
proud-answer-87162
10/17/2023, 6:34 PM
values.yaml)
propeller:
# disabled Disables flytepropeller
disabled: false
# disabledWebhook Disables webhook only
disableWebhook: false
publish-k8s-events: true
create-flyteworkflow-crd: true
thankful-minister-83577
thankful-minister-83577
proud-answer-87162
10/18/2023, 9:33 PM
node_execution_events show a normal flow but end_node remains QUEUED. (i think) this is the log leading up to propeller trying to pick up that event:
likely pertinent block: [Operation cannot be fulfilled on flyteworkflows.flyte.lyft.com \"f42f996b920d8459b9fe\": the object has been modified; please apply your changes to the latest version and try again]","ts":"2023-10-18T20:22:10Z"}
2023/10/18 20:22:10 /go/pkg/mod/gorm.io/gorm@v1.24.1-0.20221019064659-5dd2bb482755/callbacks.go:134
[3.650ms] [rows:1] INSERT INTO "node_executions" ("created_at","updated_at","deleted_at","execution_project","execution_domain","execution_name","node_id","phase","input_uri","closure","started_at","node_execution_created_at","node_execution_updated_at","duration","node_execution_metadata","parent_id","parent_task_execution_id","error_kind","error_code","cache_status","dynamic_workflow_remote_closure_reference","internal_data") VALUES ('2023-10-18 20:22:10.162','2023-10-18 20:22:10.162',NULL,'flyte-az','development','f42f996b920d8459b9fe','end-node','QUEUED','abfs://myflytetest/metadata/propeller/flyte-az-development-f42f996b920d8459b9fe/end-node/data/inputs.pb','<binary>',NULL,'2023-10-18 20:22:10.135','2023-10-18 20:22:10.16',0,'<binary>',NULL,NULL,NULL,NULL,NULL,'','<binary>') RETURNING "id"
{"json":{"exec_id":"f42f996b920d8459b9fe","node":"end-node","src":"noop_notifications.go:32"},"level":"debug","msg":"call to noop publish with notification type [flyteidl.admin.NodeExecutionEventRequest] and proto message [event:\u003cid:\u003cnode_id:\"end-node\" execution_id:\u003cproject:\"flyte-az\" domain:\"development\" name:\"f42f996b920d8459b9fe\" \u003e \u003e producer_id:\"propeller\" phase:QUEUED occurred_at:\u003cseconds:1697660530 nanos:135833222 \u003e input_uri:\"abfs://myflytetest/metadata/propeller/flyte-az-development-f42f996b920d8459b9fe/end-node/data/inputs.pb\" spec_node_id:\"end-node\" event_version:1 reported_at:\u003cseconds:1697660530 nanos:160164218 \u003e \u003e ]","ts":"2023-10-18T20:22:10Z"}
{"json":{"exec_id":"f42f996b920d8459b9fe","node":"end-node","src":"noop_notifications.go:32"},"level":"debug","msg":"call to noop publish with notification type [flyteidl.admin.NodeExecutionEventRequest] and proto message [event:\u003cid:\u003cnode_id:\"end-node\" execution_id:\u003cproject:\"flyte-az\" domain:\"development\" name:\"f42f996b920d8459b9fe\" \u003e \u003e producer_id:\"propeller\" phase:QUEUED occurred_at:\u003cseconds:1697660530 nanos:135833222 \u003e input_uri:\"abfs://myflytetest/metadata/propeller/flyte-az-development-f42f996b920d8459b9fe/end-node/data/inputs.pb\" spec_node_id:\"end-node\" event_version:1 reported_at:\u003cseconds:1697660530 nanos:160164218 \u003e \u003e ]","ts":"2023-10-18T20:22:10Z"}
{"json":{"exec_id":"f42f996b920d8459b9fe","node":"end-node","ns":"flyte-az-development","res_ver":"342094168","routine":"worker-11","src":"executor.go:991","wf":"flyte-az:development:workflows.simple-workflow.simple_workflow"},"level":"debug","msg":"Node pre-execute completed","ts":"2023-10-18T20:22:10Z"}
{"json":{"exec_id":"f42f996b920d8459b9fe","node":"end-node","ns":"flyte-az-development","res_ver":"342094168","routine":"worker-11","src":"executor.go:1174","wf":"flyte-az:development:workflows.simple-workflow.simple_workflow"},"level":"info","msg":"Node was queued, parallelism is now [1]","ts":"2023-10-18T20:22:10Z"}
{"json":{"exec_id":"f42f996b920d8459b9fe","node":"end-node","ns":"flyte-az-development","res_ver":"342094168","routine":"worker-11","src":"executor.go:1176","wf":"flyte-az:development:workflows.simple-workflow.simple_workflow"},"level":"debug","msg":"Completed node [end-node]","ts":"2023-10-18T20:22:10Z"}
{"json":{"exec_id":"f42f996b920d8459b9fe","ns":"flyte-az-development","res_ver":"342094168","routine":"worker-11","src":"executor.go:403","wf":"flyte-az:development:workflows.simple-workflow.simple_workflow"},"level":"info","msg":"Handling Workflow [f42f996b920d8459b9fe] Done","ts":"2023-10-18T20:22:10Z"}
{"json":{"exec_id":"f42f996b920d8459b9fe","ns":"flyte-az-development","routine":"worker-11","src":"passthrough.go:89"},"level":"debug","msg":"Observed FlyteWorkflow Update (maybe finalizer)","ts":"2023-10-18T20:22:10Z"}
{"json":{"src":"controller.go:206"},"level":"info","msg":"Deletion triggered for f42f996b920d8459b9fe","ts":"2023-10-18T20:22:10Z"}
2023/10/18 20:22:10 /go/pkg/mod/gorm.io/gorm@v1.24.1-0.20221019064659-5dd2bb482755/callbacks.go:134
[4.106ms] [rows:1] INSERT INTO "node_execution_events" ("created_at","updated_at","deleted_at","execution_project","execution_domain","execution_name","node_id","request_id","occurred_at","phase") VALUES ('2023-10-18 20:22:10.166','2023-10-18 20:22:10.166',NULL,'flyte-az','development','f42f996b920d8459b9fe','end-node','','2023-10-18 20:22:10.135','QUEUED') RETURNING "id"
{"json":{"exec_id":"f42f996b920d8459b9fe","ns":"flyte-az-development","routine":"worker-11","src":"passthrough.go:104"},"level":"error","msg":"Failed to update workflow. Error [Operation cannot be fulfilled on flyteworkflows.flyte.lyft.com \"f42f996b920d8459b9fe\": the object has been modified; please apply your changes to the latest version and try again]","ts":"2023-10-18T20:22:10Z"}
{"json":{"exec_id":"f42f996b920d8459b9fe","ns":"flyte-az-development","routine":"worker-11","src":"handler.go:362"},"level":"info","msg":"Completed processing workflow.","ts":"2023-10-18T20:22:10Z"}
E1018 20:22:10.173073 7 workers.go:102] error syncing 'flyte-az-development/f42f996b920d8459b9fe': Operation cannot be fulfilled on flyteworkflows.flyte.lyft.com "f42f996b920d8459b9fe": the object has been modified; please apply your changes to the latest version and try again
{"json":{"exec_id":"f42f996b920d8459b9fe","ns":"flyte-az-development","routine":"worker-11","src":"handler.go:181"},"level":"info","msg":"Processing Workflow.","ts":"2023-10-18T20:22:10Z"}
{"json":{"exec_id":"f42f996b920d8459b9fe","ns":"flyte-az-development","routine":"worker-11","src":"passthrough.go:40"},"level":"warning","msg":"Workflow not found in cache.","ts":"2023-10-18T20:22:10Z"}
{"json":{"exec_id":"f42f996b920d8459b9fe","ns":"flyte-az-development","routine":"worker-11","src":"handler.go:189"},"level":"warning","msg":"Workflow namespace[flyte-az-development]/name[f42f996b920d8459b9fe] not found, may be deleted.","ts":"2023-10-18T20:22:10Z"}
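For context on the repeated "the object has been modified" message: that wording is what the Kubernetes API server returns when an update carries a stale resourceVersion (optimistic concurrency), and the usual client response is to re-read the object and retry. The sketch below is a toy in-memory model of that check, not propeller or client-go code; `Store`, `Conflict`, `update_with_retry` and the `note` field are hypothetical. It illustrates the error only and does not by itself explain why the execution here stays RUNNING.

```python
class Conflict(Exception):
    """Raised when an update carries a stale resource version."""


class Store:
    """Toy object store that versions every write, like an API server."""

    def __init__(self):
        self._objects = {}  # name -> (resource_version, data)

    def create(self, name, data):
        self._objects[name] = (1, dict(data))

    def get(self, name):
        version, data = self._objects[name]
        return version, dict(data)  # copy so callers cannot mutate the store

    def update(self, name, seen_version, data):
        current_version, _ = self._objects[name]
        if seen_version != current_version:
            raise Conflict(f"{name}: the object has been modified")
        self._objects[name] = (current_version + 1, dict(data))


def update_with_retry(store, name, mutate, attempts=3):
    """Read, mutate, write; on conflict re-read the latest version and retry."""
    for _ in range(attempts):
        version, data = store.get(name)
        mutate(data)
        try:
            store.update(name, version, data)
            return True
        except Conflict:
            continue
    return False


name = "f42f996b920d8459b9fe"
store = Store()
store.create(name, {"phase": "RUNNING"})

# Two writers read the same version; the second write is rejected.
v_a, data_a = store.get(name)
v_b, data_b = store.get(name)
data_a["note"] = "writer A"
store.update(name, v_a, data_a)
try:
    data_b["phase"] = "SUCCEEDED"
    store.update(name, v_b, data_b)
except Conflict as err:
    print("conflict:", err)

# Retrying against the latest version succeeds.
ok = update_with_retry(store, name, lambda d: d.update(phase="SUCCEEDED"))
print(ok, store.get(name))
```

A single conflict followed by a successful retry is normal in this model; the thing to look for in the real log is whether a later sync of the same workflow ever gets its update through.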
proud-answer-87162
10/18/2023, 9:33 PM
select * from node_execution_events where execution_name = 'f42f996b920d8459b9fe';
 id |          created_at           |          updated_at           | deleted_at | execution_project | execution_domain |    execution_name    |  node_id   | request_id |          occurred_at          |   phase
----+-------------------------------+-------------------------------+------------+-------------------+------------------+----------------------+------------+------------+-------------------------------+-----------
 28 | 2023-10-18 20:22:04.307004+00 | 2023-10-18 20:22:04.307004+00 |            | flyte-az          | development      | f42f996b920d8459b9fe | start-node |            | 2023-10-18 20:22:04.299654+00 | SUCCEEDED
 29 | 2023-10-18 20:22:04.344218+00 | 2023-10-18 20:22:04.344218+00 |            | flyte-az          | development      | f42f996b920d8459b9fe | n0         |            | 2023-10-18 20:22:04.328458+00 | QUEUED
 30 | 2023-10-18 20:22:04.480383+00 | 2023-10-18 20:22:04.480383+00 |            | flyte-az          | development      | f42f996b920d8459b9fe | n0         |            | 2023-10-18 20:22:04.452153+00 | RUNNING
 31 | 2023-10-18 20:22:10.108206+00 | 2023-10-18 20:22:10.108206+00 |            | flyte-az          | development      | f42f996b920d8459b9fe | n0         |            | 2023-10-18 20:22:10.089884+00 | SUCCEEDED
 32 | 2023-10-18 20:22:10.166419+00 | 2023-10-18 20:22:10.166419+00 |            | flyte-az          | development      | f42f996b920d8459b9fe | end-node   |            | 2023-10-18 20:22:10.135833+00 | QUEUED
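The same check can be scripted against rows exported from that query. A minimal sketch, assuming the rows are available as (node_id, occurred_at, phase) tuples with the `+00` offset dropped, and assuming the set of terminal node phases listed in `TERMINAL` (that set is an assumption, not taken from this thread):

```python
from datetime import datetime

# Assumed set of terminal node phases; adjust to match your Flyte version.
TERMINAL = {"SUCCEEDED", "FAILED", "ABORTED", "SKIPPED", "TIMED_OUT", "RECOVERED"}

# (node_id, occurred_at, phase) copied from the query result above.
rows = [
    ("start-node", "2023-10-18 20:22:04.299654", "SUCCEEDED"),
    ("n0", "2023-10-18 20:22:04.328458", "QUEUED"),
    ("n0", "2023-10-18 20:22:04.452153", "RUNNING"),
    ("n0", "2023-10-18 20:22:10.089884", "SUCCEEDED"),
    ("end-node", "2023-10-18 20:22:10.135833", "QUEUED"),
]


def latest_phase(rows):
    """Return {node_id: phase} using each node's most recent event."""
    latest = {}
    for node_id, occurred_at, phase in rows:
        ts = datetime.strptime(occurred_at, "%Y-%m-%d %H:%M:%S.%f")
        if node_id not in latest or ts >= latest[node_id][0]:
            latest[node_id] = (ts, phase)
    return {node: phase for node, (_, phase) in latest.items()}


phases = latest_phase(rows)
stuck = [node for node, phase in phases.items() if phase not in TERMINAL]
print(phases)
print("non-terminal:", stuck)
```

For this execution the only node left in a non-terminal phase is end-node, which matches what the execution details show.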
proud-answer-87162
10/18/2023, 9:34 PM
proud-answer-87162
10/18/2023, 9:34 PM
wf in the tutorial
proud-answer-87162
10/18/2023, 9:39 PM
proud-answer-87162
10/20/2023, 12:47 PM
thankful-minister-83577
thankful-minister-83577
proud-answer-87162
10/20/2023, 12:49 PM
thankful-minister-83577
proud-answer-87162
10/20/2023, 12:51 PM