Hello everyone, Recently I installed Flyte 1.16.1 ...
# flyte-support
g
Hello everyone. I recently installed Flyte 1.16.1 with flyte-connector. I created an image for the flyte-connector and tried to deploy a workflow. Unfortunately the deployment is not running, probably because of the connector. The flyte-connector pod is up and running. The only errors I found are the following ones in the Propeller service. It used to work on Flyte 1.15.1. Any idea what the issue is? I have no idea how to overcome it.
{"json":{"exec_id":"axtjrfrj8xck879d2xn8","ns":"flyte-staging","res_ver":"5768777206","routine":"worker-2","wf":"flyte:staging:car_early_claim_flyte.workflows.train.train_workflow"},"level":"error","msg":"panic when reconciling workflow, Stack: [goroutine 622 [running]:\nruntime/debug.Stack()\n\t/usr/local/go/src/runtime/debug/stack.go:26 +0x5e\ngithub.com/flyteorg/flyte/flytepropeller/pkg/controller.(*Propeller).TryMutateWorkflow.func2.1()\n\t/go/src/github.com/flyteorg/flytepropeller/pkg/controller/handler.go:137 +0x6e\npanic({0x26a9c40?, 0x47d2a90?})\n\t/usr/local/go/src/runtime/panic.go:791 +0x132\ngithub.com/flyteorg/flyte/flytepropeller/pkg/controller/nodes/task.(*Handler).IsCacheable(0xc00070c000, {0x30be230, 0xc001fa8780}, {0x30d5f70, 0xc001785980})\n\t/go/src/github.com/flyteorg/flytepropeller/pkg/controller/nodes/task/cache.go:46 +0x1c9\ngithub.com/flyteorg/flyte/flytepropeller/pkg/controller/nodes.(*nodeExecutor).handleNotYetStartedNode(0xc001386000, {0x30be230, 0xc001fa8780}, {0x30a1990, 0xc0007d6f08}, {0x30d5f70, 0xc001785980}, {0x30c3280, 0xc000594280})\n\t/go/src/github.com/flyteorg/flytepropeller/pkg/controller/nodes/executor.go:981 +0x674\ngithub.com/flyteorg/flyte/flytepropeller/pkg/controller/nodes.(*nodeExecutor).HandleNode(0xc001386000, {0x30be230?, 0xc001fa8390?}, {0x30a1990, 0xc0007d6f08}, {0x30d5f70, 0xc001785980}, {0x30c3280, 0xc000594280})\n\t/go/src/github.com/flyteorg/flytepropeller/pkg/controller/nodes/executor.go:1365 +0xefb\ngithub.com/flyteorg/flyte/flytepropeller/pkg/controller/nodes.(*recursiveNodeExecutor).RecursiveNodeHandler(0xc001388000, {0x30be230, 0xc001f50cc0}, {0x30e0010, 0xc0031b11d0}, {0x30a1990, 0xc0007d6f08}, {0x30bef98, 0xc0007d6f08}, {0x30da360, ...})\n\t/go/src/github.com/flyteorg/flytepropeller/pkg/controller/nodes/executor.go:234 +0x8fc\ngithub.com/flyteorg/flyte/flytepropeller/pkg/controller/nodes.(*recursiveNodeExecutor).handleDownstream(0xc001388000, {0x30be230, 0xc001f50cc0}, {0x30e0010, 0xc0031b11d0}, {0x30a1990, 0xc0007d6f08}, {0x30bef98, 0xc0007d6f08}, {0x30da360, ...})\n\t/go/src/github.com/flyteorg/flytepropeller/pkg/controller/nodes/executor.go:300 +0x47a\ngithub.com/flyteorg/flyte/flytepropeller/pkg/controller/nodes.(*recursiveNodeExecutor).RecursiveNodeHandler(0xc001388000, {0x30be230, 0xc001f50cc0}, {0x30e0010, 0xc0031b11d0}, {0x30a1990, 0xc0007d6f08}, {0x30bef98, 0xc0007d6f08}, {0x30da360, ...})\n\t/go/src/github.com/flyteorg/flytepropeller/pkg/controller/nodes/executor.go:241 +0xb19\ngithub.com/flyteorg/flyte/flytepropeller/pkg/controller/workflow.(*workflowExecutor).handleRunningWorkflow(0xc00020b080, {0x30be230, 0xc001f50cc0}, 0xc0007d6f08)\n\t/go/src/github.com/flyteorg/flytepropeller/pkg/controller/workflow/executor.go:175 +0x1bb\ngithub.com/flyteorg/flyte/flytepropeller/pkg/controller/workflow.(*workflowExecutor).HandleFlyteWorkflow(0xc00020b080, {0x30be230, 0xc001f50cc0}, 0xc0007d6f08)\n\t/go/src/github.com/flyteorg/flytepropeller/pkg/controller/workflow/executor.go:432 +0x3af\ngithub.com/flyteorg/flyte/flytepropeller/pkg/controller.(*Propeller).TryMutateWorkflow.func2(0xc000687500, {0x30be230, 0xc001f50cc0}, 0xc001083820, 0xc0007d6f08)\n\t/go/src/github.com/flyteorg/flytepropeller/pkg/controller/handler.go:143 +0x15d\ngithub.com/flyteorg/flyte/flytepropeller/pkg/controller.(*Propeller).TryMutateWorkflow(0xc000687500, {0x30be230, 0xc001f50180}, 0xc0007d7908)\n\t/go/src/github.com/flyteorg/flytepropeller/pkg/controller/handler.go:144 +0x3be\ngithub.com/flyteorg/flyte/flytepropeller/pkg/controller.(*Propeller).streak(0xc000687500, {0x30be230?, 0xc001f50030?}, 0xc0007d7908, 0x0)\n\t/go/src/github.com/flyteorg/flytepropeller/pkg/controller/handler.go:301 +0x271\ngithub.com/flyteorg/flyte/flytepropeller/pkg/controller.(*Propeller).Handle(0xc000687500, {0x30be230?, 0xc0017c3f50?}, {0xc00266d620, 0xd}, {0xc00266d62e, 0x14})\n\t/go/src/github.com/flyteorg/flytepropeller/pkg/controller/handler.go:244 +0x951\ngithub.com/flyteorg/flyte/flytepropeller/pkg/controller.(*WorkerPool).processNextWorkItem.func1(0xc0010b0240, 0xc001083f30, {0x24f49c0, 0xc00189cd40})\n\t/go/src/github.com/flyteorg/flytepropeller/pkg/controller/workers.go:89 +0x4c3\ngithub.com/flyteorg/flyte/flytepropeller/pkg/controller.(*WorkerPool).processNextWorkItem(0xc0010b0240, {0x30be230, 0xc0017c3f50})\n\t/go/src/github.com/flyteorg/flytepropeller/pkg/controller/workers.go:100 +0xce\ngithub.com/flyteorg/flyte/flytepropeller/pkg/controller.(*WorkerPool).runWorker(0xc0010b0240, {0x30be230, 0xc0008206f0})\n\t/go/src/github.com/flyteorg/flytepropeller/pkg/controller/workers.go:116 +0x9a\ngithub.com/flyteorg/flyte/flytepropeller/pkg/controller.(*WorkerPool).Run.func1()\n\t/go/src/github.com/flyteorg/flytepropeller/pkg/controller/workers.go:151 +0x4f\ncreated by github.com/flyteorg/flyte/flytepropeller/pkg/controller.(*WorkerPool).Run in goroutine 32\n\t/go/src/github.com/flyteorg/flytepropeller/pkg/controller/workers.go:148 +0x272\n]","ts":"2025-09-30T09:53:15Z"}
f
This is a panic
You are saying this was working in 1.15?
g
Correct, in 1.15 it was working, while in 1.16.1 it's not. Also, in 1.16.1 I needed to update the values in the chart, since in 1.15.1 it was:
flyteagent: {
  enabled: true,
  nameOverride: "flyteconnector",
  image: {
    repository: flyteConnectorRepo,
    tag: "latest",
    pullPolicy: "Always",
  },
  serviceAccount: {
    annotations: {
      'eks.amazonaws.com/role-arn': serviceRoleArn
    }
  },
  plugin_config: {
    plugins: {
      "agent-service": {
        defaultAgent: {
          endpoint: "k8s://flyteconnector.flyte:8000"
        }
      }
    }
  },
While now:
flyteconnector: {
  enabled: true,
  nameOverride: "flyteconnector",
  image: {
    repository: flyteConnectorRepo,
    tag: "latest",
    pullPolicy: "Always",
  },
  serviceAccount: {
    annotations: {
      'eks.amazonaws.com/role-arn': serviceRoleArn
    }
  },
  resources: {
    requests: {
      cpu: '1',
      memory: '2Gi',
    },
    limits: {
      cpu: '1',
      memory: '2Gi',
    },
  },
  replicaCount: 2,
  plugin_config: {
    plugins: {
      "connector-service": {
        defaultConnector: {
          endpoint: "k8s://flyteconnector.flyte-staging:8000"
        }
      }
    }
  }
},
a
If you are using cache=True in your task, could you try with cache=False and see if it works?
From the logs, it appears that the IsCacheable call is getting a null value.
This is the newer version, I guess 1.16, so it could be a code change issue as well.
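To check which frame actually panicked, the escaped stack in the Propeller log line can be decoded and split into frames. The following is a minimal sketch, not part of Flyte: `framesAfterPanic` and `PropellerLogLine` are hypothetical names, and `sampleLine` is a shortened copy of the log line posted earlier in this thread (most frames omitted).

```typescript
// Hypothetical helper (not a Flyte API): decode one structured Propeller log
// line and return the stack frames that follow the Go runtime's panic(...) entry.
interface PropellerLogLine {
  level: string;
  msg: string;
  ts: string;
}

function framesAfterPanic(rawLine: string): string[] {
  const entry = JSON.parse(rawLine) as PropellerLogLine;
  // JSON.parse turns the escaped "\n" and "\t" sequences into real characters.
  const lines = entry.msg
    .split("\n")
    .map((l) => l.trim())
    .filter((l) => l.length > 0);
  const panicIndex = lines.findIndex((l) => l.startsWith("panic("));
  if (panicIndex === -1) return [];
  // Go prints each frame as two lines: the call, then its file:line.
  // Skip the panic(...) call and its own file:line entry.
  const frames: string[] = [];
  for (let i = panicIndex + 2; i + 1 < lines.length; i += 2) {
    frames.push(`${lines[i]} @ ${lines[i + 1]}`);
  }
  return frames;
}

// Shortened copy of the log line from this thread (most frames omitted).
const sampleLine = JSON.stringify({
  level: "error",
  msg:
    "panic when reconciling workflow, Stack: [goroutine 622 [running]:\n" +
    "runtime/debug.Stack()\n\t/usr/local/go/src/runtime/debug/stack.go:26 +0x5e\n" +
    "panic({0x26a9c40?, 0x47d2a90?})\n\t/usr/local/go/src/runtime/panic.go:791 +0x132\n" +
    "github.com/flyteorg/flyte/flytepropeller/pkg/controller/nodes/task.(*Handler).IsCacheable(0xc00070c000, {0x30be230, 0xc001fa8780}, {0x30d5f70, 0xc001785980})\n" +
    "\t/go/src/github.com/flyteorg/flytepropeller/pkg/controller/nodes/task/cache.go:46 +0x1c9\n]",
  ts: "2025-09-30T09:53:15Z",
});

// The first returned frame is the function that was running when the panic fired.
console.log(framesAfterPanic(sampleLine)[0]);
```

For the sample above, the first frame after `panic(...)` is the `IsCacheable` call at `cache.go:46`, which is where the guess about a null value comes from.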
g
I'm pretty sure it's related to the connector itself, although I can see the new agent I created. For example:
sr/local/lib/python3.12/site-packages/airflow/metrics/base_stats_logger.py:22 RemovedInAirflow3Warning: Timer and timing metrics publish in seconds were deprecated. It is enabled by default from Airflow 3 onwards. Enable timer_unit_consistency to publish all the timer and timing metrics in milliseconds.
/usr/local/lib/python3.12/site-packages/airflow/triggers/base.py:27 RemovedInAirflow3Warning: Timer and timing metrics publish in seconds were deprecated. It is enabled by default from Airflow 3 onwards. Enable timer_unit_consistency to publish all the timer and timing metrics in milliseconds.
{"name": "asyncio", "msg": "Using selector: EpollSelector", "taskName": null, "time": "2025-09-30T123618Z", "hostname": "flyteconnector-7b5799757b-8gwtt", "level": 20, "pid": 1, "v": 0}
🚀 Starting the agent service...
Starting up the server to expose the prometheus metrics...
Agent Metadata
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ Agent Name                      ┃ Support Task Types           ┃ Is Sync ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ Sensor                          │ sensor (0)                   │ False   │
│ Airflow Connector               │ airflow (0)                  │ False   │
│ Boto Connector                  │ boto (0)                     │ True    │
│ SageMaker Endpoint Connector    │ sagemaker-endpoint (0)       │ False   │
│ Bigquery Connector              │ bigquery_query_job_task (0)  │ False   │
│ K8s DataService Async Connector │ dataservicetask (0)          │ False   │
│ OpenAI Batch Endpoint Connector │ openai-batch (0)             │ False   │
│ ChatGPT Connector               │ chatgpt (0)                  │ True    │
│ Slurm Function Connector        │ slurm_fn (0)                 │ False   │
│ Slurm Script Connector          │ slurm (0)                    │ False   │
│ Snowflake Connector             │ snowflake (0)                │ False   │
│ Snowflake Caraml Connector      │ snowflake-caraml (0)         │ False   │
│ Webhook Connector               │ webhook (0)                  │ True    │
└─────────────────────────────────┴──────────────────────────────┴─────────┘
{"name": "grpc._cython.cygrpc", "msg": "[_cygrpc] Loaded running loop: id(loop)=140033678870144", "taskName": "Task-1", "time": "2025-09-30T123618Z", "hostname": "flyteconnector-7b5799757b-8gwtt", "level": 20, "pid": 1, "v": 0}
{"name": "grpc._cython.cygrpc", "msg": "Using AsyncIOEngine.POLLER as I/O engine", "taskName": "Task-1", "time": "2025-09-30T123618Z", "hostname": "flyteconnector-7b5799757b-8gwtt", "level": 20, "pid": 1, "v": 0}
{"name": "grpc._cython.cygrpc", "msg": "[_cygrpc] Loaded running loop: id(loop)=140033678870144", "taskName": "Task-1", "time": "2025-09-30T123618Z", "hostname": "flyteconnector-7b5799757b-8gwtt", "level": 20, "pid": 1, "v": 0}
a
You cannot use flyteagent in 1.16?
flyteconnector: {
                    enabled: true,
                    nameOverride: "flyteconnector",
I am not sure whether this looks right.
g
From the values.yaml this is the correct way to define flyteconnector
a
Yeah, I see that now. It was changed from 1.16.0 onwards.
Here is the URL for the changelog, which covers the change from agent to connector: https://github.com/flyteorg/flyte/pull/6400/files#diff-4e6486e1da78f83d1007495745cc4bb5a27de1d2664d652564ab193d6634a81f See if it helps.
I will also have a look and will post if I find anything.
g
Thanks a lot. In my opinion, maybe it's something about the new agent I added. It stopped working in the new Flyte; I tried changing the Propeller image to 1.15.1, but it's still not working.
Yes!!! I finally managed to resolve the issue: I left the plugin name as agent-service instead of connector-service. Now it's working 🙂
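For anyone landing here with the same panic, the workaround described above can be sketched roughly as follows. This is only a sketch based on what is reported in this thread: the `flyteconnector` chart section is kept, but the plugin key under `plugin_config.plugins` stays `agent-service`. The thread does not say whether the inner key was `defaultAgent` or `defaultConnector`, so pairing `agent-service` with `defaultAgent` (as in the 1.15.1 values) is an assumption. `connectorPluginConfig` is a hypothetical helper, not part of Flyte or its chart.

```typescript
// Hypothetical helper; the names here are illustrative, not part of Flyte.
type PluginKey = "agent-service" | "connector-service";
type PluginConfig = {
  plugins: Record<string, Record<string, { endpoint: string }>>;
};

function connectorPluginConfig(pluginKey: PluginKey, endpoint: string): PluginConfig {
  // ASSUMPTION: the inner key follows the plugin key, as in the two value
  // snippets posted in this thread (defaultAgent vs. defaultConnector).
  const defaultKey = pluginKey === "agent-service" ? "defaultAgent" : "defaultConnector";
  return {
    plugins: {
      [pluginKey]: {
        [defaultKey]: { endpoint },
      },
    },
  };
}

// Chart values fragment: the flyteconnector section with the old plugin name.
const flyteconnector = {
  enabled: true,
  nameOverride: "flyteconnector",
  plugin_config: connectorPluginConfig(
    "agent-service",
    "k8s://flyteconnector.flyte-staging:8000",
  ),
};

console.log(JSON.stringify(flyteconnector, null, 2));
```

Keeping the plugin key behind a single parameter makes it easy to switch back to `connector-service` once a fixed Propeller release is deployed.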
c
I'll take a look and see if the code can be made more robust.
There appears to be a bug in the plugin resolution code, possibly introduced when the connector code landed
It looks like the issue can happen when the flyte connector or flyte agent plugins aren't explicitly enabled here: https://github.com/flyteorg/flyte/blob/master/charts/flyte-core/values.yaml#L1114-L1132
f
cc @glamorous-carpet-83516 and @echoing-account-76888 fyi
c
I have a fix up but don't have the cycles to write unit tests rn: https://github.com/flyteorg/flyte/pull/6644
❤️ 2
g
@clean-glass-36808 the change looks good to me, thanks!
e
LGTM too. Thanks!