boundless-pizza-95864
12/20/2021, 5:19 PM
E1220 16:10:36.391892 1 workers.go:102] error syncing 'mandant1-development/f3359d6b5cd941830000': Workflow[] failed. ErrorRecordingError: failed to publish event, caused by: EventSinkError: Error sending event, caused by [rpc error: code = Unauthenticated desc = token parse error [JWT_VERIFICATION_FAILED] Could not retrieve id token from metadata, caused by: rpc error: code = Unauthenticated desc = Request unauthenticated with IDToken]
This leaves the system in a weird state because there is no hard failure: flytepropeller is still running, so our monitoring does not raise an alarm, but no workflow executions happen. After a flytepropeller restart, it suddenly starts working again.
So after some digging I found out why this happens in our setup: flyte-secret-auth is populated with .Values.secrets.adminOauthClientCredentials.clientSecret during installation, which is set to the placeholder foobar in values.yaml. We set the secret value dynamically, though, with a helm hook during installation, because we need to fetch the real client secret from Keycloak. That happens only after flytepropeller is deployed, and it seems that flytepropeller does not reload the secret on changes. Since flyte-secret-auth is managed by helm, this happens again on every helm upgrade.
I see mainly two (non-exclusive) ways to improve this behavior:
• Remove the default clientSecret and only create flyte-secret-auth via helm if the value is actually set. Only mount flyte-secret-auth if external auth is enabled. That would cause flytepropeller to fail to start until flyte-secret-auth is created by other means.
• Trigger a flytepropeller reload when flyte-secret-auth changes.
Any thoughts on this? Happy to contribute here, but I'd like to discuss the best way forward with you first.
high-park-82026
boundless-pizza-95864
12/22/2021, 3:26 PM
enabled option for adminOauthClientCredentials:
https://github.com/flyteorg/flyte/pull/1976
boundless-pizza-95864
12/22/2021, 3:49 PM