# flyte-deployment
a
Hello, I'm stuck running flyte-binary with Azure AD OIDC authentication. As soon as I enable auth, the pod fails with this error:
{"json":{"src":"service.go:326"},"level":"error","msg":"Error creating auth context [AUTH_CONTEXT_SETUP_FAILED] Error creating oidc provider w/ issuer [https://login.microsoftonline.com/my-tenant-id/oauth2/v2.0/authorize], caused by: 404 Not Found: ","ts":"2023-11-04T115005Z"}
2023-11-04T115005.283020253Z {"json":
Something is clearly missing. I carefully followed the documentation and read about issues with go-oidc, but I can't find any hint on how to fix the problem on the Flyte side. Is something missing that the documentation doesn't mention? Or is there something to change in the application configuration in Azure AD? Thanks in advance for your help.
s
cc @David Espejo (he/him)
j
What does your `oidc.baseUrl` look like? Mine works fine using the flyte core chart: “https://login.microsoftonline.com/<tenant_id>/v2.0”
a
There is '/authorize' after 'v2.0'... You mean I should remove everything after v2.0 in the baseUrl?
j
Yes - give it a try!
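A minimal sketch of how the corrected setting might look, assuming the `auth.oidc` layout quoted later in this thread (the parent key in the values file is omitted, and `<tenant-id>` is a placeholder). The go-oidc client appends `/.well-known/openid-configuration` to the issuer URL to discover the endpoints, which would explain the 404 when the value ends in `/oauth2/v2.0/authorize`.

```yaml
auth:
  oidc:
    # Bare issuer only: no /oauth2 segment and no /authorize suffix.
    # Discovery is then attempted at
    # https://login.microsoftonline.com/<tenant-id>/v2.0/.well-known/openid-configuration
    baseUrl: https://login.microsoftonline.com/<tenant-id>/v2.0
```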
a
Of course, thanks !
j
Let me know if it worked 🙂
a
Yes, Flyte starts OK now, but there is still an issue with the redirect: the ingress returns a 302 with the callback URL when I try to reach the console.
j
Nice! One step further 👍
Did you set the Redirect URL of the Azure app registration to `https://<your deployment url>/callback`? Did you set `authorizedUris` to `https://<your deployment url>`?
a
Normally yes, I'll check with the admin, and yes for authorizedUris!
Yes for the app registration config! Wondering where the issue comes from ;O)
j
What do you use for ingress?
a
I think the ingress is OK, it's nginx, but I forgot some auth configuration in the inline part of the values file... I'll try adding the missing part to the configuration.
It didn't fix the problem, but I found errors in the Flyte logs:
{"json":{"src":"cookie.go:73","x-request-id":"b5464e761df7a54ae9cede42406760ca"},"level":"info","msg":"Could not detect existing cookie [flyte_idt]. Error: http: named cookie not present","ts":"2023-11-06T17:29:52Z"}
2023-11-06T17:29:52.707198507Z {"json":{"src":"handlers.go:281"},"level":"info","msg":"Failed to parse Access Token from context. Will attempt to find IDToken. Error: [JWT_VERIFICATION_FAILED] Could not retrieve bearer token from metadata, caused by: rpc error: code = Unauthenticated desc = Request unauthenticated with Bearer","ts":"2023-11-06T17:29:52Z"}
Mon, Nov 6 2023 6:29:52 pm
{"json":{"src":"cookie.go:73","x-request-id":"38885e1e27d3e429e3b3c52313f3b7ea"},"level":"info","msg":"Could not detect existing cookie [flyte_idt]. Error: http: named cookie not present","ts":"2023-11-06T17:29:52Z"}
2023-11-06T17:29:52.817706244Z {"json":{"src":"cookie.go:73","x-request-id":"81756e5b47421ce83712cb1ea6d2376f"},"level":"info","msg":"Could not detect existing cookie [flyte_idt]. Error: http: named cookie not present","ts":"2023-11-06T17:29:52Z"}
2023-11-06T17:29:52.817714006Z {"json":{"src":"handlers.go:281"},"level":"info","msg":"Failed to parse Access Token from context. Will attempt to find IDToken. Error: [JWT_VERIFICATION_FAILED] Could not retrieve bearer token from metadata, caused by: rpc error: code = Unauthenticated desc = Request unauthenticated with Bearer","ts":"2023-11-06T17:29:52Z"}
2023-11-06T17:29:52.817719127Z {"json":{"src":"handlers.go:281"},"level":"info","msg":"Failed to parse Access Token from context. Will attempt to find IDToken. Error: [JWT_VERIFICATION_FAILED] Could not retrieve bearer token from metadata, caused by: rpc error: code = Unauthenticated desc = Request unauthenticated with Bearer","ts":"2023-11-06T17:29:52Z"}
Mon, Nov 6 2023 6:29:53 pm
{"json":{"src":"cookie.go:73"},"level":"info","msg":"Could not detect existing cookie [flyte_idt]. Error: http: named cookie not present","ts":"2023-11-06T17:29:53Z"}
2023-11-06T17:29:53.073640388Z {"json":{"src":"handlers.go:85"},"level":"error","msg":"Failed to retrieve tokens from request, redirecting to login handler. Error: [EMPTY_OAUTH_TOKEN] Failure to retrieve cookie [flyte_idt], caused by: http: named cookie not present","ts":"2023-11-06T17:29:53Z"}
each time I try to access flyte console...
any idea @Jan Fiedler ?
j
Haven’t seen this error myself sry.
But it looks like ingress is fine and the azure app registration gets requested
a
No problem, I'll keep digging into it, but the documentation definitely needs to be improved on the subject of AAD auth (now Entra)...
j
Absolutely! I'm curious about your findings - let me know 😉
a
Hello, I made progress by adding annotations to the ingress, such as:
nginx.ingress.kubernetes.io/affinity: "cookie"
nginx.ingress.kubernetes.io/session-cookie-name: "flyte_idt"
nginx.ingress.kubernetes.io/session-cookie-secure: "true"
Now the (or a) cookie is transmitted through the ingress, but I get different errors:
{"json":{"src":"cookie.go:104","x-request-id":"e611a90260d1a665170143e5a40e917e"},"level":"error","msg":"Error reading secure cookie flyte_idt securecookie: base64 decode failed - caused by: illegal base64 data at input byte 10","ts":"2023-11-07T133222Z"}
2023-11-07T133222.935353510Z {"json":{"src":"cookie.go:85","x-request-id":"e611a90260d1a665170143e5a40e917e"},"level":"error","msg":"Error reading existing secure cookie [flyte_idt]. Error: [SECURE_COOKIE_ERROR] Error reading secure cookie flyte_idt, caused by: securecookie: base64 decode failed - caused by: illegal base64 data at input byte 10","ts":"2023-11-07T133222Z"}
The cookie is not the original one...
j
Hm interesting. I am using alb for ingress on aws so not sure about these annotations. On what cloud are you running flyte? is alb an option for you?
d
@Alain GALDEMAS is Flyte registered on Azure using any custom claim?
a
Hi @David Espejo (he/him), there's no custom claim configured in Azure AD. The login to AAD works fine; the problem arises when the browser requests the redirect URI "<flyte-url>/callback?code=...", which receives a 502 Bad Gateway from the nginx ingress because of the cookie error!
d
@Alain GALDEMAS can you check if there's any interesting message/error emitted by the nginx controller Pod?
a
@David Espejo (he/him), Bingo : upstream sent too big header while reading response header from upstream, client: 89.3.143.206, server: flyte.dev.eridanis.fr, request: "GET /callback?...
So it's the ingress... thanks for the advice! I've changed the ingress annotations to add nginx.ingress.kubernetes.io/proxy-buffer-size: "32k" instead of the cookie settings. Now it gets one step further, but it loops, asking me to log in again and again when I access the Flyte console, even though I'm authenticated with Azure under my own account 😞 Let's have another look at the logs...
d
it loops asking to connect again and again,
so it prompts you to login to AAD?
a
yes, any api request remains unauthorized (401) !
d
I've seen users increasing buffer-size to `128k` but not sure it could change the behavior here
a
the same issue remains with cookie/jwt but for each api request :
2023-11-08 10:41:48{"json":{"src":"handlers.go:85"},"level":"error","msg":"Failed to retrieve tokens from request, redirecting to login handler. Error: [EMPTY_OAUTH_TOKEN] Error reading existing secure cookie [flyte_idt]. Error: [SECURE_COOKIE_ERROR] Error reading secure cookie flyte_idt, caused by: securecookie: base64 decode failed - caused by: illegal base64 data at input byte 10","ts":"2023-11-08T09:41:48Z"}
2023-11-08 10:41:48{"json":{"src":"cookie.go:85"},"level":"error","msg":"Error reading existing secure cookie [flyte_idt]. Error: [SECURE_COOKIE_ERROR] Error reading secure cookie flyte_idt, caused by: securecookie: base64 decode failed - caused by: illegal base64 data at input byte 10","ts":"2023-11-08T09:41:48Z"}
2023-11-08 10:41:48{"json":{"src":"cookie.go:104"},"level":"error","msg":"Error reading secure cookie flyte_idt securecookie: base64 decode failed - caused by: illegal base64 data at input byte 10","ts":"2023-11-08T09:41:48Z"}
2023-11-08 10:41:48{"json":{"src":"token.go:100"},"level":"debug","msg":"Could not retrieve id token from metadata rpc error: code = Unauthenticated desc = Request unauthenticated with IDToken","ts":"2023-11-08T09:41:48Z"}
2023-11-08 10:41:48{"json":{"src":"handlers.go:281"},"level":"info","msg":"Failed to parse Access Token from context. Will attempt to find IDToken. Error: [JWT_VERIFICATION_FAILED] Could not retrieve bearer token from metadata, caused by: rpc error: code = Unauthenticated desc = Request unauthenticated with Bearer","ts":"2023-11-08T09:41:48Z"}
2023-11-08 10:41:48{"json":{"src":"token.go:80"},"level":"debug","msg":"Could not retrieve bearer token from metadata rpc error: code = Unauthenticated desc = Request unauthenticated with Bearer","ts":"2023-11-08T09:41:48Z"}
"Error reading secure cookie flyte_idt, caused by: securecookie: base64 decode failed - caused by: illegal base64 data at input byte 10"
I removed the cookies, refreshed the page, stopped the console from loading and reloaded again, and now it stays connected??? So strange...
I tried with a private browser window, and it's OK now with these ingress annotations:
nginx.ingress.kubernetes.io/proxy-buffer-size: "132k"
nginx.ingress.kubernetes.io/session-cookie-secure: "true"
The baseUrl is now https://login.microsoftonline.com/<tenant id>/v2.0
Thanks for your help @Jan Fiedler, @David Espejo (he/him). An update to the documentation about Azure AD (now Entra) would be welcome... I'll leave it to you to take that forward.
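For reference, a sketch of how these two annotations could be laid out in the values file. The placement under `ingress.commonAnnotations` is an assumption (the message above does not say which annotation group was used); the values are the ones reported as working here.

```yaml
ingress:
  commonAnnotations:
    # Larger buffer so nginx accepts the big response headers sent on /callback
    # ("upstream sent too big header" in the controller logs).
    nginx.ingress.kubernetes.io/proxy-buffer-size: "132k"
    nginx.ingress.kubernetes.io/session-cookie-secure: "true"
```

The earlier `affinity` and `session-cookie-name: "flyte_idt"` annotations are left out on purpose, since they were replaced by the buffer-size setting.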
d
@Alain GALDEMAS thanks for confirming. So in summary, to update the docs:
1. Make sure that even the comments point to the right baseURL format for Entra (`https://login.microsoftonline.com/<tenant id>/v2.0`)
2. Add a note: in case you're using the NGINX Ingress Controller, add those two annotations
Is that correct? Something else to add?
a
Hello, for the part about accessing the Flyte console with Entra, I think it's OK.
But my nightmare is not over: I now can't manage to interact with flyte-binary using flytectl and pyflyte. Even after updating the local configuration file with flytectl config, pyflyte run and flytectl version give an error:
flytectl version
{"json":{"src":"viper.go:400"},"level":"debug","msg":"Config section [admin] updated. Firing updated event.","ts":"2023-11-09T12:48:02+01:00"}
{"json":{"src":"viper.go:398"},"level":"debug","msg":"Config section [storage] updated. No update handler registered.","ts":"2023-11-09T12:48:02+01:00"}
{"json":{"src":"viper.go:398"},"level":"debug","msg":"Config section [root] updated. No update handler registered.","ts":"2023-11-09T12:48:02+01:00"}
{"json":{"src":"viper.go:398"},"level":"debug","msg":"Config section [files] updated. No update handler registered.","ts":"2023-11-09T12:48:02+01:00"}
{"json":{"src":"viper.go:398"},"level":"debug","msg":"Config section [console] updated. No update handler registered.","ts":"2023-11-09T12:48:02+01:00"}
{"json":{"src":"client.go:63"},"level":"info","msg":"Initialized Admin client","ts":"2023-11-09T12:48:02+01:00"}
{
"App": "flytectl",
"Build": "1350bfa",
"Version": "0.8.0",
"BuildTime": "2023-11-09 12:48:02.154309 +0100 CET m=+0.027722611"
}{"json":{"src":"auth_interceptor.go:86"},"level":"debug","msg":"Request failed due to [rpc error: code = Unknown desc = unexpected HTTP status code received from server: 0 (); malformed header: missing HTTP content-type]. If it's an unauthenticated error, we will attempt to establish an authenticated context.","ts":"2023-11-09T12:48:02+01:00"}
{"json":{"src":"version.go:103"},"level":"debug","msg":"Failed to get version of control plane rpc error: code = Unknown desc = unexpected HTTP status code received from server: 0 (); malformed header: missing HTTP content-type: \n","ts":"2023-11-09T12:48:02+01:00"}
{"json":{"src":"version.go:81"},"level":"debug","msg":"rpc error: code = Unknown desc = unexpected HTTP status code received from server: 0 (); malformed header: missing HTTP content-type","ts":"2023-11-09T12:48:02+01:00"}
I'm digging into the issue and will come back. I suspect an ingress configuration problem with gRPC/HTTP2. If someone has an idea on how to fix this, you're more than welcome.
When I perform a pyflyte run, it produces the following error:
pyflyte run --remote workflows/hello-world.py hello_world_wf
Failed with Exception Code: SYSTEM:Unknown RPC Failed, with Status: StatusCode.UNKNOWN details: Received http2 header with status: 0 Debug string UNKNOWN:Error received from peer ipv4:ip:443 {grpc_message:"Received http2 header with status: 0", grpc_status:2, created_time:"2023-11-09T133037.758122+01:00"}
The nginx ingress annotation for gRPC must also be added, such as:
grpcAnnotations:
  nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
Then flytectl or pyflyte can ask for authentication. Now one issue remains: AAD returns an error:
Couldn't get access token due to error: oauth2: cannot fetch token: 401 Unauthorized Response: {"error":"invalid_client","error_description":"AADSTS7000218: The request body must contain the following parameter: 'client_assertion' or 'client_secret'
Help is still welcome...
Something is still missing in the inline configuration...
d
@Alain GALDEMAS the most recent set of annotations that's proven to work well with flyte-binary is this:
ingress:
  create: true
  ingressClassName: nginx
  host: "<your-flyte-FQDN>"
  commonAnnotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
  httpAnnotations:
    nginx.ingress.kubernetes.io/app-root: /console
  grpcAnnotations:
    nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
also, please make sure that your auth config includes what's indicated in this section: https://docs.flyte.org/en/latest/deployment/configuration/auth_setup.html#apply-oidc-configuration
a
Only `nginx.ingress.kubernetes.io/ssl-redirect: "true"` in commonAnnotations? I'm OK to add `nginx.ingress.kubernetes.io/app-root: /console` to httpAnnotations, but I can already connect to the console!
d
cool, so I guess what remains now is to find out why your client_secret is not discovered. Your auth section should look something like this:
auth:
  enabled: true
  oidc:
    # baseUrl: https://accounts.google.com # Uncomment for Google
    # baseUrl: https://<keycloak-url>/auth/realms/<keycloak-realm> # Uncomment for Keycloak and update with your installation host and realm name
    # baseUrl: https://login.microsoftonline.com/<tenant-id>/v2.0/ # Uncomment for Azure AD
    # For Okta use the Issuer URI from Okta's default auth server
    baseUrl: https://dev-<org-id>.okta.com/oauth2/default
    # Replace with the client ID and secret created for Flyte in your IdP
    clientId: <client_ID>
    clientSecret: <client_secret>
  internal:
    clientSecret: '<your-random-password>'
    # Use the output of step #2 (only the content inside of '')
    clientSecretHash: <your-hashed-password>
  authorizedUris:
  - https://<your-flyte-deployment-URL>
With the `internal.ClientSecretHash` being the output of the `bcrypt` command described in the docs. Is all of that in place?
a
yes for this part ! but I have also filled the inline part such as :
inline:
    auth:
      appAuth:
        authServerType: External
        externalAuthServer:
        # baseUrl: https://login.microsoftonline.com/<tenant-id>/oauth2/v2.0/authorize # Uncomment for Azure AD
        # with the above uri including /oauth2/v2.0/authorize flyte does not start
          baseUrl: https://login.microsoftonline.com/<tenant-id>/v2.0
          metadataUrl: .well-known/openid-configuration
          AllowedAudience:
            - api://<client-id>

        thirdPartyConfig:
          flyteClient:
            # Use the clientID generated by your IdP for the `flytectl` app registration
            clientId: <client-id>
            redirectUri: http://localhost:53593/callback
            scopes:
              - profile
              - openid
              - email
              - offline_access
      userAuth:
        openId:
        # baseUrl: https://<keycloak-url>/auth/realms/<keycloak-realm> # Uncomment for Keycloak and update with your installation host and realm name
        # baseUrl: https://login.microsoftonline.com/<tenant-id>/oauth2/v2.0/authorize # Uncomment for Azure AD, but bad idea flyte can't start !!!
        # For Okta, use the Issuer URI of the custom auth server:
          baseUrl: https://login.microsoftonline.com/<tenant-id>/v2.0
          scopes:
            - profile
            - openid
            - email
            - offline_access
        # - offline_access # Uncomment if your IdP supports issuing refresh tokens (optional)
        # Use the client ID and secret generated by your IdP for the first OIDC registration in the "Identity Management layer : OIDC" section of this guide
          clientId: <client-id>
I saw that this part affects flytectl and pyflyte: if I use non-existent scopes, AAD returns an AADSTS650053 error. But I don't understand what to add here to solve the remaining AADSTS7000218 error...
In particular, can someone explain the usage of the "appAuth" and "userAuth" parts of the "inline.auth" item in the values file? It's not so clear. To my understanding, they are for access by clients (like flytectl and pyflyte), am I right?
d
`userAuth` is where you define the config for OIDC (the Identity layer on top of the auth completed using OAuth2.0). It refers mainly to the authentication that's completed when the user provides a secret, be it using the browser or some other flow. `appAuth` controls the auth flow for clients.
Recommended resources are:
This thread: https://flyte-org.slack.com/archives/CP2HDHKE1/p1695844976239729?thread_ts=1695387759.103999&cid=CP2HDHKE1
Understanding auth in Flyte: https://docs.flyte.org/en/latest/deployment/configuration/auth_appendix.html
Hope it helps.
Also, please refrain from sending to channel, as long as it's not completely necessary (see Slack guidelines here). Thanks! 😊
a
sorry !
OK, I tried to untangle the imbroglio (in my head). How can we summarize? If AADSTS7000218 asks for a missing client_secret, how do I set it locally (on macOS) so that it is used by flytectl version and pyflyte? Is it by issuing a flytectl config command? I tried adding clientSecret (as the client_secret to include in the request) after clientId in inline.auth.userAuth, but no luck, the AADSTS7000218 error is still there...
d
So when using the External auth server, you should have 3 apps registered on AAD: `flytectl` (this one also works for `pyflyte`), `flytepropeller` and `flyteconsole`.
Credentials should go:
• `flytectl` on `inline.auth.thirdPartyConfig`
• `flytepropeller` on `auth.internal`
• `flyteconsole` on `auth.oidc` and `inline.userAuth`
Unfortunately I haven't tried this with AAD and flyte-binary, but this is the setup I've used with Okta
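A rough sketch of that mapping, reusing only keys that appear in the snippets quoted in this thread (the parent key in the values file is omitted as in those snippets). All `<...>` values are hypothetical placeholders, and this layout is untested with AAD.

```yaml
auth:
  oidc:
    # flyteconsole app registration
    clientId: <flyteconsole-client-id>
    clientSecret: <flyteconsole-client-secret>
  internal:
    # flytepropeller app registration
    clientSecret: '<flytepropeller-client-secret>'
    clientSecretHash: <bcrypt-hash-of-that-secret>
inline:
  auth:
    appAuth:
      thirdPartyConfig:
        flyteClient:
          # flytectl app registration (also used by pyflyte)
          clientId: <flytectl-client-id>
          redirectUri: http://localhost:53593/callback
    userAuth:
      openId:
        # same flyteconsole app registration as auth.oidc
        clientId: <flyteconsole-client-id>
```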
a
Hi, OK, you're telling me that we need 3 different apps registered on AAD. As far as I've read, this is not clearly explained in the documentation; anyway, I'll ask my admin to add them! But what about the configuration in AAD for `flytectl` and `flytepropeller`? Is it the same as the one described in the documentation, or is there a different `redirectUri` for each? To my understanding, the one for `flyteconsole` should only have a redirectUri to the Flyte FQDN (<my-flyte-deployment-URL>); OK for this, as it already works. The one for `flytectl` should have `redirectUri: http://localhost:53593/callback`, right? But what `redirectUri` should be set for the `flytepropeller` app?
d
As indicated in this section of the docs, the config should be:
1. Both `flyteconsole` and `flytepropeller` should be registered as Web Applications, using `https://<console-url>/callback` as callback URI
2. `flytectl` using `http://localhost:53593/callback` as redirect URI
a
Hello, thanks @David Espejo (he/him). Yes, I followed the documentation, but with only one app configured in AAD. Whatever we tried, even changing the app config in AAD or adding the secret in `inline.auth.thirdPartyConfig` and `inline.userAuth`, we are stuck with this AADSTS7000218 error... So I'll ask our AAD admin to add separate application registrations to see if that changes the AAD behaviour, and I'll come back to this thread. For the moment, as I need to move forward, I'll try to use Keycloak as the external auth server, which we also use for many projects.
d
@Alain GALDEMAS ok, sorry for the struggles. Keep us posted to see how we can help
a
Hello, I managed to get my flyte-binary secured with Keycloak. It was a bit hard to find the right configuration for Keycloak and Flyte... Once again, as for AAD, the documentation needs to be improved to include the information from the useful issue "[Docs] Additional Keycloak configuration settings", but more clearly explained. I can help if you like, as I now have a truly working example ;O)
d
@Alain GALDEMAS thanks for sharing. Would you be up to opening a PR?
a
OK, I have to clone the repo on my personal github, first, stay tuned ;O)