#ask-the-community

Fabio Grätz

10/19/2022, 8:57 AM
Hey everyone 🙂 I'm currently deploying a flyte sandbox at my new company and am struggling with getting auth to work. I followed this guide. Flyte is deployed, I am redirected to the SSO login screen, and after that I can view the flyte console 👍 I can also run `flytectl get projects`, am redirected to the browser, and then retrieve the projects. `pyflyte register ...` works as well. So far so good. However, when I manually launch a registered workflow from the console, it doesn't start.
`k -n development describe flyteworkflows.flyte.lyft.com ah9ghn272q7v22dbsvvd` gives:
Status:
  Failed Attempts:  9
  Message:          Workflow[] failed. ErrorRecordingError: failed to publish event, caused by: EventSinkError: Error sending event, caused by [rpc error: code = Unavailable desc = connection error: desc = "transport: authentication handshake failed: tls: first record does not look like a TLS handshake"]
  Phase:            0
Tasks:
  resource_type:TASK project:"3dod" domain:"development" name:"project.workflows.workflow.deploy_model" version:"2"
Details about config in 🧵 A pointer in the right direction would be much appreciated.
I see similar logs in flytepropeller:
E1019 08:50:19.766619       1 workers.go:102] error syncing 'development/arzsbvcrdhd8jdbp2vdd': Workflow[] failed. ErrorRecordingError: failed to publish event, caused by: EventSinkError: Error sending event, caused by [rpc error: code = Unavailable desc = connection error: desc = "transport: authentication handshake failed: tls: first record does not look like a TLS handshake"]
E1019 08:50:19.776908       1 workers.go:102] error syncing 'development/aprk6hb8gxdcj882njpf': Workflow[] failed. ErrorRecordingError: failed to publish event, caused by: EventSinkError: Error sending event, caused by [rpc error: code = Unavailable desc = connection error: desc = "transport: authentication handshake failed: tls: first record does not look like a TLS handshake"]
{"json":{"exec_id":"ah9ghn272q7v22dbsvvd","ns":"development","res_ver":"1249185","routine":"worker-37","wf":"3dod:development:project.workflows.workflow.pipeline"},"level":"warning","msg":"Event recording failed. Error [EventSinkError: Error sending event, caused by [rpc error: code = Unavailable desc = connection error: desc = \"transport: authentication handshake failed: tls: first record does not look like a TLS handshake\"]]","ts":"2022-10-19T08:50:49Z"}
This is the flyte admin base configmap:
apiVersion: v1
data:
  cluster_resources.yaml: |
    cluster_resources:
      customData:
      - production:
        - projectQuotaCpu:
            value: "5"
        - projectQuotaMemory:
            value: 4000Mi
        - gsa:
            value: gsa-production@<project>.iam.gserviceaccount.com
      - staging:
        - projectQuotaCpu:
            value: "2"
        - projectQuotaMemory:
            value: 3000Mi
        - gsa:
            value: gsa-staging@<project>.iam.gserviceaccount.com
      - development:
        - projectQuotaCpu:
            value: "2"
        - projectQuotaMemory:
            value: 3000Mi
        - gsa:
            value: gsa-development@<project>.iam.gserviceaccount.com
      refreshInterval: 5m
      standaloneDeployment: false
      templatePath: /etc/flyte/clusterresource/templates
  db.yaml: |
    database:
      dbname: flyteadmin
      host: '10.22.0.3'
      passwordPath: /etc/db/pass.txt
      port: 5432
      username: flyteadmin
  domain.yaml: |
    domains:
    - id: development
      name: development
    - id: staging
      name: staging
    - id: production
      name: production
  namespace_config.yaml: |
    namespace_mapping:
      template: '{{ domain }}'
  remoteData.yaml: |
    remoteData:
      scheme: gcs
      signedUrls:
        durationMinutes: 3
  server.yaml: |
    auth:
      appAuth:
        thirdPartyConfig:
          flyteClient:
            clientId: flytectl
            redirectUri: http://localhost:53593/callback
            scopes:
            - offline
            - all
      authorizedUris:
      - https://flyte.company.com
      - https://localhost:30081
      - http://flyteadmin:80
      - http://flyteadmin.flyte.svc.cluster.local:80
      userAuth:
        openId:
          baseUrl: https://accounts.google.com
          clientId: <client id>
          scopes:
          - profile
          - openid
    flyteadmin:
      eventVersion: 2
      metadataStoragePrefix:
      - metadata
      - admin
      metricsScope: 'flyte:'
      profilerPort: 10254
      roleNameKey: iam.amazonaws.com/role
      testing:
        host: http://flyteadmin
    server:
      grpcPort: 8089
      httpPort: 8088
      security:
        allowCors: true
        allowedHeaders:
        - Content-Type
        allowedOrigins:
        - '*'
        secure: false
        useAuth: true
  storage.yaml: |
    storage:
      type: stow
      stow:
        kind: google
        config:
          json: ""
          project_id: <project>
          scopes: https://www.googleapis.com/auth/devstorage.read_write
      container: "company-flyte-bucket"
      enable-multicontainer: false
      limits:
        maxDownloadMBs: 10
  task_resource_defaults.yaml: |
    task_resources:
      defaults:
        cpu: 500m
        memory: 500Mi
        storage: 500Mi
      limits:
        cpu: 2
        gpu: 1
        memory: 1Gi
        storage: 2000Mi
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: flyte
    meta.helm.sh/release-namespace: flyte
  creationTimestamp: "2022-10-18T18:09:45Z"
  labels:
    app.kubernetes.io/instance: flyte
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: flyteadmin
    helm.sh/chart: flyte-core-v1.2.0
  name: flyte-admin-base-config
  namespace: flyte
  resourceVersion: "780983"
  uid: b5da3e3e-35d2-406b-953e-ad568179a646

Sören Brunk

10/19/2022, 9:55 AM
It looks like flytepropeller auth against flyteadmin fails. Do you use the embedded auth server?

Fabio Grätz

10/19/2022, 9:55 AM
Yes that is the plan. Is it active by default?
apiVersion: v1
data:
  admin.yaml: |
    admin:
      clientId: 'flytepropeller'
      clientSecretLocation: /etc/secrets/client_secret
      endpoint: flyteadmin:81
      insecure: false
    event:
      capacity: 1000
      rate: 500
      type: admin
  cache.yaml: |
    cache:
      max_size_mbs: 1024
      target_gc_percent: 70
  catalog.yaml: |
    catalog-cache:
      endpoint: datacatalog:89
      insecure: true
      type: datacatalog
  copilot.yaml: |
    plugins:
      k8s:
        co-pilot:
          image: cr.flyte.org/flyteorg/flytecopilot-release:v1.2.0
          name: flyte-copilot-
          start-timeout: 30s
  core.yaml: |
    manager:
      pod-application: flytepropeller
      pod-template-container-name: flytepropeller
      pod-template-name: flytepropeller-template
    propeller:
      downstream-eval-duration: 30s
      enable-admin-launcher: true
      gc-interval: 12h
      kube-client-config:
        burst: 25
        qps: 100
        timeout: 30s
      leader-election:
        enabled: true
        lease-duration: 15s
        lock-config-map:
          name: propeller-leader
          namespace: flyte
        renew-deadline: 10s
        retry-period: 2s
      limit-namespace: all
      max-workflow-retries: 50
      metadata-prefix: metadata/propeller
      metrics-prefix: flyte
      prof-port: 10254
      queue:
        batch-size: -1
        batching-interval: 2s
        queue:
          base-delay: 5s
          capacity: 1000
          max-delay: 120s
          rate: 100
          type: maxof
        sub-queue:
          capacity: 1000
          rate: 100
          type: bucket
        type: batch
      rawoutput-prefix: gs://company-flyte-bucket/
      workers: 40
      workflow-reeval-duration: 30s
    webhook:
      certDir: /etc/webhook/certs
      serviceName: flyte-pod-webhook
  enabled_plugins.yaml: |
    tasks:
      task-plugins:
        default-for-task-types:
          container: container
          container_array: k8s-array
          sidecar: sidecar
        enabled-plugins:
        - container
        - sidecar
        - k8s-array
  k8s.yaml: |
    plugins:
      k8s:
        default-cpus: 100m
        default-env-vars: []
        default-memory: 100Mi
  resource_manager.yaml: |
    propeller:
      resourcemanager:
        type: noop
  storage.yaml: |
    storage:
      type: stow
      stow:
        kind: google
        config:
          json: ""
          project_id: <project>
          scopes: https://www.googleapis.com/auth/devstorage.read_write
      container: "company-flyte-bucket"
      enable-multicontainer: false
      limits:
        maxDownloadMBs: 10
  task_logs.yaml: |
    plugins:
      k8s-array:
        logs:
          config:
            stackdriver-enabled: true
            stackdriver-logresourcename: k8s_container
      logs:
        cloudwatch-enabled: false
        kubernetes-enabled: false
        stackdriver-enabled: true
        stackdriver-logresourcename: k8s_container
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: flyte
    meta.helm.sh/release-namespace: flyte
  creationTimestamp: "2022-10-18T18:09:47Z"
  labels:
    app.kubernetes.io/instance: flyte
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: flyteadmin
    helm.sh/chart: flyte-core-v1.2.0
  name: flyte-propeller-config
  namespace: flyte
  resourceVersion: "1245505"
  uid: 4dd30035-f278-452c-8910-cc9bd4878c6b
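One thing that stands out when comparing the two ConfigMaps: flyteadmin is configured with `server.security.secure: false`, while flytepropeller connects with `admin.insecure: false`. The Go error "first record does not look like a TLS handshake" is what a TLS client typically reports when the other side answers in plaintext. The thread never confirms whether this mismatch was the actual cause here (an ingress or service mesh could also terminate TLS in between), so treat the following as a rough sketch of the check, with a hypothetical helper name:

```python
def tls_settings_mismatch(admin_server_secure: bool, propeller_admin_insecure: bool) -> bool:
    """Return True when client and server disagree about using TLS.

    admin_server_secure      -> flyteadmin `server.security.secure`
    propeller_admin_insecure -> flytepropeller `admin.insecure`
    Hypothetical helper for illustration; not part of Flyte.
    """
    client_uses_tls = not propeller_admin_insecure
    return client_uses_tls != admin_server_secure


# Values as posted in the two ConfigMaps above.
print(tls_settings_mismatch(admin_server_secure=False, propeller_admin_insecure=False))  # True
```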

Sören Brunk

10/19/2022, 10:02 AM
In previous versions, it had a default secret set, which caused a security issue because essentially anyone could connect with the `flytepropeller` client id using that default secret. I think that was fixed, so you might have to set that explicitly now.
And yes, I think it is active by default if you set `useAuth: true` and don't explicitly configure an external auth server.

Fabio Grätz

10/19/2022, 10:08 AM
In the GCP values I took from the docs and slightly adapted with my values, this section is missing. Do you know whether it is related to the default secret you mentioned?

Sören Brunk

10/19/2022, 10:10 AM
Yes exactly, that's the one.
But if you change that here, the change only applies to flytepropeller.

Fabio Grätz

10/19/2022, 10:12 AM
I need to ask another dumb question, sorry ^^ Do I put a random made-up secret there that both admin and propeller will then know, or is this the same client id and secret from my oauth provider that is also used to configure the login screen?

Sören Brunk

10/19/2022, 10:12 AM
Flyteadmin does not use that config and still expects whatever is the default in its config (used to be `foobar` as well, hence the security issue).
So in my case, I had to override that part of the flyteadmin config and that is not really documented yet unfortunately, let me see if I can find it
Regarding your question: this is different from the client id you configure in Google.
So you should put a random secret in there. It is a shared secret for the internal oauth flow which is different than the browser login via google.
Google login does not support an oauth client credentials flow that's why we can't use that here for the internal flytepropeller auth...
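A minimal sketch of generating such a random shared secret, assuming Python's standard `secrets` module is acceptable for the job. The function name `generate_client_secret` and the 32-byte default are illustrative choices, not something the thread or the Flyte docs prescribe:

```python
import secrets


def generate_client_secret(num_bytes: int = 32) -> str:
    """Return a URL-safe random string to use as the internal client secret.

    Hypothetical helper: the name and default length are illustrative only.
    """
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # This plaintext value is what flytepropeller presents to flyteadmin.
    print(generate_client_secret())
```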

Fabio Grätz

10/19/2022, 10:21 AM
Ok got it, will add the following to the helm values:
secrets:
  adminOauthClientCredentials:
    enabled: true
    clientSecret: <random string>
    clientId: flytepropeller
So this affects the communication between propeller and admin right? Do you know how auth works between the other flyte services? Console and admin work via the Google login I understand.

Sören Brunk

10/19/2022, 10:27 AM
Right. I'm not sure how e.g. flyteadmin talks to datacatalog, but (given working DB auth) the communication between propeller and admin was the only thing that ever caused issues for us.

Fabio Grätz

10/19/2022, 10:29 AM
I added the secret section to the helm values but still get:
Status:
  Failed Attempts:  3
  Message:          Workflow[] failed. ErrorRecordingError: failed to publish event, caused by: EventSinkError: Error sending event, caused by [rpc error: code = Unauthenticated desc = transport: per-RPC creds failed due to error: oauth2: cannot fetch token: 401 Unauthorized
Response: {"error":"invalid_client","error_description":"Client authentication failed (e.g., unknown client, no client authentication included, or unsupported authentication method)."}]
  Phase:  0
Tasks:
But I also didn't actively do anything to make admin aware of this secret.
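For context, the `invalid_client` response belongs to the token request that the client makes before it can send events. As a rough illustration of what an OAuth2 client credentials request looks like (RFC 6749, section 4.4, with HTTP Basic client authentication as one of the standard options), the sketch below builds such a request without sending it. The token URL used in the example call is an assumption and the helper name is hypothetical; check your own deployment for the real endpoint:

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(token_url: str, client_id: str, client_secret: str,
                        scopes: list[str]) -> urllib.request.Request:
    """Build (without sending) a client-credentials token request."""
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "scope": " ".join(scopes),
    }).encode("ascii")
    # HTTP Basic client authentication: base64("client_id:client_secret").
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        token_url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


# Assumed endpoint, for illustration only.
request = build_token_request(
    "http://flyteadmin:80/oauth2/token", "flytepropeller", "foobar", ["all", "offline"]
)
print(request.get_method(), request.full_url)
```

The server answers 401 `invalid_client` when the id is unknown or the presented secret does not match what it has stored for that client.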

Sören Brunk

10/19/2022, 10:29 AM
I found the relevant section to overwrite the default flyteadmin auth server config via helm:
configmap:
    adminServer:
        server:
            httpPort: 8088
            grpcPort: 8089
            dataProxy:
                upload:
                    storagePrefix: upload
            security:
                secure: false
                useAuth: true
                allowCors: true
                allowedOrigins:
                    # Accepting all domains for Sandbox installation
                    - '*'
                allowedHeaders:
                    - Content-Type
        auth:
            appAuth:
                authServerType: Self
                selfAuthServer:
                    accessTokenLifespan: 30m0s
                    authorizationCodeLifespan: 5m0s
                    claimSymmetricEncryptionKeySecretName: claim_symmetric_key
                    issuer: ""
                    oldTokenSigningRSAKeySecretName: token_rsa_key_old.pem
                    refreshTokenLifespan: 1h0m0s
                    staticClients:
                        flyte-cli:
                            audience: null
                            grant_types:
                                - refresh_token
                                - authorization_code
                            id: flyte-cli
                            public: true
                            redirect_uris:
                                - http://localhost:53593/callback
                                - http://localhost:12345/callback
                            response_types:
                                - code
                                - token
                            scopes:
                                - all
                                - offline
                                - access_token
                        flytectl:
                            audience: null
                            grant_types:
                                - refresh_token
                                - authorization_code
                            id: flytectl
                            public: true
                            redirect_uris:
                                - http://localhost:53593/callback
                                - http://localhost:12345/callback
                            response_types:
                                - code
                                - token
                            scopes:
                                - all
                                - offline
                                - access_token
                        flytepropeller:
                            audience: null
                            client_secret: <your client secret hashed and base64 encoded>
                            grant_types:
                                - refresh_token
                                - client_credentials
                            id: flytepropeller
                            public: false
                            redirect_uris:
                                - http://localhost:3846/callback
                            response_types:
                                - token
                            scopes:
                                - all
                                - offline
                                - access_token
`flytepropeller.client_secret` is the relevant change. Unfortunately, it requires the secret as a base64-encoded hash. I'll try to find how we did that hashing.

Fabio Grätz

10/19/2022, 11:02 AM
So I ran `echo -n "clientSecret" | base64`, which is what I would typically do for a k8s secret.
Still doesn't work 😅 Flytescheduler is failing now:
`k -n flyte logs -f flytescheduler-6cbfb47f9f-h76q4 -c flytescheduler-check`
--storage.limits.maxDownloadMBs int                                          Maximum allowed download size (in MBs) per call. (default 2)
      --storage.stow.config stringToString                                         Configuration for stow backend. Refer to github/flyteorg/stow (default [])
      --storage.stow.kind string                                                   Kind of Stow backend to use. Refer to github/flyteorg/stow
      --storage.type string                                                        Sets the type of storage to configure [s3/minio/local/mem/stow]. (default "s3")

panic: rpc error: code = Unauthenticated desc = transport: per-RPC creds failed due to error: oauth2: cannot fetch token: 401 Unauthorized
Response: {"error":"invalid_client","error_description":"Client authentication failed (e.g., unknown client, no client authentication included, or unsupported authentication method)."}

goroutine 1 [running]:
main.main()
	/go/src/github.com/flyteorg/flyteadmin/cmd/scheduler/main.go:12 +0x85
And my workflow still has the status:
Status:
  Failed Attempts:  4
  Message:          Workflow[] failed. ErrorRecordingError: failed to publish event, caused by: EventSinkError: Error sending event, caused by [rpc error: code = Unauthenticated desc = transport: per-RPC creds failed due to error: oauth2: cannot fetch token: 401 Unauthorized
Response: {"error":"invalid_client","error_description":"Client authentication failed (e.g., unknown client, no client authentication included, or unsupported authentication method)."}]
  Phase:  0

Sören Brunk

10/19/2022, 12:09 PM
Yeah it also needs to be hashed in a specific way. I'll try to find how we did it.

Ketan (kumare3)

10/19/2022, 1:28 PM
This should be documented. I think it is. Cc @Yee / @Eduardo Apolinario (eapolinario) can you point out how to create the client secret for Google - https://docs.flyte.org/en/latest/deployment/cluster_config/auth_setup.html

Fabio Grätz

10/19/2022, 3:26 PM
@Eduardo Apolinario (eapolinario) I think the Google IdP part is working correctly. I can log into the console with SSO and can also register workflows with flytectl or pyflyte. I think the problem happens with the communication between admin, propeller, and scheduler when I want to run a workflow. My understanding is that there is a 2nd internal authorization server for this which is not related to google at all, right?

Sören Brunk

10/19/2022, 4:11 PM
@Fabio Grätz I think I had to run it through bcrypt:
pip install bcrypt
python
>>> import bcrypt
>>> bcrypt.hashpw(b"foobar", bcrypt.gensalt(prefix=b"2a"))
The resulting hash should look something like this:
b'$2a$12$d3mGDJwq9F5TiQA1YYm0TOVzvEvcBX5VEw2AW0gqrn7Mvh2InuiCS'
then base64 encode it and use that as client_secret in the config.
I hope I remember it correctly, I should have documented the steps properly.
Perhaps someone from the flyte team familiar with the internal auth server can verify if this is the right way or not.
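To make the two-step encoding concrete, here is a sketch that uses only the standard library. It takes the example bcrypt hash from above, base64-encodes it for the `client_secret` field, and checks whether a candidate config value decodes to something bcrypt-shaped. The helper names and the regular expression are illustrative assumptions; the hashing itself would still be done with the `bcrypt` package as shown above. It also shows why base64-encoding the plaintext alone does not produce a value of the expected shape:

```python
import base64
import binascii
import re

# Shape of a bcrypt hash string: "$2a$", "$2b$" or "$2y$", a two-digit cost,
# then 53 characters of salt and digest in bcrypt's own base64 alphabet.
BCRYPT_PATTERN = re.compile(r"^\$2[aby]\$\d{2}\$[./A-Za-z0-9]{53}$")


def encode_hash_for_config(bcrypt_hash: bytes) -> str:
    """Base64-encode a bcrypt hash (bytes, as returned by bcrypt.hashpw)."""
    return base64.b64encode(bcrypt_hash).decode("ascii")


def looks_like_encoded_bcrypt(config_value: str) -> bool:
    """Return True if config_value is base64 of a bcrypt-shaped string."""
    try:
        decoded = base64.b64decode(config_value, validate=True).decode("ascii")
    except (binascii.Error, UnicodeDecodeError, ValueError):
        return False
    return bool(BCRYPT_PATTERN.match(decoded))


example_hash = b"$2a$12$d3mGDJwq9F5TiQA1YYm0TOVzvEvcBX5VEw2AW0gqrn7Mvh2InuiCS"
encoded = encode_hash_for_config(example_hash)
plain_only = base64.b64encode(b"clientSecret").decode("ascii")

print(looks_like_encoded_bcrypt(encoded))     # True: hashed first, then base64
print(looks_like_encoded_bcrypt(plain_only))  # False: base64 of the plaintext only
```

Note that this only checks the shape of the value. It does not verify that the hash matches a given plaintext secret.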
Here we still need the matching secret in cleartext, i.e.
secrets:
  adminOauthClientCredentials:
    enabled: true
    clientSecret: foobar
    clientId: flytepropeller

Fabio Grätz

10/19/2022, 4:17 PM
b'$2a$12$d3mGDJwq9F5TiQA1YYm0TOVzvEvcBX5VEw2AW0gqrn7Mvh2InuiCS'
Excluding the `b'...'` I suppose?
```
secrets:
  adminOauthClientCredentials:
    enabled: true
    clientSecret: foobar
    clientId: flytepropeller
```
Yes, I have this 🙂

Sören Brunk

10/19/2022, 4:18 PM
Yes sorry that's from python

Fabio Grätz

10/19/2022, 4:18 PM
Thanks a ton for looking this up @Sören Brunk!!

Yee

10/19/2022, 4:18 PM
python -c 'import bcrypt; import base64; print(base64.b64encode(bcrypt.hashpw("mypassword".encode("utf-8"), bcrypt.gensalt(6))).decode("utf-8"))'

Fabio Grätz

10/19/2022, 4:23 PM
OMG it's working 🥳
Thanks a lot guys for the help, I would have never figured this one out xD
@Yee Are you interested in a PR to the docs where I add this?

Sören Brunk

10/19/2022, 4:24 PM
Awesome!

Fabio Grätz

10/19/2022, 4:38 PM
I have one more question @Sören Brunk @Yee: My understanding is that the ingress exposes routes to flyteconsole and flyteadmin, so they are publicly reachable from the internet. To secure those, we integrate e.g. google oauth. Isn't the communication between admin, scheduler, and propeller purely internal to the cluster? So in theory, would it be fair to say that the authorization between those and admin using the built-in mechanism is less important than the first one for admin and console? Or are scheduler, propeller, ... available from the internet as well through a mechanism I'm currently missing?

Yee

10/19/2022, 4:38 PM
sorry afk
@Fabio Grätz yes!
to the PR i mean
this was ours
appAuth:
  thirdPartyConfig:
    flyteClient:
      clientId: flytectl
      redirectUri: http://localhost:53593/callback
      scopes:
      - offline
      - all
  selfAuthServer:
    staticClients:
      flyte-cli: 
        id: "flyte-cli"
        redirect_uris:
          - "http://localhost:53593/callback"
          - "http://localhost:12345/callback"
        grant_types:
          - refresh_token
          - authorization_code
        response_types:
          - code
          - token
        scopes:
          - all
          - offline
          - access_token 
        public: true
      flytectl: 
        id: flytectl
        redirect_uris:
          - "http://localhost:53593/callback"
          - "http://localhost:12345/callback"
        grant_types:
          - refresh_token
          - authorization_code
        response_types:
          - code
          - token
        scopes:
          - all
          - offline
          - access_token 
        public: true
      flytepropeller: 
        id: flytepropeller
        client_secret: JDJhJDA2JGd3N0pNUno1OXpCSzFk43DJkYlUxTHV2MGxRMlFWHNlTkczcElyU3V1TzhZai95ODJsQ2dh
        redirect_uris:
          - "http://localhost:3846/callback"
        grant_types:
          - refresh_token
          - client_credentials
        response_types:
          - token
        scopes:
          - all
          - offline
          - access_token
and your understanding is correct, that traffic is internal to your aws account but it doesn’t have to be internal to the cluster.
you can run admin/propeller on different clusters. at lyft we ran admin on the company wide services cluster and ran a bunch of compute clusters with propeller running in each
but there’s no simple/secure way to say oh this is internal traffic, it doesn’t need auth.
at least not that i’m aware of.

Fabio Grätz

10/19/2022, 4:44 PM
Istio mTLS, to at least not have to manage it oneself ^^
But yes, fair point. I just wanted to make sure my understanding is correct that this 2nd auth mechanism isn't for any traffic on the public internet.
If it is traffic within our company's VPC I'm a little bit less worried about it ^^

Yee

10/19/2022, 4:45 PM
propeller/scheduler? no
that’s all internal traffic that is correct

Fabio Grätz

10/19/2022, 4:45 PM
So only admin to propeller traffic is traffic that could span multiple clusters?

Yee

10/19/2022, 4:48 PM
mmm.
admin doesn’t talk to propeller directly
in the multi-cluster setup, admin talks to the kube api running on the cluster
and propeller sends event information back up to admin

Fabio Grätz

10/19/2022, 4:49 PM
Ah and this secret we encoded with `base64.b64encode(bcrypt.hashpw...` is used by propeller to authorize itself to admin?

Yee

10/19/2022, 4:50 PM
propeller/admin talks to data catalog.
yes
that is correct.

Fabio Grätz

10/19/2022, 4:50 PM
Cool
Thanks a lot

Yee

10/19/2022, 4:50 PM
this is for google idp. for people using okta this can be different

Fabio Grätz

10/19/2022, 4:50 PM
Wait, then I still didn't understand it ^^
Isn't the authorization between propeller and admin unrelated to google?

Yee

10/19/2022, 4:51 PM
it is… but it can be related to okta

Fabio Grätz

10/19/2022, 4:51 PM
Ok
So for google we use the built in authorization mechanism between propeller and admin and the google idp one for accessing console and admin from the internet.

Sören Brunk

10/19/2022, 4:52 PM
It's needed because google IDP does not support oauth client credentials flow

Yee

10/19/2022, 4:52 PM
(just thought I’d mention in case you write docs and it becomes relevant. google idp does not come with an authorization server)

Fabio Grätz

10/19/2022, 4:53 PM
Yes, I'll write docs and describe the things I needed to figure out in order to make it work after going through the docs. Will tag you to correct it then ^^

Ketan (kumare3)

10/19/2022, 4:57 PM
❤️ @Fabio Grätz

Haytham Abuelfutuh

10/19/2022, 11:14 PM
Yes @Fabio Grätz your understanding is spot on! FYI Admin will need to change a bit to accommodate non-authenticated internal requests though as it currently enforces auth on all requests or no auth at all..

Fabio Grätz

10/20/2022, 4:47 PM
@Haytham Abuelfutuh @Ketan (kumare3) Due to security policies in my new company, I unfortunately won't be able to deploy flyte the way I now got it to work following the GCP + Auth guides, where Flyte serves the google login page itself 😕 Instead of the nginx ingress I will have to use the GKE native ingress controller and configure Google Identity Aware Proxy at the load balancer level so that one first hits the google login screen, then the app behind it. The flyte GKE docs mention google managed certificates, and by modifying the ingress annotations in the helm values I think I can get this to work. I already tried this yesterday, but for the GKE Ingress the flyteadmin backend was always unhealthy so the ingress never came up. I think I might have to adapt the liveness probe of the deployment, but I think I can also get that to work... My question now is: Assuming that when I go to https://flyte.my-company.com and first hit Google Identity Aware Proxy, log in using my company SSO, and then reach the flyte UI: 1. Will flyte be aware who I am from Google IdP? 2. And more importantly, will the authentication mechanism of flytectl/pyflyte with the callback still work? :S I really hope the last point is not a blocker 🙈

Andrew Korzhuev

10/24/2022, 12:52 PM
@Sören Brunk I was having the same issue, which this thread helped to resolve. Shouldn't the clientSecret in `selfAuthServer.staticClients` be passed through automatically, since it is available in plaintext in `adminOauthClientCredentials` anyway?

Ketan (kumare3)

10/24/2022, 1:21 PM
@Fabio Grätz sorry for the delayed response- I think @Nicholas LoFaso and team use Google IDP?

Fabio Grätz

10/24/2022, 4:05 PM
No prob 👍 For now I did the following, which allows me to continue with the PoC:
• Use GKE ingress with identity aware proxy at load balancer level to comply with internal security guidelines. Since the GKE ingress can't handle flyteadmin's `exec.command` liveness probe and the ingress never comes up, I route all traffic from the ingress into an istio service mesh and via that to flyteconsole/flyteadmin.
• Make flyteadmin's gRPC endpoint available via an internal load balancer only in the VPC. So engineers can register tasks, ... only from their remote development machines or via CICD workers in the VPC. But that is not too much of a limitation for us since we use remote dev machines all the time...

Justin Tyberg

10/24/2022, 5:18 PM
@Fabio Grätz Glad you got things working. I work with Nick, and we have Flyte running on GCP as well. This thread sums up pretty much our path with Flyte+GCP auth. FWIW, we’re still investigating ways to streamline our auth setup.
Are you using Istio on GKE? I ask because Google is deprecating that too, in favor of Anthos.

Fabio Grätz

10/25/2022, 8:05 AM
Hey @Justin Tyberg, I don't use the Istio that comes with GKE when you tick the respective box but install the OSS version of istio with helm... Had good experience with this. Would you be interested in a short call? Maybe there is knowledge we can share to streamline our setups

Justin Tyberg

10/26/2022, 11:00 AM
@Fabio Grätz definitely interested in syncing up and comparing notes. maybe something next week?

Fabio Grätz

10/27/2022, 7:51 AM
@Justin Tyberg yes, next week sounds great. Which time zone are you in?

Justin Tyberg

10/27/2022, 2:37 PM
hey @Fabio Grätz. I’m in EST, which is why there’s a lag in my replies to your posts 😉

David Espejo

11/08/2022, 2:24 AM
So, is it still true that for auth to work with Google login in a sandbox environment, an encoded random secret has to be added to config so `flytepropeller` can perform auth against the K8s API instance, and that this requirement is not documented yet?

Ketan (kumare3)

11/08/2022, 4:53 AM
that is correct, not sure if it is not documented. I thought it was documented?

Haytham Abuelfutuh

11/08/2022, 4:56 AM
This is true (Unfortunately): Ref for why it’s removed (urgently) https://github.com/flyteorg/flyteadmin/security/advisories/GHSA-67x4-qr35-qvrm @Yee can you share instructions for how to do that?
Oh I see Yee has shared the instructions in this thread

Ketan (kumare3)

11/08/2022, 4:56 AM
@David Espejo would you be open to contributing a doc addition?

David Espejo

11/08/2022, 12:48 PM
Thank you @Haytham Abuelfutuh! Yes @Ketan (kumare3) I can give it a try 🙂
The PR is here for your review 🙂 https://github.com/flyteorg/flyte/pull/3062

Fabio Grätz

02/20/2023, 5:19 PM
@Ena Škopelja this thread

Ketan (kumare3)

02/20/2023, 5:25 PM
Also cc @David Espejo (he/him) maybe we should merge this