#ask-the-community

Ashika UMAGILIYA

09/11/2023, 1:42 AM
Hi! We are looking into Flyte for fine-tuning LLMs in our organization. I am trying to set up a simple Flyte cluster on GCP GKE following this tutorial: https://docs.flyte.org/en/latest/deployment/deployment/cloud_simple.html, but I only see a Helm values file for EKS (eks-starter.yaml). Is there a configuration for GKE?
OK, I assume I need to use https://github.com/flyteorg/flyte/blob/master/charts/flyte-binary/values.yaml and configure the relevant settings for GCP?

Ketan (kumare3)

09/11/2023, 4:44 AM
Ohh man, this is a big miss. cc @David Espejo (he/him) / @jeev: I thought we had examples of GKE now? There were a few discussions in the past: https://discuss.flyte.org/t/13166006/hi-all-working-on-setting-up-flyte-on-gke-as-part-of-a-side-
Also, see the channel #flyte-on-gcp. cc @Fabio Grätz, who is the area lead.

Ashika UMAGILIYA

09/11/2023, 4:45 AM
Thanks

Fabio Grätz

09/11/2023, 11:08 AM
I think we should bring back this GCP getting started page which is not part of the latest docs anymore.
@Ashika UMAGILIYA can you please try to follow it regardless?
I suggest that you deploy using Nginx ingress. This means that when you reach the “certificate” section, don’t use `kind: ManagedCertificate` but instead install cert-manager, which is documented right below.
To configure auth for the nginx ingress on gcp, follow this guide here.
Please let me know if you are stuck at any step, will try to help.

Ashika UMAGILIYA

09/12/2023, 5:44 AM
Thank you. I managed to get the cluster up and running

Fabio Grätz

09/12/2023, 7:29 AM
Awesome 😄

Ashika UMAGILIYA

09/12/2023, 7:45 AM
Any idea how to fix the issue with the CLI? https://flyte-org.slack.com/archives/CP2HDHKE1/p1694484098954209

Fabio Grätz

09/12/2023, 7:52 AM
Haven’t seen this exact one. Did you configure a domain and TLS cert for your ingress?

Ashika UMAGILIYA

09/12/2023, 7:54 AM
No ingress. I first wanted to try a simple setup, so I am using port-forwarding.

Fabio Grätz

09/12/2023, 7:54 AM
Ah
There is an `insecure` flag in the client config. Set it to true. Then it won’t try to establish a TLS connection.
For port-forwarding, this is OK.
Do you have the file `~/.flyte/config.yaml`?

Ashika UMAGILIYA

09/12/2023, 7:55 AM
Yes, this error occurs after setting that flag to "true".
Does the CLI use gRPC? If so, is SSL a must?

Fabio Grätz

09/12/2023, 7:56 AM
The CLI does use gRPC, but SSL is not necessarily a must for this. I can confirm that I have port-forwarded flyteadmin before and there wasn’t any TLS involved.
What is admin.endpoint in your client config? And what is your port-forwarding command?

Ashika UMAGILIYA

09/12/2023, 7:57 AM
admin:
  # For GRPC endpoints you might want to use dns:///flyte.myexample.com
  endpoint: dns:///127.0.0.1:8088
  authType: Pkce
  insecure: true
logger:
  show-source: true
  level: 0
Screen Shot 2023-09-12 at 16.58.07.png
port-forwarding.

Fabio Grätz

09/12/2023, 7:59 AM
Just as a sanity check, can you put localhost instead of 127… in the client config?

Ashika UMAGILIYA

09/12/2023, 7:59 AM
yeah, first i tried that 😞

Fabio Grätz

09/12/2023, 8:00 AM
But you put 8088 into the client config.
Is this the http port?
Needs to be the grpc one
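A quick way to rule out the "wrong port" case before involving the CLI is to look at what the client config actually points at and whether anything is listening there. The sketch below is a standard-library-only illustration: `parse_endpoint` and `port_is_open` are hypothetical helper names, and a successful TCP connect only shows that something accepts connections on that port, not that it speaks gRPC.

```python
import socket


def parse_endpoint(endpoint: str) -> tuple:
    """Split an admin.endpoint value such as 'dns:///127.0.0.1:8088' into (host, port)."""
    prefix = "dns:///"
    # Drop the optional 'dns:///' scheme prefix used in the client config.
    target = endpoint[len(prefix):] if endpoint.startswith(prefix) else endpoint
    host, _, port = target.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {endpoint!r}")
    return host, int(port)


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Endpoint value copied from the client config posted in this thread.
    host, port = parse_endpoint("dns:///127.0.0.1:8088")
    print(f"{host}:{port} reachable: {port_is_open(host, port)}")
```

If the port is reachable but the CLI still fails, the remaining suspects are the HTTP-versus-gRPC port mix-up and the direction of the port-forward.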

Ashika UMAGILIYA

09/12/2023, 8:00 AM
Oh I see, so it's 8089.
Thanks, let me try.

Fabio Grätz

09/12/2023, 8:01 AM
(I always used flyte-core not binary. In this case I would use port 81 which is the flyteadmin gRPC one)

Ashika UMAGILIYA

09/12/2023, 8:03 AM
"flyte-core not binary" >> I only see one pod
flyte-backend-flyte-binary-5876c5745b-hhtrd   1/1     Running   0          17h

Fabio Grätz

09/12/2023, 8:04 AM
Yes this is correct for flyte binary.

Ashika UMAGILIYA

09/12/2023, 8:04 AM
With Helm it only started this pod. Is there a way to start the "core" pod?

Fabio Grätz

09/12/2023, 8:05 AM
There are two helm charts, one called flyte-core -> multiple pods, one called flyte-binary -> one pod. But there is currently no need for you to switch to the other one I’d say.

Ashika UMAGILIYA

09/12/2023, 8:05 AM
I see. With 8089 I still get the same error.

Fabio Grätz

09/12/2023, 8:05 AM
Just wanted to say that I’m 100% sure it needs to be the gRPC port for the flyte-core helm chart. I would be very surprised if it was different for the flyte-binary helm chart.
try with localhost again maybe now

Ashika UMAGILIYA

09/12/2023, 8:07 AM
no luck 😞

Fabio Grätz

09/12/2023, 8:07 AM
Also, maybe check that port-forwarding is still running. And you forward 8089:8090; maybe check again that this is correct and not the wrong way round.
😞
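The direction matters because `kubectl port-forward` takes its port pairs as `LOCAL:REMOTE`, so the client config has to use the first number. A minimal sketch of that check, with hypothetical helper names and the `8089:8090` pair from this thread as sample input (whether 8090 is the right remote port depends on the chart's service definition, which is not shown here):

```python
def split_port_mapping(mapping: str) -> tuple:
    """Split a kubectl port-forward pair 'LOCAL:REMOTE' into (local, remote) ints."""
    local, sep, remote = mapping.partition(":")
    if not sep:
        # A single port means the same number is used locally and remotely.
        return int(local), int(local)
    return int(local), int(remote)


def endpoint_matches_forward(endpoint: str, mapping: str) -> bool:
    """True if the port in the client endpoint equals the local side of the forward."""
    endpoint_port = int(endpoint.rsplit(":", 1)[1])
    local_port, _ = split_port_mapping(mapping)
    return endpoint_port == local_port


# The endpoint must use the local port (8089), not the remote one.
print(endpoint_matches_forward("dns:///localhost:8089", "8089:8090"))
print(endpoint_matches_forward("dns:///localhost:8088", "8089:8090"))
```

The sketch does not handle the `:REMOTE` form, where kubectl picks a random local port.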

Ashika UMAGILIYA

09/12/2023, 8:09 AM
Oh, new error. I think CLI-to-server communication works now! Seems like some GCS permission issue. I should be able to fix this.

Fabio Grätz

09/12/2023, 8:09 AM
Awesome

Ashika UMAGILIYA

09/12/2023, 8:09 AM
Screen Shot 2023-09-12 at 17.09.41.png
Thanks a lot! Let me play around a little bit.

Fabio Grätz

09/12/2023, 8:10 AM
Yeah
communication works
The server needs to create a signed url for blob storage for flytekit to upload the code.
The service account of the server needs the permission to create these signed urls

Ashika UMAGILIYA

09/12/2023, 8:11 AM
OK, got it. Let me add that and try again. Thanks a lot.

Fabio Grätz

09/12/2023, 8:11 AM
Give `iam.serviceAccounts.signBlob` to the respective SA and it should work.
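One hedged way to grant this: the predefined role `roles/iam.serviceAccountTokenCreator` includes `iam.serviceAccounts.signBlob`, so binding that role to the backend's GCP service account on itself is a common approach, and a custom role containing only that permission is the narrower alternative. The sketch below only assembles a `gcloud iam service-accounts add-iam-policy-binding` command and prints it. The function name and the placeholder email are hypothetical, and nothing in this thread confirms which of the two options was finally used.

```python
import shlex


def sign_blob_binding_command(gsa_email: str) -> list:
    """Build a gcloud command that lets a GCP service account sign blobs as itself.

    Assumption: roles/iam.serviceAccountTokenCreator is used because it contains
    the iam.serviceAccounts.signBlob permission; a custom role would also work.
    """
    return [
        "gcloud", "iam", "service-accounts", "add-iam-policy-binding", gsa_email,
        "--member", f"serviceAccount:{gsa_email}",
        "--role", "roles/iam.serviceAccountTokenCreator",
    ]


# Placeholder email; replace with the GCP service account of the Flyte backend.
cmd = sign_blob_binding_command("flyte-backend@my-project.iam.gserviceaccount.com")
print(shlex.join(cmd))
```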

Ashika UMAGILIYA

09/12/2023, 8:52 AM
Btw, shouldn't this SA also have permissions to dynamically create pods/containers via the k8s API (to execute the task logic)? I didn't see such permissions here: https://docs.flyte.org/en/v1.0.0/deployment/gcp/manual.html#permissions

Fabio Grätz

09/12/2023, 8:55 AM
I think these come from the k8s service account
Can you do `kubectl -n flyte get sa`? And then `kubectl -n flyte get sa <name> -o yaml`.
I think there you should see permissions to create pods.

Ashika UMAGILIYA

09/12/2023, 8:57 AM
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    iam.gke.io/gcp-service-account: dev01-flyte-poc-iam@fr-stg-datalake-k8s.iam.gserviceaccount.com
    meta.helm.sh/release-name: flyte-backend
    meta.helm.sh/release-namespace: flyte
  creationTimestamp: "2023-09-11T14:31:45Z"
  labels:
    app.kubernetes.io/instance: flyte-backend
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: flyte-binary
    app.kubernetes.io/version: 1.16.0
    helm.sh/chart: flyte-binary-v1.9.1
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:meta.helm.sh/release-name: {}
          f:meta.helm.sh/release-namespace: {}
        f:labels:
          .: {}
          f:app.kubernetes.io/instance: {}
          f:app.kubernetes.io/managed-by: {}
          f:app.kubernetes.io/name: {}
          f:app.kubernetes.io/version: {}
          f:helm.sh/chart: {}
    manager: helm
    operation: Update
    time: "2023-09-11T14:31:45Z"
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          f:iam.gke.io/gcp-service-account: {}
    manager: kubectl-annotate
    operation: Update
    time: "2023-09-11T14:32:02Z"
  name: dev01-flyte-gke-sa
  namespace: flyte
  resourceVersion: "227723"
  uid: f0b5b2bd-b98f-43d8-8f90-bdf0e0ecf66d

Fabio Grätz

09/12/2023, 9:02 AM
Sorry, you also need to check the role and rolebinding for this service account. In the role you should see the permission to create pods, I guess.
kubectl get role <name> -o yaml
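Instead of reading the role YAML by eye, `kubectl auth can-i` can answer the question directly when it impersonates the service account. Kubernetes addresses a service account as the user `system:serviceaccount:<namespace>:<name>`; the sketch below builds that string and the matching command. The helper names are hypothetical, the namespace and account name are sample values, and impersonation with `--as` only works if your own user is allowed to impersonate.

```python
import shlex


def service_account_user(namespace: str, name: str) -> str:
    """Return the user name Kubernetes RBAC uses for a service account."""
    return f"system:serviceaccount:{namespace}:{name}"


def can_i_create_pods_command(sa_namespace: str, sa_name: str, target_namespace: str) -> list:
    """Build 'kubectl auth can-i create pods' impersonating the given service account."""
    return [
        "kubectl", "auth", "can-i", "create", "pods",
        "--as", service_account_user(sa_namespace, sa_name),
        "--namespace", target_namespace,
    ]


# Sample values only: may the backend's service account create pods in 'flyte'?
print(shlex.join(can_i_create_pods_command("flyte", "dev01-flyte-gke-sa", "flyte")))
```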

Ashika UMAGILIYA

09/12/2023, 9:08 AM
OK. Btw, these are created by Helm, right? Because I don't think I created any k8s SA or rolebindings.

Fabio Grätz

09/12/2023, 9:08 AM
yeah helm

Ashika UMAGILIYA

09/12/2023, 9:11 AM
Screen Shot 2023-09-12 at 18.11.39.png

Fabio Grätz

09/12/2023, 9:12 AM
Tbh I wouldn’t worry about this before you notice that you start an execution but actually the pods are not appearing.

Ashika UMAGILIYA

09/12/2023, 9:13 AM
Yes, actually the task failed.
Oh sorry, it again has something to do with GCS permissions.
Let me play around with some permissions. Thanks a lot for the help.

Fabio Grätz

09/12/2023, 9:15 AM
Do `kubectl get sa <service account name> -o yaml`.
There should be a GCP service account mentioned in the annotations.
This one needs permissions to view and create objects.

Ashika UMAGILIYA

09/12/2023, 11:07 AM
That SA has all the permissions (GCS Storage Admin). But I just noticed that in the UI the SA / IAM is shown as "default".

Fabio Grätz

09/12/2023, 11:37 AM
That means the `default` Kubernetes service account in the namespace the task pod runs in is used.
Can you check whether the default service account has the Workload Identity annotation, i.e. the GCP service account it is bound to?

Ashika UMAGILIYA

09/12/2023, 11:40 AM
I just bound the "default" account to the GCP SA using
gcloud iam service-accounts add-iam-policy-binding dev01-flyte-poc-iam@fr-stg-datalake-k8s.iam.gserviceaccount.com \
    --role roles/iam.workloadIdentityUser \
    --member "serviceAccount:fr-stg-datalake-k8s.svc.id.goog[flyte/default]"
But why does it use the "default" SA? The SA is defined in the Helm values, right?

Fabio Grätz

09/12/2023, 11:40 AM
For every flyte project or flyte project/domain, there is a namespace in which the task pods run.
Each of these namespaces has a k8s service account called default.
This service account is used for tasks.
It is not the same k8s service account that is used for the flyte backend.
This is correct.
But you also need to annotate the default service account so that it uses the workload identity mapping I think
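To make the consequence concrete: the Workload Identity binding and the annotation have to target the `default` service account in each namespace where task pods run, not only the one in the backend's namespace. The sketch below lists those targets. It assumes Flyte's default namespace naming of `<project>-<domain>`, and the project, domain, and GCP project ID values are hypothetical placeholders; the member format follows the `--member` value of the `gcloud` command used in this thread.

```python
from itertools import product


def task_namespaces(projects, domains):
    """Namespaces where task pods run, assuming the '<project>-<domain>' naming."""
    return [f"{project}-{domain}" for project, domain in product(projects, domains)]


def workload_identity_member(gcp_project_id: str, namespace: str, ksa: str = "default") -> str:
    """IAM member string identifying a Kubernetes service account under Workload Identity."""
    return f"serviceAccount:{gcp_project_id}.svc.id.goog[{namespace}/{ksa}]"


# Hypothetical example values; substitute your own Flyte projects and GCP project ID.
for ns in task_namespaces(["flytesnacks"], ["development", "staging", "production"]):
    print(ns, "->", workload_identity_member("my-gcp-project", ns))
```

Each printed pair corresponds to one IAM policy binding for that member plus one `iam.gke.io/gcp-service-account` annotation on the `default` service account in that namespace.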

Ashika UMAGILIYA

09/12/2023, 11:45 AM
Hmm, I see. I assumed this would fix it:
kubectl annotate serviceaccount default \
    --namespace flyte \
    iam.gke.io/gcp-service-account=dev01-flyte-poc-iam@fr-stg-datalake-k8s.iam.gserviceaccount.com
where "dev01-flyte-poc-iam" is the GCP IAM service account.
Still doesn't work! Oh man, this is harder than I imagined 😉 I thought by today I would be running a training example on a Ray cluster 😉 I'll try this tomorrow. Thanks again for the help.

Fabio Grätz

09/12/2023, 11:48 AM
What is the current error message though?
Does the UI say that it can’t access GCS (403)?

Ashika UMAGILIYA

09/12/2023, 11:48 AM
Yes, it's the same.

Fabio Grätz

09/12/2023, 11:49 AM
Ok 😕
Well, have a nice evening!
Ping me tomorrow if still stuck

Ashika UMAGILIYA

09/12/2023, 11:49 AM
Thanks, I'll do a fresh installation tomorrow and try again.
Finally 🙂 "Each of these namespaces has a k8s service account called default." >> This really helped. I wasn't aware of that concept.

David Espejo (he/him)

09/12/2023, 8:17 PM
Thanks so much @Fabio Grätz, and sorry for the struggles @Ashika UMAGILIYA. We're working on a GCP reference implementation and we'll make sure to apply the learnings from this thread. If you have any further questions, please let us know.

Fabio Grätz

09/13/2023, 6:49 AM
Awesome it works now 🙂