# ask-the-community
Hi All, I am seeing this error when I am trying to run flyte on k8s. What am I missing?
{"json":{},"level":"warning","msg":"Failed to create cluster resources for namespace [flytesnacks-development] with err: Failed to read config template dir [flytesnacks-development] for namespace [] with err: open : no such file or directory","ts":"2023-05-24T09:40:15Z"}
{"json":{},"level":"warning","msg":"Failed to create cluster resources for namespace [flytesnacks-staging] with err: Failed to read config template dir [flytesnacks-staging] for namespace [] with err: open : no such file or directory","ts":"2023-05-24T09:40:15Z"}
{"json":{},"level":"warning","msg":"Failed to create cluster resources for namespace [flytesnacks-production] with err: Failed to read config template dir [flytesnacks-production] for namespace [] with err: open : no such file or directory","ts":"2023-05-24T09:40:15Z"}
{"json":{},"level":"warning","msg":"Failed cluster resource creation loop with: Failed to read config template dir [flytesnacks-development] for namespace [] with err: open : no such file or directory, Failed to read config template dir [flytesnacks-staging] for namespace [] with err: open : no such file or directory, Failed to read config template dir [flytesnacks-production] for namespace [] with err: open : no such file or directory","ts":"2023-05-24T09:40:15Z"}
{"json":{},"level":"error","msg":"Failed to initialize certificates for Secrets Webhook. client rate limiter Wait returned an error: context canceled","ts":"2023-05-24T09:40:20Z"}
{"json":{},"level":"panic","msg":"Failed to start Propeller, err: failed to create FlyteWorkflow CRD: customresourcedefinitions.apiextensions.k8s.io is forbidden: User \"system:serviceaccount:test-apps:test-flyte-role\" cannot create resource \"customresourcedefinitions\" in API group \"apiextensions.k8s.io\" at the cluster scope","ts":"2023-05-24T09:40:20Z"}
@Ketan (kumare3) @Samhita Alla any help in this regard is highly appreciated.
Permissions seem to be wrong. How did you deploy?
We deployed the single binary on Kubernetes, using a service account role.
@Abhinay Dronavally I've seen this error a couple of times, especially on EKS environments when there are mismatches in the IRSA config. What's your K8s environment?
We are running Kubernetes on an EKS cluster, with a service account config like this:
  # Specifies whether a service account should be created
  create: true
  # Annotations to add to the service account
  annotations: {}
  # The name of the service account to use.
  # If not set and create is true, a name is generated using the fullname template
  name: "test-role"
This role has write access to S3 and to the EKS cluster.
can you
kubectl get sa -n <your-namespace>
and then
kubectl describe sa <service-account-name> -n <your-namespace>
the role should be an IAM role and the annotation should include the full ARN
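A quick way to sanity-check the annotation value is to compare it against the general shape of an IAM role ARN. The snippet below is only a sketch: the `is_role_arn` helper and the sample account ID and role name are made up for illustration, and the pattern checks the format only, not whether the role exists or trusts the cluster.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if $1 looks like a full IAM role ARN,
# i.e. arn:aws:iam::<12-digit account id>:role/<role name>.
is_role_arn() {
  printf '%s\n' "$1" | grep -Eq '^arn:aws(-[a-z]+)*:iam::[0-9]{12}:role/.+$'
}

# The annotation itself can be read with kubectl's jsonpath output
# (dots inside the annotation key are escaped with a backslash):
#   kubectl get sa <service-account-name> -n <your-namespace> \
#     -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}'

# Illustrative values only (made-up account ID and role name).
if is_role_arn "arn:aws:iam::123456789012:role/flyte-role"; then
  echo "looks like a full ARN"
fi
if ! is_role_arn "flyte-role"; then
  echo "bare role name, not an ARN"
fi
```

A bare name such as `test-role` in the Helm values only names the Kubernetes service account; it is the `eks.amazonaws.com/role-arn` annotation that would carry the ARN.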
Name:                flyte-role
Namespace:           flyte
Labels:              app.kubernetes.io/cluster=flyte-eks
Annotations:         eks.amazonaws.com/role-arn: <EKS_ARN>
                     eks.amazonaws.com/sts-regional-endpoints: true
Image pull secrets:  <none>
Mountable secrets:   <none>
Tokens:              <none>
Events:              <none>
@David Espejo (he/him)
Why is it trying to add group to
v0.24.1/tools/cache/reflector.go:167: failed to list *v1alpha1.FlyteWorkflow: flyteworkflows.flyte.lyft.com is forbidden: User "system:serviceaccount:flyte:flyte-role" cannot list resource "flyteworkflows" in API group "flyte.lyft.com" at the cluster scope
pkg/mod/k8s.io/client-go@v0.24.1/tools/cache/reflector.go:167: failed to list *v1alpha1.FlyteWorkflow: flyteworkflows.flyte.lyft.com is forbidden: User "system:serviceaccount:flyte:flyte-role" cannot list resource "flyteworkflows" in API group "flyte.lyft.com" at the cluster scope
flyte.lyft.com is the API group for the workflow CRD
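Both "is forbidden" errors in this thread come from Kubernetes RBAC, so one way to reproduce them outside the pod is `kubectl auth can-i` with impersonation. The sketch below turns such a message into the matching check; `forbidden_to_can_i` is a hypothetical helper, and it assumes the message uses plain (unescaped) double quotes as in the reflector log line above.

```shell
#!/bin/sh
# Hypothetical helper: print a `kubectl auth can-i` command that checks the
# same verb, resource, API group and user named in an "is forbidden" message.
forbidden_to_can_i() {
  msg=$1
  user=$(printf '%s\n' "$msg" | sed -n 's/.*User "\([^"]*\)" cannot.*/\1/p')
  verb=$(printf '%s\n' "$msg" | sed -n 's/.* cannot \([a-z]*\) resource .*/\1/p')
  resource=$(printf '%s\n' "$msg" | sed -n 's/.* resource "\([^"]*\)" in API group.*/\1/p')
  group=$(printf '%s\n' "$msg" | sed -n 's/.* in API group "\([^"]*\)".*/\1/p')
  # --all-namespaces asks about the whole cluster rather than one namespace.
  printf 'kubectl auth can-i %s %s.%s --as=%s --all-namespaces\n' \
    "$verb" "$resource" "$group" "$user"
}

# Message copied from the log line above.
forbidden_to_can_i 'flyteworkflows.flyte.lyft.com is forbidden: User "system:serviceaccount:flyte:flyte-role" cannot list resource "flyteworkflows" in API group "flyte.lyft.com" at the cluster scope'
```

If the printed command answers `no` when run by a cluster admin, the service account is probably missing a ClusterRole and ClusterRoleBinding that grant that verb on that resource.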
can you
aws iam get-role --role-name <YOUR_IAM_ROLE> --query Role.AssumeRolePolicyDocument
This document seems to provide somewhat more detailed explanations: https://github.com/davidmirror-ops/flyte-the-hard-way/blob/main/docs/03-roles-service-accounts.md
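When reading the trust policy returned by `aws iam get-role`, the part that usually matters for IRSA is the condition on the OIDC provider's `sub` claim, which is expected to name the service account as `system:serviceaccount:<namespace>:<name>`. A minimal sketch of that comparison follows; the helper names are hypothetical, the sample policy is a trimmed, made-up fragment, and a fixed-string match will miss policies that use `StringLike` with wildcards.

```shell
#!/bin/sh
# Hypothetical helper: the subject string for namespace $1, service account $2.
irsa_subject() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# Hypothetical helper: succeeds if the policy JSON in $1 mentions the
# subject for namespace $2 and service account $3 verbatim.
policy_mentions_sa() {
  printf '%s\n' "$1" | grep -Fq "$(irsa_subject "$2" "$3")"
}

# Against a live role (needs AWS credentials), the policy could be fetched with:
#   policy=$(aws iam get-role --role-name <YOUR_IAM_ROLE> \
#     --query Role.AssumeRolePolicyDocument --output json)

# Made-up, trimmed policy fragment for illustration only.
sample='{"Condition":{"StringEquals":{"<OIDC_PROVIDER>:sub":"system:serviceaccount:flyte:flyte-role"}}}'

if policy_mentions_sa "$sample" flyte flyte-role; then
  echo "trust policy names flyte/flyte-role"
fi
if ! policy_mentions_sa "$sample" test-apps test-flyte-role; then
  echo "trust policy does not name test-apps/test-flyte-role"
fi
```

A mismatch here (for example a role that trusts `flyte/flyte-role` while the pod runs as a service account in another namespace) would be one instance of the IRSA config mismatch mentioned earlier in the thread.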