Hi All, I am seeing this error when I am trying to...
# ask-the-community
a
Hi All, I am seeing this error when I am trying to run flyte on k8s. What am I missing?
{"json":{},"level":"warning","msg":"Failed to create cluster resources for namespace [flytesnacks-development] with err: Failed to read config template dir [flytesnacks-development] for namespace [] with err: open : no such file or directory","ts":"2023-05-24T09:40:15Z"}
{"json":{},"level":"warning","msg":"Failed to create cluster resources for namespace [flytesnacks-staging] with err: Failed to read config template dir [flytesnacks-staging] for namespace [] with err: open : no such file or directory","ts":"2023-05-24T09:40:15Z"}
{"json":{},"level":"warning","msg":"Failed to create cluster resources for namespace [flytesnacks-production] with err: Failed to read config template dir [flytesnacks-production] for namespace [] with err: open : no such file or directory","ts":"2023-05-24T09:40:15Z"}
{"json":{},"level":"warning","msg":"Failed cluster resource creation loop with: Failed to read config template dir [flytesnacks-development] for namespace [] with err: open : no such file or directory, Failed to read config template dir [flytesnacks-staging] for namespace [] with err: open : no such file or directory, Failed to read config template dir [flytesnacks-production] for namespace [] with err: open : no such file or directory","ts":"2023-05-24T09:40:15Z"}
{"json":{},"level":"error","msg":"Failed to initialize certificates for Secrets Webhook. client rate limiter Wait returned an error: context canceled","ts":"2023-05-24T09:40:20Z"}
{"json":{},"level":"panic","msg":"Failed to start Propeller, err: failed to create FlyteWorkflow CRD: customresourcedefinitions.apiextensions.k8s.io is forbidden: User \"system:serviceaccount:test-apps:test-flyte-role\" cannot create resource \"customresourcedefinitions\" in API group \"apiextensions.k8s.io\" at the cluster scope","ts":"2023-05-24T09:40:20Z"}
s
@Ketan (kumare3) @Samhita Alla any help in this regard is highly appreciated.
k
Permissions seem to be wrong. How did you deploy?
a
We deployed the single binary on Kubernetes, using a service account role.
d
@Abhinay Dronavally I've seen this error a couple of times, especially on EKS environments when there are mismatches in the IRSA config. What's your K8s environment?
a
We are running Kubernetes on an EKS cluster, with a service account config like this:
serviceAccount:
  # Specifies whether a service account should be created
  create: true
  # Annotations to add to the service account
  annotations: {}
  # The name of the service account to use.
  # If not set and create is true, a name is generated using the fullname template
  name: "test-role"
This role has write access to S3 and the EKS cluster.
d
Can you run
kubectl get sa -n <your-namespace>
and then
kubectl describe sa <service-account-name> -n <your-namespace>
The role should be an IAM role, and the annotation should include the full ARN.
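A rough way to sanity-check the annotation value from the describe output is sketched below. It assumes a POSIX shell with grep; the helper name `is_role_arn` and the sample account ID are made up for illustration, and the pattern is only a shape check, not AWS's full ARN grammar.

```shell
# Hypothetical helper: succeeds when the value looks like a full IAM role ARN
# (arn:<partition>:iam::<12-digit account>:role/<name>) rather than a bare name.
is_role_arn() {
  printf '%s\n' "$1" | grep -Eq '^arn:aws[a-z-]*:iam::[0-9]{12}:role/.+$'
}

# To read the annotation straight from the cluster (needs kubectl access):
#   kubectl get sa <service-account-name> -n <your-namespace> \
#     -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}'

# Illustrative values only: a made-up ARN and the bare name from the Helm values.
is_role_arn "arn:aws:iam::123456789012:role/flyte-role" && echo "full ARN"
is_role_arn "test-role" || echo "bare name, not an ARN"
```

A value that fails this check, such as a bare role name, would be worth fixing before looking any further.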
a
Name:                flyte-role
Namespace:           flyte
Labels:              app.kubernetes.io/cluster=flyte-eks
                     app.kubernetes.io/instance=flyte
                     app.kubernetes.io/managed-by=Helm
                     app.kubernetes.io/name=flyte
                     app.kubernetes.io/version=1.16.0
                     helm.sh/chart=flyte-0.1.0
Annotations:         eks.amazonaws.com/role-arn: <EKS_ARN>
                     eks.amazonaws.com/sts-regional-endpoints: true
Image pull secrets:  <none>
Mountable secrets:   <none>
Tokens:              <none>
Events:              <none>
@David Espejo (he/him) Why is it trying to add a group to flyte.lyft.com?
pkg/mod/k8s.io/client-go@v0.24.1/tools/cache/reflector.go:167: failed to list *v1alpha1.FlyteWorkflow: flyteworkflows.flyte.lyft.com is forbidden: User "system:serviceaccount:flyte:flyte-role" cannot list resource "flyteworkflows" in API group "flyte.lyft.com" at the cluster scope
d
flyte.lyft.com is the API group for the FlyteWorkflow CRD.
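The "forbidden" message already names the service account, verb, resource, and API group, so it can be turned into a `kubectl auth can-i` check. The sketch below assumes a POSIX shell with sed and a message with plain (unescaped) quotes and a non-empty API group. The function name `forbidden_to_can_i` is made up for illustration; it only prints the command and does not contact a cluster.

```shell
# Hypothetical helper: turn a Kubernetes "forbidden" message into the matching
# `kubectl auth can-i` command. Runs in a subshell so variables stay local.
forbidden_to_can_i() (
  msg="$1"
  user=$(printf '%s\n' "$msg" | sed -n 's/.*User "\([^"]*\)".*/\1/p')
  verb=$(printf '%s\n' "$msg" | sed -n 's/.*cannot \([a-z]*\) resource.*/\1/p')
  resource=$(printf '%s\n' "$msg" | sed -n 's/.*cannot [a-z]* resource "\([^"]*\)".*/\1/p')
  group=$(printf '%s\n' "$msg" | sed -n 's/.*in API group "\([^"]*\)".*/\1/p')
  # --all-namespaces mirrors the "at the cluster scope" part of the message.
  printf 'kubectl auth can-i %s %s.%s --as=%s --all-namespaces\n' \
    "$verb" "$resource" "$group" "$user"
)

# The error from this thread, pasted with plain quotes.
msg='flyteworkflows.flyte.lyft.com is forbidden: User "system:serviceaccount:flyte:flyte-role" cannot list resource "flyteworkflows" in API group "flyte.lyft.com" at the cluster scope'
forbidden_to_can_i "$msg"
```

Running the printed command as a cluster admin should answer `yes` or `no`. A `no` would suggest the service account has no ClusterRole binding that covers that resource.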
Can you run
aws iam get-role --role-name <YOUR_IAM_ROLE> --query Role.AssumeRolePolicyDocument
This document provides a more detailed explanation: https://github.com/davidmirror-ops/flyte-the-hard-way/blob/main/docs/03-roles-service-accounts.md
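One thing to look for in that trust policy is whether its `sub` condition names the same namespace and service account that appear in the error. The sketch below is a plain-text check under some assumptions: the helper name `trust_policy_mentions_sa`, the OIDC provider ID, and the sample policy are all invented for illustration, and a policy that uses wildcards (`StringLike`) would not be handled.

```shell
# Hypothetical helper: reads a trust policy document on stdin and succeeds if
# it contains the exact subject "system:serviceaccount:<namespace>:<name>".
trust_policy_mentions_sa() {
  grep -Fq "system:serviceaccount:$1:$2\""
}

# Against a real role (needs AWS CLI access):
#   aws iam get-role --role-name <YOUR_IAM_ROLE> \
#     --query Role.AssumeRolePolicyDocument --output json \
#     | trust_policy_mentions_sa <your-namespace> <service-account-name>

# Invented sample: the policy trusts test-apps/test-flyte-role only.
sample_policy='{"Statement":[{"Effect":"Allow","Action":"sts:AssumeRoleWithWebIdentity","Condition":{"StringEquals":{"oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE:sub":"system:serviceaccount:test-apps:test-flyte-role"}}}]}'

printf '%s\n' "$sample_policy" | trust_policy_mentions_sa flyte flyte-role \
  || echo "flyte/flyte-role is not named in the trust policy"
```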