Radhakrishna Sanka
04/13/2023, 4:40 PM
David Espejo (he/him)
04/13/2023, 5:56 PM
Radhakrishna Sanka
04/13/2023, 10:31 PM
Your current user or role does not have access to Kubernetes objects on this EKS nodegroup
David Espejo (he/him)
04/16/2023, 6:48 PM
"The instances failed to join the kubernetes cluster" typically happens when the worker nodes cannot communicate with the EKS control plane. It largely depends on the API server access configuration (e.g. Public, Public+Private, Private). If you're using private subnets, you probably need endpoint services. Check out this runbook, which can help you isolate the root cause of the problem: https://docs.aws.amazon.com/systems-manager-automation-runbooks/latest/userguide/automation-awssupport-troubleshooteksworkernode.html
Radhakrishna Sanka
04/16/2023, 10:50 PM
flyte-system
would it be possible to share the JSON for the role? I want to compare against the role I’m making here
David Espejo (he/him)
04/17/2023, 5:33 PM
{
    "Role": {
        "Path": "/",
        "RoleName": "flyte-system-role",
        "RoleId": "AROAYS5I3UDGD6RDWHN5M",
        "Arn": "arn:aws:iam::590375264460:role/flyte-system-role",
        "CreateDate": "2023-04-17T21:14:53+00:00",
        "AssumeRolePolicyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Principal": {
                        "Federated": "arn:aws:iam::590375264460:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/1EE94FBE2DE77558404404CF5947470C"
                    },
                    "Action": "sts:AssumeRoleWithWebIdentity",
                    "Condition": {
                        "StringEquals": {
                            "oidc.eks.us-east-1.amazonaws.com/id/1EE94FBE2DE77558404404CF5947470C:aud": "sts.amazonaws.com",
                            "oidc.eks.us-east-1.amazonaws.com/id/1EE94FBE2DE77558404404CF5947470C:sub": "system:serviceaccount:flyte:flyte-backend-binary"
                        }
                    }
                }
            ]
        },
        "Description": "",
        "MaxSessionDuration": 3600,
        "Tags": [
            {
                "Key": "alpha.eksctl.io/cluster-name",
                "Value": "fthw-eks-cluster"
            },
            {
                "Key": "eksctl.cluster.k8s.io/v1alpha1/cluster-name",
                "Value": "fthw-eks-cluster"
            },
            {
                "Key": "alpha.eksctl.io/iamserviceaccount-name",
                "Value": "flyte/flyte-backend-binary"
            },
            {
                "Key": "alpha.eksctl.io/eksctl-version",
                "Value": "0.132.0-dev+15bffbb0d.2023-03-01T18:34:36Z"
            }
        ],
        "RoleLastUsed": {}
    }
}
eksctl
following the instructions in the guide (which I just updated to improve the experience)
Radhakrishna Sanka
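For comparing a role under construction against the one shared above, the parts of the trust policy that usually matter are the federated OIDC provider, the action, and the `aud` and `sub` conditions. The following is a minimal Python sketch under that assumption, not something from the thread: `trust_summary` and `diff_trust` are hypothetical helper names, and the `candidate` role (bound to a `default` service account) is made up for illustration. The input is expected to have the same shape as the JSON above.

```python
import copy

# Reference values copied from the role JSON in the thread (trimmed to the trust policy).
OIDC = "oidc.eks.us-east-1.amazonaws.com/id/1EE94FBE2DE77558404404CF5947470C"
reference = {
    "Role": {
        "RoleName": "flyte-system-role",
        "AssumeRolePolicyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Principal": {
                        "Federated": f"arn:aws:iam::590375264460:oidc-provider/{OIDC}"
                    },
                    "Action": "sts:AssumeRoleWithWebIdentity",
                    "Condition": {
                        "StringEquals": {
                            f"{OIDC}:aud": "sts.amazonaws.com",
                            f"{OIDC}:sub": "system:serviceaccount:flyte:flyte-backend-binary",
                        }
                    },
                }
            ],
        },
    }
}


def trust_summary(role_doc):
    """Extract the trust-policy fields that typically differ between roles."""
    statement = role_doc["Role"]["AssumeRolePolicyDocument"]["Statement"][0]
    summary = {
        "federated": statement.get("Principal", {}).get("Federated"),
        "action": statement.get("Action"),
        "aud": None,
        "sub": None,
    }
    for key, value in statement.get("Condition", {}).get("StringEquals", {}).items():
        # Condition keys look like "<oidc-provider>:aud" and "<oidc-provider>:sub".
        suffix = key.rsplit(":", 1)[-1]
        if suffix in ("aud", "sub"):
            summary[suffix] = value
    return summary


def diff_trust(reference_doc, candidate_doc):
    """Return {field: (reference_value, candidate_value)} for fields that differ."""
    ref, cand = trust_summary(reference_doc), trust_summary(candidate_doc)
    return {field: (ref[field], cand[field]) for field in ref if ref[field] != cand[field]}


# Hypothetical candidate: same role, but bound to a different service account.
candidate = copy.deepcopy(reference)
conditions = candidate["Role"]["AssumeRolePolicyDocument"]["Statement"][0]["Condition"]["StringEquals"]
conditions[f"{OIDC}:sub"] = "system:serviceaccount:flyte:default"

print(diff_trust(reference, candidate))
```

An empty result means the two trust policies agree on these four fields. A difference in `sub` is worth checking first, since that condition ties the role to one namespace and service account.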
04/17/2023, 9:23 PM
David Espejo (he/him)
04/17/2023, 9:27 PM
Radhakrishna Sanka
04/17/2023, 9:27 PM
David Espejo (he/him)
04/17/2023, 9:29 PM
Radhakrishna Sanka
04/17/2023, 9:29 PM
David Espejo (he/him)
04/17/2023, 9:39 PM
Radhakrishna Sanka
04/17/2023, 10:32 PM
David Espejo (he/him)
04/18/2023, 3:21 PM
Radhakrishna Sanka
04/18/2023, 3:22 PM
David Espejo (he/him)
04/18/2023, 3:27 PM
default
in the same VPC as the EKS cluster. The node-to-control-plane path is only for traffic from the worker nodes to the API server
Radhakrishna Sanka
04/18/2023, 3:29 PM
David Espejo (he/him)
04/18/2023, 3:31 PM
control plane
I mean EKS control plane 🙂 so in this case, `flyteadmin`, which will be a workload running on the worker nodes, is the one that needs communication with RDS
Radhakrishna Sanka
04/18/2023, 3:36 PM
flyte-system-role
? It seemed like it should, since the flyte-system role, etc. were used for getting the roles S3Access
David Espejo (he/him)
04/18/2023, 3:38 PM
Radhakrishna Sanka
04/25/2023, 4:52 PM
psql: could not translate host name "flyte-db5b39eef.cfk9fd2rmdtl.us-east-1.rds.amazonaws.com" to address: Temporary failure in name resolution
pod "pgsql-postgresql-client" deleted
pod testdb/pgsql-postgresql-client terminated (Error)
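"Temporary failure in name resolution" means the lookup failed before any connection to the database was attempted, so it can be checked without a database client. The following is a minimal Python sketch of that check, assuming Python is available wherever it runs (for example in a debug pod in the same namespace). `can_resolve` is a hypothetical helper name, and the default port 5432 is only the usual PostgreSQL port, which does not affect the lookup itself.

```python
import socket


def can_resolve(hostname, port=5432):
    """Return True if the hostname resolves to at least one address, else False."""
    try:
        return len(socket.getaddrinfo(hostname, port)) > 0
    except socket.gaierror:
        # Raised for lookup failures such as "Temporary failure in name resolution".
        return False
```

Calling `can_resolve("flyte-db5b39eef.cfk9fd2rmdtl.us-east-1.rds.amazonaws.com")` from inside a pod and again from a worker node helps tell a cluster DNS problem apart from a VPC-level one: if the node resolves the name and the pod does not, the in-cluster DNS is the likely suspect.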
David Espejo (he/him)
04/25/2023, 5:10 PM
Radhakrishna Sanka
04/25/2023, 5:12 PM
David Espejo (he/him)
04/25/2023, 5:14 PM
Radhakrishna Sanka
04/25/2023, 6:37 PM
kubectl get pods -n kube-system
NAME                       READY   STATUS    RESTARTS   AGE
aws-node-26snb             1/1     Running   0          134m
aws-node-b5vsr             1/1     Running   0          133m
aws-node-dx522             1/1     Running   0          133m
aws-node-ssgss             1/1     Running   0          133m
aws-node-xwp5n             1/1     Running   0          134m
coredns-7975d6fb9b-cnwqf   0/1     Pending   0          138m
coredns-7975d6fb9b-xw8qw   0/1     Pending   0          138m
kube-proxy-5v9cw           1/1     Running   0          134m
kube-proxy-9r8fv           1/1     Running   0          134m
kube-proxy-9z4c5           1/1     Running   0          133m
kube-proxy-cbfff           1/1     Running   0          133m
kube-proxy-gxhcj           1/1     Running   0          133m
It shows that my coredns pods are Pending. Does that mean they didn’t start? Is there any way to force them to start?
David Espejo (he/him)
04/25/2023, 6:45 PM
kubectl describe core-dns-... -n kube-system
see the Events section
Radhakrishna Sanka
04/25/2023, 7:34 PM
error: the server doesn't have a resource type "coredns-7975d6fb9b-cnwqf"
David Espejo (he/him)
04/25/2023, 8:09 PM
kubectl describe pod core-dns... -n kube-system
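To find which pod names to pass to `kubectl describe pod`, the `kubectl get pods` listing from earlier in the thread can be filtered for rows that are not fully ready. The following is a minimal Python sketch, assuming the default five-column text output. `pods_not_ready` is a hypothetical helper name, and treating every status other than `Running` as a problem is a simplification (a finished job pod would also be flagged).

```python
def pods_not_ready(kubectl_output):
    """Return names of pods whose STATUS is not Running or whose READY count is short."""
    not_ready = []
    for line in kubectl_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue  # ignore blank or malformed lines
        name, ready, status = fields[:3]
        ready_count, _, wanted_count = ready.partition("/")
        if status != "Running" or ready_count != wanted_count:
            not_ready.append(name)
    return not_ready


# A few rows copied from the listing in the thread.
sample = """\
NAME READY STATUS RESTARTS AGE
aws-node-26snb 1/1 Running 0 134m
coredns-7975d6fb9b-cnwqf 0/1 Pending 0 138m
coredns-7975d6fb9b-xw8qw 0/1 Pending 0 138m
kube-proxy-5v9cw 1/1 Running 0 134m
"""

# Print the describe command for each pod that needs a closer look.
for pod in pods_not_ready(sample):
    print(f"kubectl describe pod {pod} -n kube-system")
```

The Events section of each describe output is where the scheduler normally records why a Pending pod could not be placed.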
Radhakrishna Sanka
04/25/2023, 9:01 PM
David Espejo (he/him)
04/25/2023, 10:07 PM
Radhakrishna Sanka
04/25/2023, 10:09 PM
David Espejo (he/him)
05/04/2023, 5:14 PM