# ask-the-community
I am trying to follow the single cluster deployment tutorial using EKS, S3 and RDS. My cluster is getting bottlenecked by some problems that appear to be caused by the database. I was just wondering if I have configured my eks-start.yaml file correctly?
Hi @Jay Phan Do you happen to have log outputs or evidence of those problems? The screenshots didn't open
hey jay - could you check access to the db using psql?
if you need to be in the network, maybe just install a pod that runs bare ubuntu and install psql.
you can also kick off this debugging pod we use…
apiVersion: v1
kind: Pod
metadata:
  name: portalpod
  namespace: flyte
spec:
  containers:
  - args:
    - sleep
    - infinity
    image: "ghcr.io/flyteorg/flyteportal:v0.42.0"
    imagePullPolicy: IfNotPresent
    name: portalcontainer
    resources:
      limits:
        cpu: 500m
        memory: 500Mi
      requests:
        cpu: 500m
        memory: 500Mi
  restartPolicy: Never
  serviceAccount: default
once you’re connected to psql hopefully you can sleuth around there?
i can open the screenshot, but it’s pretty limited, doesn’t show too much
maybe there are more performance statistics on the other rds pages.
pg_stat_activity shows currently running queries, but it’s been years since i’ve looked into pg performance.
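For reference, a minimal sketch of what a `pg_stat_activity` check could look like once psql can actually reach the database. It only builds the psql command line and does not connect to anything. The helper name `psql_command`, the default database name `postgres`, and the choice of columns are assumptions for illustration, not something from this thread.

```python
import shlex

# Non-idle sessions, oldest first. The columns used here exist in
# pg_stat_activity on PostgreSQL 9.6 and later.
ACTIVE_QUERIES_SQL = (
    "SELECT pid, state, wait_event_type, "
    "now() - query_start AS runtime, "
    "left(query, 80) AS query "
    "FROM pg_stat_activity "
    "WHERE state <> 'idle' "
    "ORDER BY query_start;"
)


def psql_command(host, port=5432, user="postgres", dbname="postgres"):
    """Return the psql argv that runs ACTIVE_QUERIES_SQL.

    Hypothetical helper: dbname defaults to "postgres" as an assumption;
    the password is expected to come from PGPASSWORD or a prompt.
    """
    return [
        "psql",
        "-h", host,
        "-p", str(port),
        "-U", user,
        "-d", dbname,
        "-c", ACTIVE_QUERIES_SQL,
    ]


if __name__ == "__main__":
    # Print a copy-pasteable command for the RDS endpoint from this thread.
    host = "database-2.currqrb44ezt.us-east-2.rds.amazonaws.com"
    print(shlex.join(psql_command(host)))
```

Running the printed command from inside the debugging pod should list whatever is currently executing, which is only useful if the connection itself works.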
look at admin logs if you can also on the flyte side
and yeah to david’s point, why do you suspect the db?
@Yee which policy should I allow for the AWS IAM role?
remind me again which field this is for?
like which value are you filling in in the chart
It's for the use cases section when creating a new role, options are: EC2, Lambda, EKS,...
The problem I was facing originally is with the Flyte binary pod: it has been pending for days, and its argument is:
until pg_isready \
  -h database-2.currqrb44ezt.us-east-2.rds.amazonaws.com \
  -p 5432 \
  -U postgres
do
  echo waiting for database
  sleep 0.1
done
I also cannot connect to my db with psql.
so you’re looking for a role to fill in here?
looking at what we are using internally it’s pretty complicated. can’t really paste it.
looking at the custom policies we have, the s3 access is the key part.
however, i don’t think this is necessarily related to your database issue.
we can debug that separately
how are you trying to hit the database from psql?
i’m assuming you’re using rds with user name and password, not with iam role
but maybe check security group, make sure it’s in the same region, etc?
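A rough way to narrow down the security group / network question before involving psql at all is to look at how a plain TCP connection to port 5432 fails. The sketch below uses only the Python standard library. The function name `probe_tcp` and the result labels are made up for illustration, and the interpretation in the comments is a rule of thumb rather than a guarantee.

```python
import socket


def probe_tcp(host: str, port: int = 5432, timeout: float = 3.0) -> str:
    """Classify how a TCP connection attempt to host:port ends.

    Rule-of-thumb reading of the result (an assumption, not a diagnosis):
      "open"        - something is listening; move on to credentials
      "timeout"     - packets are likely dropped (security group, routing,
                      or the instance is not reachable from this network)
      "refused"     - the host was reached but nothing listens on the port
      "dns_failure" - the endpoint name does not resolve from here
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.gaierror:
        return "dns_failure"
    except socket.timeout:
        return "timeout"
    except ConnectionRefusedError:
        return "refused"
    except OSError as exc:
        # Anything else, e.g. "network unreachable".
        return f"error: {exc}"


if __name__ == "__main__":
    # RDS endpoint quoted earlier in this thread.
    host = "database-2.currqrb44ezt.us-east-2.rds.amazonaws.com"
    print(host, "->", probe_tcp(host))
```

Running it once from a laptop and once from inside a pod in the cluster may help show whether the database is only unreachable from one side.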