Has anybody ever encountered datacatalog pod hang...
# ask-the-community
z
Has anybody ever encountered the datacatalog pod hanging in the init container stage? My pod seems to be stuck in the waiting status forever, yet once I deleted the init container from the pod, the pod came up. Also, I'm wondering what the datacatalog migrate step actually does. Would it be OK to skip this step if no DB changes occur?
d
I think you understand correctly. The datacatalog migrate step updates the DB schema if necessary, adding or removing fields and/or tables. Can you provide a little more information on what is stuck? Can you get logs from the init container, or dump the Pod YAML from k8s?
k
Hmm, stuck usually means it is unable to reach the database.
Logs would really help
z
I didn’t find any log anywhere as the container is not even up. K8s event says it’s waiting for image pulling.
I am pretty sure the registry is set up properly as once I removed the datacatalog migrate step, the pod is back to normal
And the logs are the following:
time="2022-11-14T13:11:41Z" level=info msg="Using config file: [/etc/datacatalog/config/db.yaml /etc/datacatalog/config/logger.yaml /etc/datacatalog/config/server.yaml /etc/datacatalog/config/storage.yaml]"
{"json":{"src":"viper.go:398"},"level":"debug","msg":"Config section [database] updated. No update handler registered.","ts":"2022-11-14T13:11:41Z"}
{"json":{"src":"viper.go:398"},"level":"debug","msg":"Config section [datacatalog] updated. No update handler registered.","ts":"2022-11-14T13:11:41Z"}
{"json":{"src":"viper.go:398"},"level":"debug","msg":"Config section [application] updated. No update handler registered.","ts":"2022-11-14T13:11:41Z"}
{"json":{"src":"viper.go:398"},"level":"debug","msg":"Config section [storage] updated. No update handler registered.","ts":"2022-11-14T13:11:41Z"}
{"json":{"src":"service.go:160"},"level":"info","msg":"Serving DataCatalog http on port :8080","ts":"2022-11-14T13:11:41Z"}
{"json":{"src":"stow_store.go:396"},"level":"warning","msg":"stow configuration section missing, defaulting to legacy s3/minio connection config","ts":"2022-11-14T13:11:41Z"}
{"json":{"app_name":"datacatalog","src":"service.go:92"},"level":"info","msg":"Created data storage.","ts":"2022-11-14T13:11:46Z"}
{"json":{"app_name":"datacatalog","src":"service.go:103"},"level":"info","msg":"Created DB connection.","ts":"2022-11-14T13:11:46Z"}
{"json":{"src":"server.go:96"},"level":"info","msg":"Starting profiling server on port [10254]","ts":"2022-11-14T13:11:46Z"}
{"json":{"src":"service.go:132"},"level":"info","msg":"Serving DataCatalog Insecure on port :8089","ts":"2022-11-14T13:11:46Z"}
It looks pretty normal
d
What is the image for the init container? Does it not match the image for the regular container?
z
it’s the same, I deployed it from the flyte helm chart.
d
Oh OK, then it doesn't make sense that k8s would be waiting for image pulling.
z
Yeah. I guess the image was pulled, yet it got stuck migrating.
Does the migration have anything to do with the object store?
d
it should not. it just creates the database and correct schema to store the information datacatalog requires. This is the first time you have datacatalog running then right? Like you're setting up Flyte?
Can you confirm that the postgres instance that is defined in the datacatalog configuration is running and accessible?
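One quick way to check reachability is a plain TCP probe against the database host and port, run from somewhere with the same network access as the pod. This is only a minimal sketch: `can_connect` is a hypothetical helper, not part of Flyte, and an open port says nothing about whether the credentials or the database name are valid.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Calling `can_connect(host, 5432)` with the host from db.yaml returns `False` on a refused connection, a timeout, or a name that does not resolve.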
z
Yeah, first time. I tried to redeploy datacatalog a couple of times. None of the attempts worked.
I am using an RDS instance, it looks pretty healthy
Flyteadmin was pretty ok with the db as I was able to run various flytectl get commands to check the metadata at the same time.
d
OK, if that's the case then I suspect there is some issue with the datacatalog DB connection code. I'm looking through it a little bit here.
Do you mind dumping the DB configuration from both flyteadmin and datacatalog? Please replace any sensitive fields (hostnames, etc.) with random placeholders.
z
Sure. I am pretty sure I used the same setup for those two, though, as they are configured in the same place in the helm charts.
It's in the ConfigMap, right?
d
should be. i'll be honest, i'm not intimately familiar with the helm charts. maybe i should take a look there quick 😄
z
db.yaml: |
  database:
    dbname: datacatalog
    host: pgm-*
    passwordPath: /etc/db/pass.txt
    port: 5432
    username: flyteadmin

db.yaml: |
  database:
    dbname: flyteadmin
    host: pgm-*
    passwordPath: /etc/db/pass.txt
    port: 5432
    username: flyteadmin
it’s basically the same
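To make "basically the same" concrete, the two database sections can be compared key by key. This is a rough sketch using the values posted above, typed in as plain dicts; `config_diff` is a hypothetical helper written for this comparison.

```python
def config_diff(a: dict, b: dict) -> dict:
    """Return {key: (a_value, b_value)} for every key whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Values copied from the two db.yaml sections above (host already redacted).
datacatalog_db = {
    "dbname": "datacatalog",
    "host": "pgm-*",
    "passwordPath": "/etc/db/pass.txt",
    "port": 5432,
    "username": "flyteadmin",
}
flyteadmin_db = {
    "dbname": "flyteadmin",
    "host": "pgm-*",
    "passwordPath": "/etc/db/pass.txt",
    "port": 5432,
    "username": "flyteadmin",
}

print(config_diff(datacatalog_db, flyteadmin_db))
# → {'dbname': ('datacatalog', 'flyteadmin')}
```

The only difference is the database name, so flyteadmin's successful queries do not exercise anything specific to the `datacatalog` database.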
d
Alright, I'm having some difficulty repro-ing this. If there is a problem with the DB connection it should error out and not just hang. Can you provide the InitContainers section from kubectl describe pod <datacatalog pod>, like this? And then can we try to check the init container logs here?
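The same init container information is also available in machine-readable form from `kubectl get pod <datacatalog pod> -o json`, under `status.initContainerStatuses`. Below is a minimal sketch of summarizing it. `init_container_states` is a hypothetical helper, and the sample pod object, including the container name `run-migrations` and the `ImagePullBackOff` reason, is made up for illustration rather than taken from this deployment.

```python
import json


def init_container_states(pod: dict) -> list[tuple[str, str, str]]:
    """Return (name, state, reason) for each init container in a pod object."""
    results = []
    for status in pod.get("status", {}).get("initContainerStatuses", []):
        state = status.get("state", {})
        # Normally exactly one of waiting / running / terminated is set.
        kind = next(iter(state), "unknown")
        reason = state.get(kind, {}).get("reason", "")
        results.append((status["name"], kind, reason))
    return results


# Made-up sample in the shape of `kubectl get pod -o json` output.
SAMPLE = """
{
  "status": {
    "initContainerStatuses": [
      {"name": "run-migrations",
       "state": {"waiting": {"reason": "ImagePullBackOff"}}}
    ]
  }
}
"""

for name, kind, reason in init_container_states(json.loads(SAMPLE)):
    print(f"{name}: {kind} {reason}".rstrip())
```

Once the init container name is known, `kubectl logs <datacatalog pod> -c <init container name>` should show its logs, provided the container has actually started.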