Blake Jackson 11/13/2023, 7:15 PM
and are seeing flyteadmin CPU hover around 99% consistently. We've increased the CPU multiple times, each time resulting in the same behavior. Currently, we have the CPU set to
, but what's strange is that we were running before(
) on less
. Does anyone have any thoughts as to what could be causing this? Should we look into anything in particular to performance tune? Is there a recommended resources config someone could point me to?
NAME                         CPU(cores)   MEMORY(bytes)
flyteadmin-c75645575-qprvl   3229m        421Mi
flyteadmin-c75645575-xcq8b   2m           67Mi
Blake Jackson 11/14/2023, 2:17 AM
Eduardo Apolinario (eapolinario) 11/14/2023, 3:22 AM
Blake Jackson 11/14/2023, 2:23 PM
warning and the logs were much more populated because of that. During this same time period, the DB performed nominally, showing hardly any load and the top waits were minimal on
SLOW SQL >= 200ms
The strangest part is that we recreated the pods multiple times, and every time the new pod still used 100% CPU. Stranger still, while I was profiling, the CPU usage finally dropped (see image). I have one CPU profile from before and one from this morning that I can share, but unfortunately nothing stands out to me. I'm attaching the flame graphs as images as well.
SELECT * FROM "tags" WHERE ("tags"."artifact_id","tags"."dataset_uuid") IN (($1,$2))
Blake Jackson 11/14/2023, 3:09 PM
or do you need to see the entire SQL statement? If the former, I can grab a few more. If the latter, I need to confirm there's nothing in there I can't share.
UPDATE "executions" ...
SELECT * FROM "task_executions" WHERE "task_executions"."project" = ...
UPDATE "node_executions" SET ...
SELECT * FROM "executions" WHERE "executions"."execution_project"...