# flyte-support
d
Morning. How do folks track cloud provider costs on a per-workflow or even per-workflow-execution basis? I understand it's difficult because of Kubernetes bin packing. We've been able to get aggregate costs by having Flyte schedule pods only on Flyte-specific nodes, but wanted to hear if more granularity is possible.
s
Hey @delightful-queen-21464, Flyte does ship with a series of Grafana dashboards that provide some metrics to show overall size/utilization of the cluster, but not explicit cost info. Cost observability is a feature under active development in Union. As you said, per-workflow/per-execution cost will ultimately be an estimate, but we can get closer than aggregate node uptime. I would love to grab a few minutes with you to understand your current user flow (some other AV folks have a process of estimating cost per scene and I would like to learn how you think about this). Down for a chat?
c
Hey @silly-toddler-37820, thanks for taking the time to respond. I was the one that originally had this question. Basically, I just want to know the best practice for tracking costs across workflows. Right now we just have the aggregate cost for all Flyte usage, but we would like a bit more granularity. Happy to chat if necessary.
s
@chilly-beach-76650 Flyte already tags k8s resources (pods) with information about the task/workflow/namespace they are associated with. From there you'd want to deploy a monitoring solution (Prometheus, Datadog, etc.) that collects allocated memory, CPU, and so on. Then you need to translate those allocated values into a cost estimate, using the cost of the node each pod ran on. Some decisions are involved in splitting node costs across pods, since multiple pods from unrelated workflows can run on the same node. I think there are a handful of OSS tools that can help with this.
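To make the translation step concrete, here is a rough sketch of one way to do it, not a Flyte feature or any tool's actual method. It assumes you have already exported, per pod, the workflow it belongs to (read from whatever workflow label your pods carry), the node it ran on, its CPU and memory requests, and how long it ran. The `Node` and `PodUsage` types, the `cost_per_workflow` function, and the 50/50 CPU/memory weighting are all made up for illustration. Each pod is charged for the share of the node it requested, so idle node capacity is left unattributed; that is one of the decisions mentioned above, and you may prefer to spread idle cost across workflows instead.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Node:
    hourly_cost: float         # price per hour, taken from your cloud bill or price list
    cpu_cores: float           # allocatable CPU cores on the node
    memory_gib: float          # allocatable memory on the node, in GiB


@dataclass
class PodUsage:
    workflow: str              # workflow identifier read from the pod's labels (assumed)
    node: str                  # name of the node the pod ran on
    cpu_request: float         # requested CPU cores
    memory_request_gib: float  # requested memory, in GiB
    hours: float               # how long the pod ran


def cost_per_workflow(pods, nodes, cpu_weight=0.5):
    """Estimate cost per workflow from pod resource requests.

    A pod's share of its node is a weighted blend of its CPU share and its
    memory share. cpu_weight is an arbitrary knob, not a standard value.
    """
    totals = defaultdict(float)
    for pod in pods:
        node = nodes[pod.node]
        cpu_share = pod.cpu_request / node.cpu_cores
        mem_share = pod.memory_request_gib / node.memory_gib
        share = cpu_weight * cpu_share + (1 - cpu_weight) * mem_share
        totals[pod.workflow] += share * node.hourly_cost * pod.hours
    return dict(totals)


if __name__ == "__main__":
    # Made-up numbers purely for illustration.
    nodes = {"n1": Node(hourly_cost=1.0, cpu_cores=8, memory_gib=32)}
    pods = [
        PodUsage("train", "n1", cpu_request=4, memory_request_gib=8, hours=2),
        PodUsage("etl", "n1", cpu_request=2, memory_request_gib=16, hours=1),
    ]
    print(cost_per_workflow(pods, nodes))
```

To get per-execution numbers instead, group by an execution identifier rather than the workflow name in the same loop.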
c
thanks for the detailed response