# flytekit
d
the new Gantt chart feature is super awesome! thanks for this release. it does, however, re-raise the question of why there is so much runtime overhead when launching a workflow
cc @Varun Kulkarni
we could make this a task, but there are functional differences between tasks/workflows so we'd rather not
also, another question: why does cache reading take 13s?
k
that sounds wrong
it seems the deployment needs to be optimized
we can help
d
Something wrong with our propeller setup?
k
Ya seems like it
d
update: flagging this for the admins
actually, they're here 🙂 @Babis Kiosidis thoughts on this?
b
can take a look next week 👍 thanks
is this cache latency affected by the workflow-reeval-duration config, which we intentionally set to 30s (from the default 10s) to reduce the number of checks during peak hours? wdyt @Ketan (kumare3)?
k
I think the biggest thing would be max-streak-length
We can safely make it 10 or even more
b
we set:
max-streak-length: 10
workflow-reeval-duration: 10s
but both had 0 effect on these executions
Would it make sense to set a lower value for downstream eval? The current value is 30s. Would 5s work, or is there a risk of side effects?
downstream-eval-duration: 30s
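For reference, a minimal sketch of how the three settings mentioned in this thread might sit together. It assumes they all live under the top-level propeller section of the FlytePropeller config; the exact nesting is an assumption and may differ in a Helm values file. The values are the ones quoted above, not recommendations:

```yaml
# Sketch only: the nesting under "propeller" is assumed, values are taken from this thread.
propeller:
  # Maximum number of consecutive evaluation rounds for one workflow in a single pass
  max-streak-length: 10
  # How often a workflow is re-enqueued for evaluation when nothing else triggers it
  workflow-reeval-duration: 10s
  # How often downstream nodes are re-evaluated after an upstream change
  downstream-eval-duration: 30s
```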
k
@Babis Kiosidis that is interesting. Downstream eval is more important than workflow re-eval. But it seems there is some other problem. Are there too many old workflows hanging around? cc @Dan Rammer (hamersaw)
b
about 4000 Flyte workflow resources in our GKE cluster currently, which seems a bit elevated
k
That is tiny
Let me ping
b
we'd expect around 2000, give or take, at this time of day i guess (rough numbers)
k
I have not seen a problem till 50k+
But, mostly because of k8s throttling
We can also use sharded propeller
d
sounds fancy 😉