# ask-the-community
s
If I understand correctly, `@dynamic` is a workflow whose DAG is created during runtime (unlike `@workflow`, whose DAG is created during compile time), which makes it more flexible. So my question is, what are the downsides of `@dynamic` compared to a static `@workflow`? Why aren’t all Flyte workflows `@dynamic` (kind of like how all PyTorch computational graphs are by default dynamic)? Are there meaningful performance penalties to using `@dynamic`?
g
You don't get the safety of `@workflow` (compiled before registration). Because dynamics are compiled at runtime, you can run into runtime issues that compiling the workflow would have caught (in my experience).
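To make the safety point concrete, here is a toy model in plain Python. It does not use flytekit, and `Step`, `validate`, `register_static`, and `dynamic_run` are hypothetical names invented for this sketch, not Flyte APIs. It only illustrates the general idea: a graph that is known up front can be checked before anything runs, while a graph built from an input can only be checked once that input arrives.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Step:
    """Hypothetical stand-in for a task: a function plus its declared types."""
    name: str
    fn: Callable
    in_type: type
    out_type: type


def validate(chain: List[Step]) -> None:
    """Raise TypeError if a step's output type does not match the next step's input type."""
    for prev, nxt in zip(chain, chain[1:]):
        if prev.out_type is not nxt.in_type:
            raise TypeError(
                f"{prev.name} -> {nxt.name}: "
                f"{prev.out_type.__name__} does not match {nxt.in_type.__name__}"
            )


def run(chain: List[Step], value):
    """Feed the value through every step in order."""
    for step in chain:
        value = step.fn(value)
    return value


double = Step("double", lambda x: x * 2, int, int)
to_text = Step("to_text", str, int, str)


def register_static(chain: List[Step]) -> List[Step]:
    """Static style: the wiring is fixed, so it is validated before any run."""
    validate(chain)
    return chain


def dynamic_run(value: int, stringify_first: bool):
    """Dynamic style: the wiring depends on an input, so validation happens inside the run."""
    chain = [to_text, double] if stringify_first else [double, to_text]
    validate(chain)
    return run(chain, value)
```

In this sketch `register_static([to_text, double])` fails immediately, whereas `dynamic_run` only fails for the inputs that select the mis-wired branch, which is why local testing catches such bugs only if it exercises that branch.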
s
Right - but wouldn’t that be caught during local testing?
g
For the most part, yes. There's also the overhead of storing the entire DAG in the head dynamic node. I love dynamics, but have had issues with scale (when running 1,000s of tasks in a dynamic)
s
Ah ok, this was what I wanted to know: whether there are performance tradeoffs I need to worry about, since I don’t understand the internals very well.
Thank you so much for the input! BTW, if you don’t mind me asking, what workflow required 1,000s of tasks? @Greg Gydush
g
We have thousands of samples, each sample can have 50GB+ of data, so we run at least 1 task per sample, leading to thousands of tasks!
k
From a scale POV, dynamic should behave similarly to static, but you won’t be able to see the workflow before runtime
This can be a dealbreaker
Also caching is limited
s
@Ketan (kumare3) thanks - but why is caching limited?
k
You can generate a task, so how can Flyte know if the generated task is similar to a previous one?
Though for the most part it works great today, after carefully orchestrating this
s
Couldn’t you technically compare the code at runtime instead of compile time to see if it is cached? I guess that’s easier said than done?
g
> You can generate a task, how can Flyte know if the generated task is similar to previous
Can you comment on how this is done at the workflow level compared to dynamics? I thought caching was just interface/inputs/`cache_version`, which you'd have for both workflows and dynamics
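As a rough illustration of that mental model, here is a sketch of a cache key derived only from the interface, the inputs, and the cache version. This is an assumption-laden toy, not Flyte's actual implementation: `cache_key` is a hypothetical helper, and including the task name in the key is my own assumption.

```python
import hashlib
import json


def cache_key(task_name: str, interface: dict, cache_version: str, inputs: dict) -> str:
    """Hypothetical cache key: a hash over task name, interface, cache version, and input values."""
    payload = json.dumps(
        {
            "task": task_name,               # assumption: task identity is part of the key
            "interface": interface,          # declared input/output types
            "cache_version": cache_version,  # bumped by the author to invalidate old results
            "inputs": inputs,                # concrete input values for this call
        },
        sort_keys=True,  # stable ordering so equal dicts serialize identically
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


interface = {"inputs": {"sample_id": "str"}, "outputs": {"o0": "int"}}

same_a = cache_key("process_sample", interface, "1.0", {"sample_id": "s1"})
same_b = cache_key("process_sample", interface, "1.0", {"sample_id": "s1"})
new_version = cache_key("process_sample", interface, "1.1", {"sample_id": "s1"})
new_input = cache_key("process_sample", interface, "1.0", {"sample_id": "s2"})
```

Under this model nothing in the key depends on whether the call came from a static workflow or a dynamic one, which is the crux of the question; any limitation would have to come from tasks whose definition itself is generated at runtime.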