# ask-the-community
p
Hi everyone, I've encountered an intriguing pattern while working with a dynamic workflow containing over 10 tasks. The node IDs follow a predictable sequence up to dn9 (dn0, dn1, dn2, ..., dn9), but starting from dn10 they switch to seemingly random strings like "fhfyukpa". I see this both in the Flyte dashboard and when I try to access the inputs and outputs of these nodes during cluster runs. What is the reason for this, and what logic generates these IDs? Knowing it would help me reach the inputs and outputs in cluster runs.
To make this clearer, I've attached a screenshot illustrating the issue with a simplified example. Up to dn9, the node naming follows a predictable pattern, but beyond that point random-looking strings are used as node names. I need to be able to anticipate the node IDs in order to retrieve their inputs and outputs. How can I query the node IDs in these cases, and what drives the shift in the naming pattern after the 10th node?
d
Do you have a deeply nested structure here? Node IDs are hashed if they exceed a certain number of characters, which is what is happening in this scenario.
p
Oh, that makes sense. Could you tell me what the hash function is, or where I can find it?
I need the hashed ID to query the node execution data, but from the naming pattern I can only build the unhashed ID.
d
Hey @Péter Kun, code is located here. It may make sense to make this value configurable; it's kept relatively small to reduce storage requirements in etcd (workflow execution status) and the admin DB.
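For a rough idea of what this kind of ID shortening looks like, here is a minimal Python sketch. It is not Flyte's actual code: the function name `shorten_node_id`, the 20-character limit, the FNV-1a hash, and the lowercase base32 encoding are all assumptions chosen for illustration, so the output will not necessarily match the IDs you see in the dashboard. Check the linked source for the real hash, alphabet, and length limit before relying on it.

```python
import base64

# Standard 32-bit FNV-1a constants.
FNV_OFFSET_BASIS_32 = 0x811C9DC5
FNV_PRIME_32 = 0x01000193


def fnv1a_32(data: bytes) -> int:
    """Compute the 32-bit FNV-1a hash of a byte string."""
    h = FNV_OFFSET_BASIS_32
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_32) & 0xFFFFFFFF  # keep the value within 32 bits
    return h


def shorten_node_id(node_id: str, max_length: int = 20) -> str:
    """Return node_id unchanged if it fits, otherwise a short hash of it.

    max_length=20 is an assumed placeholder, not a confirmed Flyte value.
    """
    if len(node_id) <= max_length:
        return node_id
    digest = fnv1a_32(node_id.encode("utf-8")).to_bytes(4, "big")
    # Base32 without padding, lowercased so the result is a short alphanumeric ID.
    return base64.b32encode(digest).decode("ascii").rstrip("=").lower()


if __name__ == "__main__":
    # Short IDs pass through untouched.
    print(shorten_node_id("dn9"))
    # A hypothetical long, nested ID gets replaced by a fixed-length hash.
    print(shorten_node_id("some-parent-node-prefix-dn10"))
```

The point of the sketch is the shape of the logic: the hash is deterministic, so if you can build the full unhashed ID (including any parent-node prefix), you can recompute the hashed ID yourself once you know the real function and threshold.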
p
Great, thanks a lot!