Howdy! I've started investigating image streaming in GCP, as explained here, instead of the usual docker pull from GCR. I'm seeing image pulls drop from upwards of 120 seconds to ~5 seconds! I think this will really help speed up task initialization times in our workflows.
There are some caveats:
• You must use Artifact Registry. Note that GCR is deprecated anyway.
• The first pull is a bit slower (4 min instead of 2 min in this particular case), but subsequent pulls on new nodes benefit from the stream cache. This cost is amortized by any workflow with two tasks that run on two separate nodes one after the other, which is most of our workflows.
◦ By "first pull" I mean just one pull anywhere, so that the streaming service can build its layer cache wherever it does that. I do not mean the first pull on each machine. Just wanted to clarify this important difference!
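To make the amortization concrete, here is a rough back-of-the-envelope sketch using only the timings quoted above (2 min regular pull, 4 min first streaming pull, ~5 s for later streaming pulls). The function and constant names are made up for illustration, and it assumes each node pulls the image once, one after the other. These are single observations, not benchmarks, so treat the output as a ballpark.

```python
# Rough timings from the message above, in seconds (single observations, not benchmarks).
REGULAR_PULL_S = 120        # usual docker pull on every new node
STREAM_FIRST_PULL_S = 240   # the one-time slower first streaming pull
STREAM_LATER_PULL_S = 5     # streaming pulls once the cache is built


def total_pull_seconds(num_nodes, first_pull_s, later_pull_s):
    """Total pull time when num_nodes nodes each pull the image once, in sequence."""
    if num_nodes <= 0:
        return 0
    return first_pull_s + (num_nodes - 1) * later_pull_s


def compare(num_nodes):
    """Return (regular_total, streaming_total) in seconds for num_nodes pulls."""
    regular = total_pull_seconds(num_nodes, REGULAR_PULL_S, REGULAR_PULL_S)
    streaming = total_pull_seconds(num_nodes, STREAM_FIRST_PULL_S, STREAM_LATER_PULL_S)
    return regular, streaming


for n in (1, 2, 3, 5):
    regular, streaming = compare(n)
    print(f"{n} node(s): regular={regular}s, streaming={streaming}s")
```

With these numbers, two nodes come out about even and anything beyond that is clearly ahead. Since the slow first pull only happens once anywhere for a given image, later runs of the same workflow should skip it entirely and only see the fast pulls.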
Has anyone integrated this streaming approach into their deployment? Any concerns or gotchas to be aware of?