We've been wondering if anyone has experience running a multi-arch cluster/workflow.
Currently, we are running the Flyte cluster on amd64 but have a task where we would like to, for example, build a Docker image using kaniko on an arm64 node.
What keeps us from doing so is that flyte-copilot seems to use the amd64-built image and does not "know" that it is going to be executed on an arm64 node (which is ensured by using node selection).
Is there a way, besides hosting the full setup on arm64, to make flyte-copilot more "flexible"?
04/06/2023, 1:33 PM
Hmm, do you use raw containers?
Also, copilot can be built for multiple architectures.
There are people who use Flyte with arm (Lyft, I think).
You should simply build your image for arm or as a multi-arch image, and then flytekit will work (you do not need copilot in this case).
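For reference, the node selection mentioned above can be sketched as a plain pod-spec fragment. This is only an illustration: `kubernetes.io/arch` is the standard Kubernetes node label, but the helper name `arm64_pod_spec`, the container name, and the image tag are placeholders, and how the fragment is attached to a Flyte task is not shown here.

```python
def arm64_pod_spec(image, container_name="primary"):
    """Return a pod-spec fragment that schedules one container on arm64 nodes.

    The image is assumed to be built for arm64 (or as a multi-arch image);
    container_name is a placeholder, not a value taken from the thread.
    """
    return {
        # Well-known node label set by the kubelet on every node.
        "nodeSelector": {"kubernetes.io/arch": "arm64"},
        "containers": [{"name": container_name, "image": image}],
    }


if __name__ == "__main__":
    # Hypothetical image reference, for illustration only.
    print(arm64_pod_spec("registry.example.com/edge/builder:latest"))
```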
04/06/2023, 8:23 PM
Is there a reason why you want to execute the docker build as part of the Flyte workflow? A pattern we've been following is using Flyte to train our main model artifact and then using something like GitHub Actions to do the container build step, passing in the S3 path of the artifact as an input to the GitHub workflow. You also get easy access to runners w/ different architectures that way.
Ferdinand von den Eichen
04/11/2023, 11:54 AM
Sometimes you need to build several images as part of a pipeline. We have one such case where multiple - i.e. dynamically 10-50 - models need to be baked into multiple (arm based) images for Edge deployments…
04/11/2023, 7:22 PM
Got it - even in this case, there are a few ways to do this I could see that avoid needing to actually perform the container builds in the orchestration system. Two slightly different approaches would be:
1. The Flyte pipeline dynamically fans out to train N models and saves them to a location in S3 with a common prefix. Then, a downstream GitHub Actions job reads all model artifacts from that location and creates/pushes the container images (even better, if the majority of your container's contents is the same, the only layer that may differ is the model)
2. The Flyte pipeline can trigger GitHub Actions or some other CI system to run a job, with an input argument that points at the saved model artifact (via the `workflow_dispatch` API in the GitHub Actions case)
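A rough sketch of option 2, assuming the GitHub REST endpoint for workflow dispatch events and using only the standard library. The repository, workflow file name, and the `model_uri` input are hypothetical; the target workflow would have to declare a matching `workflow_dispatch` input, and the token needs permission to trigger workflows.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"


def build_dispatch_request(owner, repo, workflow_file, ref, model_uri, token):
    """Build (but do not send) a workflow dispatch request for one model artifact."""
    url = (
        f"{GITHUB_API}/repos/{owner}/{repo}"
        f"/actions/workflows/{workflow_file}/dispatches"
    )
    # "model_uri" is an assumed input name; it must match the workflow definition.
    payload = {"ref": ref, "inputs": {"model_uri": model_uri}}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def trigger_build(request):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request) as response:
        return response.status
```

A Flyte task downstream of the training step could call `build_dispatch_request(...)` once per saved model and pass the result to `trigger_build`, so the 10-50 image builds run on CI runners of the right architecture instead of inside the cluster.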
Also you may be able to get away w/ docker buildx to produce multi-arch images even if your host isn't an arm-based host
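In case it helps, a minimal sketch of assembling that buildx invocation from Python. The helper name `buildx_command` and the image tag are made up for illustration; building arm64 layers on an amd64 host generally relies on QEMU emulation being set up for the builder, which this sketch does not handle.

```python
import shlex


def buildx_command(tag, context=".", platforms=("linux/amd64", "linux/arm64"), push=True):
    """Return the argv for a multi-platform `docker buildx build`."""
    cmd = ["docker", "buildx", "build", "--platform", ",".join(platforms), "--tag", tag]
    if push:
        # Multi-platform results are usually pushed straight to a registry.
        cmd.append("--push")
    cmd.append(context)
    return cmd


if __name__ == "__main__":
    # Hypothetical image reference; print the command instead of running it.
    print(shlex.join(buildx_command("registry.example.com/edge/model-a:latest")))
```

The returned list can be handed to `subprocess.run(cmd, check=True)` on a machine where Docker and a buildx builder are available.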