Hello ^_^ Before trying this out myself, have there been any attempts at, or thoughts on, enabling nvidia-acceleration for the single container demo cluster?
No, there have been no attempts. If you happen to do it please share 🤯
Ok, got it working finally... based the sandbox on
instead of
and installed k3s and crictl during docker build... haven't worked with docker buildx before, so not sure how to beautify it ^_^ will see if I can clean it up a little bit further
The sandbox-cuda container comes in at a hefty 3.7GB, but will come in handy for our engineers I hope. Really nice sandbox you got there! 🙂
@Björn happy to host it with the core Flyte sandbox
And also add flytectl demo start --gpu?
Neat. We could also possibly consider a “multi-node” setup where one of the nodes has GPU drivers and can be fired up in an opt-in way. GPU workloads can then be configured to schedule on this node via the usual affinity.
How would you prefer to have it delivered? As a PR or just as files that you can fit into the build system according to your wishes? If PR, should I add it as a build target for the Makefile in sandbox-bundled or make a new sandbox directory you think?
i’m thinking probably a build target in sandbox-bundled. if you can open a draft PR, we can discuss further. Perhaps we can reorganize the current default stage to be based on some version of ubuntu that would make layering GPU drivers+cuda on easier too.
Will do!
Ack, realised I forgot to add the k3s config for the local repo 😕
ah well, let's see what you think about the PR first ^_^
To run the image you need to have an nvidia-enabled docker... I did this by installing (in ubuntu 20.04)
and use
like so:
{
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
and then restart docker
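For anyone who already has other settings in their Docker daemon config, here is a minimal sketch of merging the runtime entry quoted above into an existing config without overwriting anything. It assumes the fragment belongs in Docker's daemon configuration file (commonly /etc/docker/daemon.json on Linux); the helper name `add_nvidia_runtime` and the `log-driver` example setting are made up for illustration.

```python
import copy
import json

# Runtime entry copied from the fragment quoted in the thread.
NVIDIA_RUNTIME = {"path": "nvidia-container-runtime", "runtimeArgs": []}


def add_nvidia_runtime(config: dict) -> dict:
    """Return a copy of a Docker daemon config with the nvidia runtime registered.

    Hypothetical helper: existing keys and other runtimes are preserved,
    and the input dict is left unmodified.
    """
    merged = copy.deepcopy(config)
    merged.setdefault("runtimes", {})["nvidia"] = copy.deepcopy(NVIDIA_RUNTIME)
    return merged


if __name__ == "__main__":
    # Hypothetical existing config with one unrelated setting.
    existing = {"log-driver": "json-file"}
    print(json.dumps(add_nvidia_runtime(existing), indent=2))
```

The sketch only builds the merged config in memory; writing it back to disk and restarting docker are left to the reader.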
also, during docker run you need to pass
--gpus all
as an arg
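A rough sketch of where that flag goes, in case you script the launch. The function name `gpu_run_command` and the image tag `sandbox-cuda:local` are placeholders, not the real image name, and any other flags the sandbox needs are not shown.

```python
import shlex


def gpu_run_command(image: str, *extra_args: str) -> list[str]:
    """Build a `docker run` argv that passes `--gpus all` to the container.

    Hypothetical helper: it only assembles the argument list and does not
    start anything.
    """
    return ["docker", "run", "--gpus", "all", image, *extra_args]


if __name__ == "__main__":
    # "sandbox-cuda:local" is a placeholder tag used for illustration only.
    print(shlex.join(gpu_run_command("sandbox-cuda:local")))
```

The resulting list can be handed to `subprocess.run` once the real image name and remaining options are filled in.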
we will need to run on a cloud instance with gpu 🙂
are y’all running this on developer machines @Björn?
just curious if users get gpu-enabled dev machines, run their own hardware, or get a gpu-enabled VM in the cloud.
also some additional related docs: https://k3d.io/v5.4.6/usage/advanced/cuda/
@jeev During development I have been running this on a workstation at home... Hopefully our data scientists can run this either on their GPU-enabled workstations or on a Vertex AI Workbench in GCP with an attached T4 GPU
Saw some references to local k3d development as well, but not until I submitted the PR ^_^ I think the demo setup is really handy
@jeev I put some links in the PR from where I found info... This one was a nice one: https://itnext.io/enabling-nvidia-gpus-on-k3s-for-cuda-workloads-a11b96f967b0
awesome thanks. will take a look tomorrow 🙂