#ask-the-community

Cornelis Boon

03/20/2024, 11:07 AM
Hmm. I see that when you use GPUAccelerator like in:
@task(container_image=image_spec, requests=Resources(cpu="1", mem="2G"), accelerator=GPUAccelerator("nvidia-l4"), limits=Resources(cpu="4", mem="7G", gpu="1"))
It will set
affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: k8s.amazonaws.com/accelerator
            operator: In
            values:
            - nvidia-l4
in the pod spec of the task (even when you're not on AWS...).
Ah. I need to provide a gpu-device-node-label in the values to configure this correctly.

David Espejo (he/him)

03/22/2024, 4:18 PM
Hey Cornelis, could you collect all the gaps you find in the GPU Accelerators documentation in an issue? It's a great feature that I think deserves a more descriptive doc.

Cornelis Boon

03/22/2024, 4:27 PM
Honestly, that's it. The tolerations explanation is clear here: https://docs.flyte.org/en/latest/user_guide/productionizing/configuring_access_to_gpus.html. The docs just need to add gpu-device-node-label and gpu-partition-size-node-label, such as here https://github.com/flyteorg/flyte/blob/b6f35add1227c930e9208103235d89fe4b864b8e/charts/flyte-binary/gke-starter.yaml#L82-L83, to make it work fully on platforms other than AWS.
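
For anyone landing here later, this is a rough sketch of what those two settings might look like in the Helm values for a GKE cluster. The nesting under configuration.inline.plugins.k8s is an assumption based on the flyte-binary chart layout, and the label values are the node labels GKE normally applies to GPU nodes. Check the linked gke-starter.yaml and your own node labels before relying on it.

```yaml
# Sketch only: assumed flyte-binary values layout for a GKE cluster.
configuration:
  inline:
    plugins:
      k8s:
        # Node label used as the affinity key for the GPU model.
        # If unset, the AWS-style key k8s.amazonaws.com/accelerator is used,
        # which is what shows up in the pod spec above.
        gpu-device-node-label: cloud.google.com/gke-accelerator
        # Node label used as the affinity key for the GPU partition size.
        gpu-partition-size-node-label: cloud.google.com/gke-gpu-partition-size
```

With something like this in place, GPUAccelerator("nvidia-l4") should produce a node affinity on the cloud.google.com/gke-accelerator key instead of the AWS one.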