silly-book-73230
01/13/2025, 12:04 PM
@task(
    requests=Resources(cpu="8", mem="54Gi", gpu="2"),
    limits=Resources(cpu="100", mem="1Ti"),
    pod_template=PodTemplate(
        pod_spec=V1PodSpec(
            containers=[
                V1Container(
                    name="primary",
                ),
            ],
            node_selector={
                "cloud.google.com/gke-accelerator": "nvidia-l4",
                "cloud.google.com/gke-accelerator-count": "2",
            },
        )
    ),
)
I see that Flyte also has a feature for selecting GPUs: https://docs.flyte.org/en/latest/api/flytekit/extras.accelerators.html
However, if I remove the pod_template and just add the accelerator kwarg, FlytePropeller gives the following error:
E0113 12:02:55.686281 1 workers.go:103] error syncing '-': failed at Node[-]. RuntimeExecutionError: failed during plugin execution, caused by: failed to execute handle for plugin [container]: [GKE Warden constraints violations[] failed to create resource, caused by: admission webhook "warden-validating.common-webhooks.networking.gke.io" denied the request: GKE Warden rejected the request because it violates one or more constraints.
Violations details: {"[denied by autogke-gpu-limitation]":["When requesting 'nvidia.com/gpu' resources, you must specify either node selector 'cloud.google.com/gke-accelerator' with accelerator type or node selector 'cloud.google.com/compute-class' with existing custom compute class which has at least one GPU priority rule."]}
This suggests that the right GKE config is not properly set by providing the accelerator kwarg. Is this supposed to happen? If not, what is the point of the accelerator kwarg?
gentle-tomato-480
01/13/2025, 12:43 PM
gentle-tomato-480
01/13/2025, 12:44 PM
@task(
    container_image=image_spec,
    requests=Resources(cpu="1", mem="2G"),
    accelerator=GPUAccelerator("nvidia-l4"),
    limits=Resources(cpu="4", mem="7G", gpu="1"),
    timeout=timedelta(minutes=10),
)
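For context, a minimal Python sketch of the pod fields involved here, assuming the rule is exactly as worded in the Warden error earlier in the thread. The helper names build_pod_fragment and violates_gpu_constraint are hypothetical. This is a toy model rather than GKE's or Flyte's actual code, and it ignores the extra requirement that a compute class must have a GPU priority rule.

```python
from typing import Optional

# Resource and label names are copied from the error message in the thread.
GPU_RESOURCE = "nvidia.com/gpu"
ACCELERATOR_LABEL = "cloud.google.com/gke-accelerator"
COMPUTE_CLASS_LABEL = "cloud.google.com/compute-class"


def build_pod_fragment(gpu_count: int, accelerator: Optional[str]) -> dict:
    """Return only the pod spec fields that matter for GPU scheduling (hypothetical helper)."""
    fragment = {
        "containers": [
            {
                "name": "primary",
                # The GPU count goes under limits, as in the decorator above.
                "resources": {"limits": {GPU_RESOURCE: str(gpu_count)}},
            }
        ],
        "nodeSelector": {},
    }
    if accelerator is not None:
        fragment["nodeSelector"][ACCELERATOR_LABEL] = accelerator
    return fragment


def violates_gpu_constraint(pod: dict) -> bool:
    """Toy version of the quoted rule: a GPU request needs one of the two node selectors."""
    asks_for_gpu = any(
        GPU_RESOURCE in container.get("resources", {}).get(kind, {})
        for container in pod.get("containers", [])
        for kind in ("requests", "limits")
    )
    selector = pod.get("nodeSelector", {})
    has_selector = ACCELERATOR_LABEL in selector or COMPUTE_CLASS_LABEL in selector
    return asks_for_gpu and not has_selector


if __name__ == "__main__":
    print(violates_gpu_constraint(build_pod_fragment(1, "nvidia-l4")))
    print(violates_gpu_constraint(build_pod_fragment(1, None)))
```

With a GPU limit and a gke-accelerator node selector the toy check passes. With the GPU limit alone it flags the same kind of violation as the error above. Whether the accelerator kwarg actually ends up as that node selector depends on the Flyte deployment, which this sketch does not model.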
gentle-tomato-480
01/13/2025, 12:48 PM
gentle-tomato-480
01/13/2025, 12:55 PM
gpu request should be set in the limits instead of the requests
https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/#using-device-plugins
silly-book-73230
01/13/2025, 1:54 PM
gentle-tomato-480
01/13/2025, 1:56 PM
flytekit version 1.13.7 and flyte-binary chart version 1.13.2
freezing-airport-6809
silly-book-73230
01/13/2025, 3:39 PM
silly-book-73230
01/13/2025, 3:40 PM
gentle-tomato-480
01/13/2025, 3:40 PM
silly-book-73230
01/13/2025, 3:40 PM
gentle-tomato-480
01/13/2025, 3:41 PM
gpu limit will be smart enough to find you a GPU that fulfills your resource request
gentle-tomato-480
01/13/2025, 3:42 PM