I’ve been trying to get runtime PodTemplates working, but I’m struggling. I have deployed the following PodTemplate in the flyte-backend namespace, which is where the Flyte binary is deployed. I’ve also deployed it in the Flyte project/domain workspace, but the same issue happens.
apiVersion: v1
kind: PodTemplate
metadata:
  name: flyte-template-test
  namespace: flyte-backend
template:
  spec:
    containers:
      - name: default
        image: docker.io/rwgrim/docker-noop
        terminationMessagePath: "/dev/foo"
    hostNetwork: false
Then I try to use it with a super simple test:
from flytekit import task

@task(pod_template_name="flyte-template-test")
def check() -> bool:
    return True
When I submit it, it fails with:
Workflow[playground:development:using_templates.template_wf.gpu_workflow] failed. RuntimeExecutionError: max number of system retry attempts [11/10] exhausted. Last known status message: failed at Node[n0]. RuntimeExecutionError: failed during plugin execution, caused by: failed to execute handle for plugin [container]: [BadTaskSpecification] PodTemplate 'flyte-template-test' does not exist
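Since the error says the template does not exist, the first thing worth confirming is that the PodTemplate is actually visible under that exact name and namespace. Here is a minimal sketch of that check, assuming kubectl is on the PATH and pointed at the right cluster. The helper names are my own, and `playground-development` in the usage comment is only my guess at how the project/domain namespace is named.

```python
import subprocess


def podtemplate_get_command(name: str, namespace: str) -> list[str]:
    """Build the kubectl command that looks up one PodTemplate by name."""
    return ["kubectl", "get", "podtemplate", name, "--namespace", namespace]


def podtemplate_exists(name: str, namespace: str) -> bool:
    """Return True if kubectl finds the PodTemplate (exit code 0)."""
    result = subprocess.run(
        podtemplate_get_command(name, namespace),
        capture_output=True,
        text=True,
        check=False,  # a missing object is an expected outcome, not an exception
    )
    return result.returncode == 0


# Example usage (needs kubectl and cluster access; the second namespace is assumed):
# for ns in ("flyte-backend", "playground-development"):
#     print(ns, podtemplate_exists("flyte-template-test", ns))
```

This only tells me whether the object is there from my own kubeconfig's point of view, not whether the Flyte service account can see it.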
I’ve been through the permissions, and as far as I can tell the default cluster role should already have permission:
...
resources:
- podtemplates
verbs:
- create
- delete
- deletecollection
- get
- list
- patch
- post
- update
- watch
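To rule out RBAC rather than just reading the role, the read verbs from the list above can be tested while impersonating the Flyte service account with `kubectl auth can-i`. This is a rough sketch that only builds and prints the commands. The service account name `flyte-backend-flyte-binary` in the usage comment is hypothetical, so substitute whatever account the Flyte binary pod actually runs as.

```python
import shlex


def can_i_command(verb: str, resource: str, namespace: str,
                  sa_namespace: str, sa_name: str) -> list[str]:
    """Build a `kubectl auth can-i` command that impersonates a service account."""
    return [
        "kubectl", "auth", "can-i", verb, resource,
        "--namespace", namespace,
        "--as", f"system:serviceaccount:{sa_namespace}:{sa_name}",
    ]


def permission_checks(namespaces: list[str], sa_namespace: str,
                      sa_name: str) -> list[list[str]]:
    """One command per (namespace, read verb) pair for the podtemplates resource."""
    verbs = ("get", "list", "watch")
    return [
        can_i_command(verb, "podtemplates", ns, sa_namespace, sa_name)
        for ns in namespaces
        for verb in verbs
    ]


# Example usage (service account name is hypothetical):
# for cmd in permission_checks(["flyte-backend"], "flyte-backend",
#                              "flyte-backend-flyte-binary"):
#     print(shlex.join(cmd))
```

Each printed command answers `yes` or `no` when run, which would show whether the cluster role is really bound to that account in the namespace being checked.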
Does anyone have any pointers? I should be able to do some really cool stuff once I’ve got this working.