curved-whale-1505
04/22/2025, 6:34 PM
W0422 18:31:31.977765 7 reflector.go:535] pkg/mod/k8s.io/client-go@v0.28.2/tools/cache/reflector.go:229: failed to list *v1.PyTorchJob: pytorchjobs.kubeflow.org is forbidden: User "system:serviceaccount:flyte:flyte-backend-flyte-binary" cannot list resource "pytorchjobs" in API group "kubeflow.org" at the cluster scope
E0422 18:31:31.977797 7 reflector.go:147] pkg/mod/k8s.io/client-go@v0.28.2/tools/cache/reflector.go:229: Failed to watch *v1.PyTorchJob: failed to list *v1.PyTorchJob: pytorchjobs.kubeflow.org is forbidden: User "system:serviceaccount:flyte:flyte-backend-flyte-binary" cannot list resource "pytorchjobs" in API group "kubeflow.org" at the cluster scope
Am I missing any steps to give flyte permissions after installing KF Training Operator?
curved-whale-1505
04/22/2025, 6:44 PM
pytorchjobs.kubeflow.org is forbidden: User "system:serviceaccount:flyte:flyte-backend-flyte-binary" cannot create resource "pytorchjobs" in API group "kubeflow.org" in the namespace "flytesnacks-development"
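The two errors above deny different things: `list` at the cluster scope and `create` in one namespace. As a rough sketch of how to read them, the following pulls the verb, resource, API group, and scope out of a message of this shape. The regular expression is an assumption based only on the messages quoted here, and `parseForbidden` and `DeniedAccess` are hypothetical names, not part of any Kubernetes or Flyte library.

```typescript
// Sketch only: extracts what was denied from a Kubernetes "forbidden" message.
// The pattern is an assumption derived from the messages quoted in this thread.

interface DeniedAccess {
  saNamespace: string;      // namespace of the service account
  serviceAccount: string;   // name of the service account
  verb: string;             // e.g. "list" or "create"
  resource: string;         // e.g. "pytorchjobs"
  apiGroup: string;         // e.g. "kubeflow.org"
  namespace: string | null; // null means the request was cluster-scoped
}

const FORBIDDEN_RE =
  /User "system:serviceaccount:([^:"]+):([^"]+)" cannot (\w+) resource "([^"]+)" in API group "([^"]*)" (?:at the cluster scope|in the namespace "([^"]+)")/;

function parseForbidden(message: string): DeniedAccess | null {
  const m = FORBIDDEN_RE.exec(message);
  if (m === null) {
    return null; // message does not have the expected shape
  }
  return {
    saNamespace: m[1],
    serviceAccount: m[2],
    verb: m[3],
    resource: m[4],
    apiGroup: m[5],
    // The namespace group only participates for namespaced requests.
    namespace: m[6] ?? null,
  };
}
```

Each parsed result names one verb that the service account's role would need for that resource and API group.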
curved-whale-1505
04/22/2025, 6:53 PM
curved-whale-1505
04/22/2025, 10:46 PM
early-addition-41415
04/26/2025, 9:52 PM
early-addition-41415
04/26/2025, 9:52 PM
curved-whale-1505
04/26/2025, 9:55 PM
// Create Cluster Role and ClusterRoleBinding for PyTorchJob access
const flyteKubeflowRole = new eks.KubernetesManifest(this, 'flyte-kubeflow-role', {
  cluster: props.cluster,
  manifest: [
    {
      apiVersion: 'rbac.authorization.k8s.io/v1',
      kind: 'ClusterRole',
      metadata: {
        name: 'flyte-pytorch-job-role',
      },
      rules: [
        {
          apiGroups: ['kubeflow.org'],
          resources: ['pytorchjobs', 'pytorchjobs/status'],
          verbs: ['get', 'list', 'watch', 'create', 'update', 'patch', 'delete'],
        },
      ],
    },
  ],
});
const flyteKubeflowRoleBinding = new eks.KubernetesManifest(this, 'flyte-kubeflow-rolebinding', {
  cluster: props.cluster,
  manifest: [
    {
      apiVersion: 'rbac.authorization.k8s.io/v1',
      kind: 'ClusterRoleBinding',
      metadata: {
        name: 'flyte-pytorch-job-rolebinding',
      },
      roleRef: {
        apiGroup: 'rbac.authorization.k8s.io',
        kind: 'ClusterRole',
        name: 'flyte-pytorch-job-role',
      },
      subjects: [
        {
          kind: 'ServiceAccount',
          name: flyteServiceAccountName,
          namespace: flyteNamespaceName,
        },
      ],
    },
  ],
});
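One possible way to keep the role and its binding consistent is to build both manifests from a single plain function, so the binding's `roleRef.name` cannot drift from the role's name. This is a minimal sketch, not the snippet's author's approach: `buildPyTorchJobRbac`, `RbacOptions`, and `Manifest` are hypothetical names, and the API group is a parameter only because a later message in this thread contrasts `kubeflow.org` with `trainer.kubeflow.org`.

```typescript
// Sketch only: plain-object builders for the ClusterRole and ClusterRoleBinding.
// All names below are hypothetical; nothing here calls the CDK or Kubernetes.

type Manifest = Record<string, unknown>;

interface RbacOptions {
  roleName: string;
  bindingName: string;
  serviceAccountName: string;
  serviceAccountNamespace: string;
  apiGroup?: string;    // assumed default: 'kubeflow.org', as in the snippet above
  resources?: string[]; // assumed default: PyTorchJob and its status subresource
}

function buildPyTorchJobRbac(opts: RbacOptions): Manifest[] {
  const apiGroup = opts.apiGroup ?? 'kubeflow.org';
  const resources = opts.resources ?? ['pytorchjobs', 'pytorchjobs/status'];

  const clusterRole: Manifest = {
    apiVersion: 'rbac.authorization.k8s.io/v1',
    kind: 'ClusterRole',
    metadata: { name: opts.roleName },
    rules: [
      {
        apiGroups: [apiGroup],
        resources,
        verbs: ['get', 'list', 'watch', 'create', 'update', 'patch', 'delete'],
      },
    ],
  };

  const clusterRoleBinding: Manifest = {
    apiVersion: 'rbac.authorization.k8s.io/v1',
    kind: 'ClusterRoleBinding',
    metadata: { name: opts.bindingName },
    // The binding reuses opts.roleName, so it always points at the role above.
    roleRef: {
      apiGroup: 'rbac.authorization.k8s.io',
      kind: 'ClusterRole',
      name: opts.roleName,
    },
    subjects: [
      {
        kind: 'ServiceAccount',
        name: opts.serviceAccountName,
        namespace: opts.serviceAccountNamespace,
      },
    ],
  };

  return [clusterRole, clusterRoleBinding];
}
```

The returned array could be passed as the `manifest` property of a single `eks.KubernetesManifest`, since that property accepts a list of manifest objects.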
early-addition-41415
04/26/2025, 10:05 PM
curved-whale-1505
04/26/2025, 10:10 PM
early-addition-41415
04/26/2025, 10:11 PM
curved-whale-1505
04/26/2025, 10:13 PM
average-finland-92144
04/29/2025, 8:02 PM
average-finland-92144
04/29/2025, 8:03 PM
average-finland-92144
04/29/2025, 8:22 PM
kubeflow.org vs trainer.kubeflow.org)
We can always instruct the clusterresources component of Flyte to create additional K8s resources like the ClusterRole, using something like this in the values:
clusterResourceTemplates:
  inline:
    001_clusterrole.yaml: |
      apiVersion: rbac.authorization.k8s.io/v1
      kind: ClusterRole
      metadata:
        name: 'flyte-pytorch-role'
      ...