# flyte-support
b
Hello Flyte Community, after upgrading from 1.13 to 1.14.1 we encountered "Access denied" errors. Nothing in our IAM setup had changed. On further digging, it seems the init container is missing from the pod specs in Flyte namespaces. We use a default pod template. The pod template appears in the cluster resources and the config map but is not being read properly. Looking at the changelog, I can't pinpoint what would cause this. Any help appreciated.
I feel like it has something to do with this PR: https://github.com/flyteorg/flyte/pull/5750/files
Our pod template is simple and is used to specify a volume mount.
```yaml
apiVersion: v1
kind: PodTemplate
metadata:
  name: pod-template
template:
  metadata:
    name: pod-template
  spec:
    initContainers:
      - name: init
        image: alpine
        command:
....
```
"Note: Init containers can be configured with similar granularity using “default-init” and “primary-init” init container names." https://docs.flyte.org/en/latest/deployment/configuration/general.html#runtime-podtemplates is the only reference to this change. I tried this with no luck
```yaml
spec:
  initContainers:
    - name: default-init
```
c
What init containers was your pod template targeting? I thought the only regular init container would be Flyte copilot. I'm a bit confused, since prior to my change there was no pod template support for init containers, so I'd expect the prior config not to work.
b
There was none, or at least it wasn't documented in Flyte. We made it work based on https://kubernetes.io/docs/concepts/workloads/pods/init-containers/
It worked in 1.13: we had a default pod template that defined an init container with volumeMounts for all pods created.
```hcl
plugins = {
  k8s = {
    default-pod-template-name = "pod-template"
  }
}
```
f
@clean-glass-36808 those should work
It’s default pod template
c
I didn’t think that Flyte evaluated the initContainer section of pod templates in older versions which is why I’m confused by what you “made work”.
That is the whole reason we made the change to support init containers, but I'm probably missing something here.
I took another look at the code and I still don't understand what the issue could be. v1.13.3 just added Flyte customizations to init containers which did not consult a pod template: https://github.com/flyteorg/flyte/blob/v1.13.3/flyteplugins/go/tasks/pluginmachinery/flytek8s/pod_helper.go#L352-L358
b
Can you provide an example of a pod template using an init container?
Removing all values, this is what our template looks like (based on https://kubernetes.io/docs/concepts/workloads/pods/init-containers/):
```yaml
apiVersion: v1
kind: PodTemplate
metadata:
  name: pod-template
template:
  metadata:
    name: pod-template
  spec:
    initContainers:
      - name: default-init
        image: alpine
        command:
        args:
        volumeMounts:
          - name:
            mountPath:
    containers:
      - name: default
        image: rwgrim/docker-noop
        volumeMounts:
          - name:
            mountPath:
        terminationMessagePath: /dev/foo
    volumes:
      - name:
        ephemeral:
          volumeClaimTemplate:
            spec:
              ...
```
c
```yaml
---
apiVersion: v1
kind: PodTemplate
metadata:
  name: flyte-template
  namespace: example
template:
  spec:
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: "example.com/compute"
              operator: In
              values:
              - non-gpu
    tolerations:
    - key: "example.com/compute"
      operator: "Equal"
      effect: "NoSchedule"
      value: "true"
    initContainers:
      - name: default-init
        image: docker.io/rwgrim/docker-noop
        env:
          - name: AWS_ENDPOINT_URL
            value: "example"
          - name: AWS_CA_BUNDLE
            value: /etc/ssl/certs/bundle.crt
          - name: AWS_ACCESS_KEY_ID
            valueFrom:
              secretKeyRef:
                name: "example"
                key: access-key-id
        volumeMounts:
          - name: stack-cert-bundle
            mountPath: /etc/ssl/certs
            readOnly: true
...
```
b
It doesn't seem like we have it configured wrong. Where is the pod template defined in the Helm chart? We are using flyte-binary and have it defined in two locations: `plugins.k8s` and `clusterResourceTemplates.inline`.
And just to verify that this isn't anything that changed in our cluster: reverting back to 1.13 fixes the issue.
Also, please see this issue where @average-finland-92144 helped us use initContainers: https://github.com/flyteorg/flyte/issues/5376#issuecomment-2176958407
c
We are setting it in flyte-core just for copilot, since copilot injects an init container for container tasks.
```yaml
flyte-core:
  flyte-propeller:
    copilot:
      plugins:
        k8s:
          default-pod-template-name: flyte-template
```
I'll have to look into the cluster resource template and the k8s plugin to see how those work
b
Thank you
In binary that would be `configuration.inline.plugins.k8s.default-pod-template-name`.
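For reference, in a flyte-binary values file that key path would be nested roughly like this (a sketch showing only the relevant keys; the template name is the one used earlier in this thread):

```yaml
# flyte-binary values.yaml (sketch -- only the relevant keys shown)
configuration:
  inline:
    plugins:
      k8s:
        default-pod-template-name: pod-template
```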
c
Where are your init containers coming from by the way? Are you configuring them in task config?
b
they are in the default pod template
I do not mind changing how we define init containers; however, `default-init` does not seem to work in binary.
c
I see. So you're injecting init containers into your tasks via the pod template.
I understand now.
This will probably need a wider discussion with the Flyte folks, but: you were using the pod template to inject new containers into Flyte pods, whereas I am under the impression that runtime pod templates are designed to augment the pods generated by Flyte. Per the docs:
> In this scheme, if the default PodTemplate contains a container with the name "default", that container will be used as the base configuration for all containers Flyte constructs
Since Flyte is not generating any init containers, your pod template does not match any init containers, and that is why in v1.14 you're seeing them get wiped out: https://github.com/flyteorg/flyte/blob/448aba97201ba42297282d859e6064b7f89537ae/flyteplugins/go/tasks/pluginmachinery/flytek8s/pod_helper.go#L6[…]84 Earlier versions of Flyte used the pod template's init containers verbatim, unconditionally. So if you want your tasks to have an init container in v1.14, I think you'd drop the default pod template and move to compile-time pod templates: https://docs.flyte.org/en/latest/deployment/configuration/general.html#compile-time-podtemplates
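The merge-by-name semantics described above can be illustrated with a small sketch (plain Python, not Flyte's actual Go code; the function and the dict-based container shapes are illustrative only): template entries act as base configuration for containers Flyte generates, matched by exact name or the "default-init" wildcard, so template-only entries are never injected on their own.

```python
# Sketch of name-based pod-template merging (illustrative, not Flyte's code).
# Template init containers are *base configuration*: they only apply when a
# Flyte-generated init container matches by exact name or the "default-init"
# wildcard. Template-only entries are never injected on their own.

def merge_init_containers(template, generated):
    """Merge template init containers into the ones Flyte generated.

    `template` / `generated` are lists of dicts with at least a "name" key.
    The result always has exactly len(generated) entries.
    """
    by_name = {c["name"]: c for c in template}
    merged = []
    for gen in generated:
        base = by_name.get(gen["name"]) or by_name.get("default-init") or {}
        merged.append({**base, **gen})  # generated values win over template ones
    return merged

template = [{"name": "default-init", "image": "alpine", "volumeMounts": ["certs"]}]

# Flyte generated no init containers for this task, so the template's
# init container is dropped entirely -- the v1.14 "wiped out" behavior.
print(merge_init_containers(template, []))  # []

# With something (e.g. copilot) generating an init container, the template merges in.
print(merge_init_containers(template, [{"name": "flyte-copilot-init"}]))
```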
b
We want all workflows to have the init container by default; using compile-time templates and expecting all users to decorate their tasks is a bit much.
c
Yeah, I understand. That is why I think a wider discussion with the Flyte folks would be helpful, to understand the original intention of runtime pod templates and whether we want to change the behavior moving forward, or if there are alternatives. CC: @freezing-airport-6809
b
So what I am reading here is that, at this time, v1.14 has no support for runtime init containers in flyte-binary.
c
From what I can tell, yeah, and that applies to all Flyte deployment types.
a
So I think the issue here is with the default template and not the runtime ones, right? @boundless-lifeguard-61788 would you mind filing an issue? The fact that rolling back fixes your issue is concerning.
c
I think @boundless-lifeguard-61788 was leveraging unintended behavior of how pod template merging worked prior to init containers being intentionally supported, imo.
But we can discuss on the issue
b
@clean-glass-36808 1.14 doesn't seem to support the combination of binary + runtime template + default template.
@average-finland-92144 https://github.com/flyteorg/flyte/issues/6204 Thank you