justin hallquist02/28/2023, 9:03 PM
this is ubiquitous across all of our tasks, and we're running a rather deep DAG. locally, in serial (no --remote), it runs blazing fast (a couple of minutes); however, with the ~1 min stall on each pod, it takes > 30 minutes. we'd rather not recombine tasks if we don't need to, so is there something we're missing here?
2023-02-28T15:52:39-05:00 tar: Removing leading `/' from member names
2023-02-28T15:53:09-05:00
2023-02-28T15:53:31-05:00 2023-02-28 20:53:31.706 | INFO STARTED TASK
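For anyone reading along, the stall in that excerpt can be measured directly from the three timestamps. This is only a rough sketch: the timestamps are copied from the log above, while the helper name `gaps_in_seconds` is made up for illustration.

```python
from datetime import datetime

# Timestamps copied from the pod log excerpt (ISO 8601 with UTC offset).
log_times = [
    "2023-02-28T15:52:39-05:00",  # tar: Removing leading `/' from member names
    "2023-02-28T15:53:09-05:00",  # empty log line
    "2023-02-28T15:53:31-05:00",  # line carrying the STARTED TASK message
]


def gaps_in_seconds(stamps):
    """Return the elapsed seconds between each pair of consecutive timestamps."""
    parsed = [datetime.fromisoformat(s) for s in stamps]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(parsed, parsed[1:])]


gaps = gaps_in_seconds(log_times)
total_stall = sum(gaps)
print(gaps, total_stall)
```

For this excerpt the gap between the tar line and STARTED TASK comes to just under a minute, which lines up with the ~1 min per-pod stall described above.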
Aswanth Krishnan03/01/2023, 10:33 AM
justin hallquist03/01/2023, 2:36 PM
so I believe so?
Aswanth Krishnan03/01/2023, 5:03 PM
justin hallquist03/01/2023, 6:25 PM
resort to fast registration. https://docs.flyte.org/projects/cookbook/en/latest/getting_started/package_register.html#productionizing-your-workflows IMO, this should help reduce the time, because the code will be present in the Docker image and won't be pulled from S3.
justin hallquist03/02/2023, 9:58 PM
it does not get applied?
justin hallquist03/02/2023, 10:04 PM
was set as above:
task_resource_defaults:
  # -- Task default resources parameters
  task_resources:
    defaults:
      cpu: 100m
      memory: 200Mi
      storage: 5Mi
    limits:
      cpu: 1
      memory: 1Gi
      storage: 20Mi
      gpu: 1
was not applied to the pod. the definition in the console showed the correct value, but the pod itself had the helm limit. that makes sense because those are task limits, but because there was no error stating we were going above the limit, it became a fool's errand
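Since nothing flagged the mismatch, a small pre-flight check along these lines might have caught it earlier. This is a hedged sketch, not Flyte's own logic: the limits are copied from the helm values quoted above, the requested values are hypothetical, and `parse_quantity` and `over_limit` are illustrative helpers that only handle a subset of Kubernetes quantity suffixes.

```python
# Subset of the suffixes Kubernetes accepts in resource quantities.
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
    "m": 1e-3,  # milli, e.g. cpu: 100m
}

# Limits copied from the helm values quoted in the thread.
PLATFORM_LIMITS = {"cpu": "1", "memory": "1Gi", "storage": "20Mi", "gpu": "1"}


def parse_quantity(quantity):
    """Convert a quantity such as '100m' or '1Gi' into a plain float."""
    text = str(quantity)
    # Try two-character suffixes first so 'Mi' is not mistaken for 'M'.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * _SUFFIXES[suffix]
    return float(text)


def over_limit(requested, limits):
    """Return {resource: (requested, limit)} for every request above its limit."""
    return {
        name: (value, limits[name])
        for name, value in requested.items()
        if name in limits and parse_quantity(value) > parse_quantity(limits[name])
    }


# Hypothetical task-level request, for illustration only.
requested = {"cpu": "500m", "memory": "2Gi"}
for name, (asked, cap) in over_limit(requested, PLATFORM_LIMITS).items():
    print(f"{name}: requested {asked} exceeds platform limit {cap}")
```

Run against the task definitions before registering, a check like this would surface the over-limit request explicitly instead of leaving it to be discovered on the pod.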