# flyte-support
c
Hello. We're seeing an error when a dynamic task fans out to a high number of tasks.
```
Workflow[...] failed. RuntimeExecutionError: max number of system retry attempts [31/30] exhausted. Last known status message: failed at Node[n0]. EventRecordingFailed: failed to record node event, caused by: EventSinkError: Error sending event, caused by [rpc error: code = Unknown desc = unexpected HTTP status code received from server: 413 (Request Entity Too Large); transport: received unexpected content-type "text/html"]
```
Would tweaking the configuration linked here help solve this? https://docs.flyte.org/en/latest/deployment/configuration/performance.html#offloading-static-workflow-information-from-crd
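Concretely, I think the knob that page describes is something like the following in the flyteadmin config, though I may have the exact key wrong, so treat this as my reading of the docs rather than a verified snippet:
```yaml
# Sketch of the FlyteAdmin setting described on that docs page (key name from memory; verify against the docs).
# It stores the compiled workflow closure in blob storage instead of inlining it into the FlyteWorkflow CRD.
flyteadmin:
  useOffloadedWorkflowClosure: true
```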
c
Unfortunately, no. The NodeExecutionEvent (https://github.com/flyteorg/flyte/blob/master/flyteidl/protos/flyteidl/event/event.proto#L42-L133) is the proto message that's getting too big, because of the dynamic workflow closure it carries (https://github.com/flyteorg/flyte/blob/master/flyteidl/protos/flyteidl/event/event.proto#L151C5-L151C32). We did some work recently to offload literals, and it's on our internal roadmap to explore offloading the dynamic closure as well, but unfortunately that's not a priority at the moment (so I can't promise a solid date yet).
h
@clean-glass-36808, I stand corrected: this feature ended up becoming a priority and is being implemented in https://github.com/flyteorg/flyte/pull/6234 (which should go out in 1.15, due to be released in a few days).
This might not be enough for your use case, though. (By the way, can you give more details about this dynamic task? Its overall structure, number of tasks, etc.)
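To make sure we're picturing the same shape of workload, here's a hypothetical sketch of the pattern I have in mind: a single @dynamic node that compiles one sub-node per item (all names and sizes here are made up, not taken from your workflow):
```python
from typing import List

from flytekit import dynamic, task, workflow


@task
def process(i: int) -> int:
    # stand-in for whatever per-item work the real task does
    return i * 2


@dynamic
def fan_out(n: int) -> List[int]:
    # each iteration adds a node to the dynamically compiled sub-workflow,
    # so the closure reported back in the NodeExecutionEvent grows with n
    return [process(i=i) for i in range(n)]


@workflow
def wf(n: int = 5000) -> List[int]:
    return fan_out(n=n)
```
If yours looks different (nested dynamics, map tasks, large per-node inputs), that detail would help.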
c
Just circling back to report that I actually think this is our nginx ingress being upset. I don't think FlyteAdmin is even seeing the request, so we'll need to tweak our ingress settings.
h
Interesting. Yeah, this makes sense. Let us know, ok?
c
We run the data plane and control plane in different clusters, so the traffic traverses the ingress LB. Yeah, will do.
It was indeed an ingress issue. Resolved with ingress annotation
nginx.ingress.kubernetes.io/proxy-body-size: "100m"
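For anyone who finds this later: the annotation goes on the ingress that fronts FlyteAdmin's gRPC endpoint in our control-plane cluster, roughly like this (hosts, names, and ports are placeholders for our setup, not a reference config):
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: flyteadmin-grpc          # placeholder name
  annotations:
    # nginx's default client body limit is 1m; large node events (dynamic closures) blow past it with a 413
    nginx.ingress.kubernetes.io/proxy-body-size: "100m"
spec:
  ingressClassName: nginx
  rules:
    - host: flyte.example.com    # placeholder host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: flyteadmin
                port:
                  number: 81     # gRPC port of the flyteadmin service in our chart; adjust for yours
```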
sorry for the churn
h
No worries. Thanks for following up.