Hey team we're running map tasks and seeing a lot ...
# flyte-support
d
Hey team, we're running map tasks and seeing a lot of propeller pods crashlooping due to panics from some variation of `concurrent map writes`. We are convinced that this is happening during map task processing, and specifically involves the pod spec of the map task. We're running `v1.13.3`. Stacktraces in the thread.
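For context on this class of crash: the Go runtime aborts the process with `fatal error: concurrent map writes` when it detects two goroutines writing to the same map without synchronization, which is consistent with a pod that keeps restarting. The sketch below is generic Go, not propeller code. The `safeLabels` type, its methods, and the `subtask-N` keys are hypothetical names used only to illustrate the usual remedy of guarding a shared map with a mutex. Where the unsynchronized write actually happens in propeller is not established in this thread.

```go
package main

import (
	"fmt"
	"sync"
)

// safeLabels is a hypothetical stand-in for any map shared between goroutines
// (for example, labels on a shared pod template). It is not a Flyte type.
type safeLabels struct {
	mu sync.Mutex
	m  map[string]string
}

func newSafeLabels() *safeLabels {
	return &safeLabels{m: make(map[string]string)}
}

// Set takes the lock before writing, so two goroutines never write at once.
// Writing to s.m directly from several goroutines instead could make the
// runtime abort with "fatal error: concurrent map writes".
func (s *safeLabels) Set(key, value string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[key] = value
}

// Len reads the map size under the same lock.
func (s *safeLabels) Len() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.m)
}

// writeConcurrently starts n goroutines that each write one distinct key,
// then waits for all of them to finish.
func writeConcurrently(s *safeLabels, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			s.Set(fmt.Sprintf("subtask-%d", i), "running")
		}(i)
	}
	wg.Wait()
}

func main() {
	labels := newSafeLabels()
	writeConcurrently(labels, 100)
	fmt.Println(labels.Len())
}
```

An alternative with the same effect is to give each goroutine its own deep copy of the shared structure before mutating it, which avoids the lock at the cost of extra allocations.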
f
I think this was fixed in a later version?
h
Checking
d
great, any version we should update to? I found this https://github.com/flyteorg/flyte/issues/6038 but it was related to flyteadmin, not propeller
any update here? @high-park-82026
just to add on here, we are seeing issues that may be related to the panics, as they are definitely happening on the map tasks as well. these workflows fail with:
1. `RuntimeExecutionError: max number of system retry attempts [51/50] exhausted. Last known status message: AlreadyExists: Event already exists, caused by [rpc error: code = AlreadyExists desc = have already recorded task execution phase RUNNING (version: 44) for task_id`
2. `RuntimeExecutionError: max number of system retry attempts [61/60] exhausted. Last known status message: [system] unable to read futures file, maybe corrupted, caused by: [system] Failed to read futures protobuf file`
cc: @many-salesclerk-79629