# ask-the-community
h
Hello 👋 We noticed that a map task sometimes gets stuck at the initialising stage for a very long time (on its first run or on a rerun). For normal tasks this usually happens while the pod is in the ContainerCreating or pod initialisation state, but whenever it happens for a map_task we don't see any pods in the ContainerCreating state. I am aware that it usually takes a few minutes to get a pod running, but in this case the task stays in this state for more than 6 or 7 hours. Does anyone know why this can happen with map_tasks?
Screenshot 2024-04-09 at 17.13.20.png
s
are you using the array node map task?
h
No, we are using the regular map_task from flytekit
Structure looks like this:
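import os
from typing import List

from flytekit import dynamic, map_task

# TF_IMAGE, const, MapInput and a_task are defined elsewhere in the project.


@dynamic(
    # Caching is disabled only when ENVIRONMENT is "test".
    cache=False if os.environ.get("ENVIRONMENT") == "test" else True,
    cache_serialize=False if os.environ.get("ENVIRONMENT") == "test" else True,
    cache_version="1.0",
    container_image=TF_IMAGE,
)
def dynamic_map_a_wrapper_task(input_path: List[MapInput]) -> List[str]:
    # Map a_task over every element of input_path.
    return map_task(
        task_function=a_task,
        concurrency=const.N_WORKERS,
        min_success_ratio=const.MIN_SUCCESS_RATIO,
    )(input=input_path)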
The delays that we see are mostly caused by the task being stuck in the initialising state.
We run this task for 7 different inputs, all triggered at the same time, and each input runs 3 sets of map tasks (2 of them with 800 partitions and 1 with 1600 partitions to process). N_WORKERS is set to 20 here. Do you think this could leave the tasks waiting in a queue because there are too few workers to handle them, and hence stuck in the initialising state?
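To put rough numbers on the load described above, here is a back-of-the-envelope sketch in Python. It assumes (this is not confirmed anywhere in the thread) that the concurrency limit applies to each map task independently and that all 21 map tasks run at the same time. The variable and function names are illustrative only.

```python
import math

# Figures quoted in the message above.
N_INPUTS = 7                            # inputs triggered at the same time
PARTITIONS_PER_INPUT = [800, 800, 1600]  # three map tasks per input
N_WORKERS = 20                          # value passed as map_task concurrency


def waves(partitions: int, concurrency: int) -> int:
    """Number of sequential batches needed if at most `concurrency`
    subtasks of one map task run at a time (assumption, see above)."""
    return math.ceil(partitions / concurrency)


# Total subtasks across all inputs.
total_subtasks = N_INPUTS * sum(PARTITIONS_PER_INPUT)

# Upper bound on subtasks running at once, if every map task is active
# and each one is capped independently at N_WORKERS.
max_concurrent = N_INPUTS * len(PARTITIONS_PER_INPUT) * N_WORKERS

waves_per_map_task = [waves(p, N_WORKERS) for p in PARTITIONS_PER_INPUT]

print(total_subtasks, max_concurrent, waves_per_map_task)
```

Under these assumptions each 1600-partition map task needs twice as many batches as an 800-partition one, so any per-subtask startup delay is multiplied accordingly. This only estimates queueing; it does not explain why no pods appear at all.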
s
could you give array node map task a try? we're making it the default in our next release (check our beta release), and that's the one we'll be heavily investing in. if you're still seeing this issue with array node map task, we can investigate further.
h
We haven’t really explored this one. Will give this a try and get back. Thank you 🙂
y
can you look to see what’s happening on the backend? what are the k8s pod statuses?
h
We couldn't see any pods spawning in the cluster when this happens, and there are no logs. For regular tasks we usually see that pods are spawned and are mostly in the ContainerCreating state, but not in this case.
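For anyone wanting to check pod statuses the same way, a minimal Python sketch follows. It assumes the input is the parsed JSON from `kubectl get pods -n <namespace> -o json`, where the namespace is whatever your Flyte project and domain map to. The helper name `summarize_pod_states` and the sample data are hypothetical, not part of flytekit or kubectl.

```python
from collections import Counter


def summarize_pod_states(pod_list: dict) -> Counter:
    """Count pods by state, preferring a container's waiting reason
    (for example ContainerCreating) over the coarser pod phase."""
    states = Counter()
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        label = status.get("phase", "Unknown")
        containers = status.get("initContainerStatuses", []) + status.get(
            "containerStatuses", []
        )
        for container in containers:
            waiting = container.get("state", {}).get("waiting")
            if waiting and waiting.get("reason"):
                label = waiting["reason"]
                break
        states[label] += 1
    return states


# Hypothetical sample shaped like `kubectl get pods -o json` output.
sample = {
    "items": [
        {
            "status": {
                "phase": "Pending",
                "containerStatuses": [
                    {"state": {"waiting": {"reason": "ContainerCreating"}}}
                ],
            }
        },
        {"status": {"phase": "Running", "containerStatuses": [{"state": {"running": {}}}]}},
    ]
}

print(summarize_pod_states(sample))
print(summarize_pod_states({"items": []}))
```

An empty result, as in the second call, matches the situation reported here: no pods were created at all, which suggests the delay sits before pod creation rather than in image pulling or container startup.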