#ask-ai

Olivier Sevin

02/29/2024, 7:31 PM
@Glime does being unable to create a pod because of resource quotas count towards system retry attempts?

Glime

02/29/2024, 7:32 PM
Yes, not being able to create a pod because of resource quotas does count towards system retry attempts. In the context provided, the task is in a `WAITING_FOR_RESOURCES` state because it exceeded the resource quota, specifically the memory limits set in the project quota. This situation is identified as a `BackOffError` caused by exceeded quota, which is considered a system failure.

According to the Flyte documentation and discussions, system failures, including issues like exceeded resource quotas, trigger system retries. System retries are designed to handle transient errors or system-level issues that are not directly caused by the user's task logic. The mechanism is part of Flyte's resilience features, giving tasks the opportunity to run successfully despite temporary system constraints or failures.

The number of system retries and the behavior during them are configurable aspects of Flyte's execution environment. This includes running on spot instances for cost efficiency and moving to a non-interruptible machine for the last retry to ensure completion. In the provided context, the exceeded resource quota falls under the category of system failures that Flyte attempts to mitigate through system retries, so that the task can execute once system conditions allow it.
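
The retry behavior described above can be illustrated with a small, self-contained sketch. This is a simplified model rather than Flyte's implementation: `RetryPolicy`, `SystemFailure`, `plan_attempts`, and `run_with_system_retries` are hypothetical names, and the rule "use spot for every attempt except the last" is an assumption drawn from the description above.

```python
from dataclasses import dataclass
from typing import Callable, List


class SystemFailure(Exception):
    """Hypothetical stand-in for a system-level error such as an exceeded quota."""


@dataclass
class RetryPolicy:
    # Hypothetical field; not an actual Flyte configuration key.
    max_system_retries: int = 3


def plan_attempts(policy: RetryPolicy) -> List[str]:
    """Return the machine type for each attempt (first try plus system retries).

    Assumption: every attempt runs on a spot instance except the last one,
    which runs on a non-interruptible machine.
    """
    total_attempts = policy.max_system_retries + 1
    plan = []
    for attempt in range(total_attempts):
        is_last = attempt == total_attempts - 1
        plan.append("non-interruptible" if is_last else "spot")
    return plan


def run_with_system_retries(
    policy: RetryPolicy, attempt_fn: Callable[[int, str], str]
) -> str:
    """Call attempt_fn until it succeeds or the system retry budget is spent."""
    last_error = None
    for attempt, machine in enumerate(plan_attempts(policy)):
        try:
            return attempt_fn(attempt, machine)
        except SystemFailure as err:
            # A system failure consumes one attempt from the budget.
            last_error = err
    raise last_error


def flaky_task(attempt: int, machine: str) -> str:
    """Example task that only succeeds on a non-interruptible machine."""
    if machine == "spot":
        raise SystemFailure(f"attempt {attempt} failed on spot")
    return f"succeeded on attempt {attempt} ({machine})"
```

In this model, `run_with_system_retries(RetryPolicy(max_system_retries=3), flaky_task)` fails three times on spot and succeeds on the fourth, non-interruptible attempt. How Flyte itself counts quota-related failures against the budget depends on its configuration and version.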