Laura Lin
12/06/2022, 12:14 AM
Eduardo Apolinario (eapolinario)
12/06/2022, 12:44 AM
Laura Lin
12/06/2022, 1:12 AM
flyte-core-v1.2.0
Jay Ganbat
12/06/2022, 1:32 AM
Dan Rammer (hamersaw)
12/06/2022, 2:12 AM
Laura Lin
12/06/2022, 2:13 AM
Dan Rammer (hamersaw)
12/06/2022, 2:16 AM
Laura Lin
12/06/2022, 2:17 AM
Dan Rammer (hamersaw)
12/06/2022, 2:18 AM
Laura Lin
12/06/2022, 2:23 AM
(combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "29dc9ef6819b16e613927333fbc2a069b819c5346d69628d580e817e9b1cf8d0": plugin type="aws-cni" name="aws-cni" failed (add): add cmd: failed to assign an IP address to container
It happens sometimes when Karpenter spins up new nodes; I haven't quite figured it out yet either.
So the pod gets stuck in ContainerCreating.
Dan Rammer (hamersaw)
12/06/2022, 2:29 AM
Laura Lin
12/06/2022, 2:46 AM
code:"UnexpectedObjectDeletion" message:"object
or something similar. I don't remember exactly, but it was some kind of deleted-object error.
Dan Rammer (hamersaw)
12/06/2022, 2:47 AM
SYSTEM::object [flytesnacks-development/f0632d6d37de845a7937-n1-3] terminated in the background, manually
Does that look familiar?
Laura Lin
12/06/2022, 3:47 AM
Dan Rammer (hamersaw)
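12/06/2022, 3:37 PM
from typing import List
from flytekit import map_task, task, workflow

@task(retries=3)
def foo(a: int) -> int:  # signature assumed; the original omitted it
    ...  # omitted

@workflow
def bar(a: List[int]):  # signature assumed
    mapped_out = map_task(foo)(a=a).with_overrides(retries=3)
    # omitted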
Eduardo Apolinario (eapolinario)
12/06/2022, 6:35 PM
Laura Lin
12/06/2022, 6:52 PM
Recoverable vs. Non-Recoverable failures: Recoverable failures will be retried and counted against the task's retry count. Non-recoverable failures will just fail, i.e., the task isn't retried irrespective of user/system retry configurations. All user exceptions are considered non-recoverable unless the exception is a subclass of FlyteRecoverableException.
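To make the quoted rule concrete, here is a toy model of it in plain Python. It is only a sketch of the behaviour the quote describes, not Flyte's implementation: `RecoverableError` is a hypothetical stand-in for `FlyteRecoverableException`, and `run_with_retries`, `flaky`, and `broken` are made-up names for illustration.

```python
class RecoverableError(Exception):
    """Hypothetical stand-in for flytekit's FlyteRecoverableException."""


def run_with_retries(fn, retries):
    """Call fn up to 1 + retries times; only RecoverableError triggers a retry."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return fn(), attempts
        except RecoverableError:
            if attempts > retries:
                raise  # retry budget exhausted, surface the failure
        # Any other exception is not caught here, so it propagates on the
        # first attempt: that is the "non-recoverable" case.


calls = {"flaky": 0, "broken": 0}


def flaky():
    # Fails twice with a recoverable error, then succeeds.
    calls["flaky"] += 1
    if calls["flaky"] < 3:
        raise RecoverableError("transient failure")
    return "ok"


def broken():
    # Raises an ordinary exception, which counts as non-recoverable.
    calls["broken"] += 1
    raise ValueError("bad input")


result, attempts = run_with_retries(flaky, retries=3)
print(result, attempts)  # ok 3

try:
    run_with_retries(broken, retries=3)
except ValueError:
    print(calls["broken"])  # 1
```

In this model `flaky` succeeds on its third attempt, while `broken` runs exactly once even though `retries=3`, which matches the quoted statement that ordinary user exceptions are not retried regardless of the retry configuration.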
Dan Rammer (hamersaw)
12/06/2022, 6:53 PM
David Espejo (he/him)
01/19/2023, 4:28 PM
Laura Lin
01/19/2023, 4:31 PM