Faisal Anees
06/05/2023, 7:10 AM
get_data task seemed stuck in Running. I noticed that the node group had nodes with 2 CPUs each, so I updated the nodes to run on 4 CPUs each. I think this got me out of the queued state, but then the task started failing, which leads to the next issue.
2. Task failing with "died with <Signals.SIGKILL: 9>" error: This was the log for the failed task. I searched some Slack threads and someone mentioned that this error might be caused by OOM, but I'm not sure that's the case here, as each node had 16GB of memory. Isn't that sufficient?
[1/1] currentAttempt done. Last Error: USER:: │
│ ❱ 760 │ │ │ │ return __callback(*args, **kwargs) │
│ │
│ /usr/local/lib/python3.10/site-packages/flytekit/bin/entrypoint.py:508 in │
│ fast_execute_task_cmd │
│ │
│ ❱ 508 │ subprocess.run(cmd, check=True) │
│ │
│ /usr/local/lib/python3.10/subprocess.py:526 in run │
│ │
│ ❱ 526 │ │ │ raise CalledProcessError(retcode, process.args, │
╰──────────────────────────────────────────────────────────────────────────────╯
CalledProcessError: Command '['pyflyte-execute', '--inputs',
's3://flyte-cluster-bucket-2023/metadata/propeller/flytesnacks-development-ff067
d646b0684b76a94/n0/data/inputs.pb', '--output-prefix',
's3://flyte-cluster-bucket-2023/metadata/propeller/flytesnacks-development-ff067
d646b0684b76a94/n0/data/0', '--raw-output-data-prefix',
's3://flyte-cluster-bucket-2023/data/2b/ff067d646b0684b76a94-n0-0',
'--checkpoint-path',
's3://flyte-cluster-bucket-2023/data/2b/ff067d646b0684b76a94-n0-0/_flytecheckpoi
nts', '--prev-checkpoint', '""', '--dynamic-addl-distro',
's3://flyte-cluster-bucket-2023/flytesnacks/development/4MOWXYYMZXUPWCJJKGSQ6EOI
24======/script_mode.tar.gz', '--dynamic-dest-dir', '/root', '--resolver',
'flytekit.core.python_auto_container.default_task_resolver', '--',
'task-module', 'example', 'task-name', 'get_data']' died with <Signals.SIGKILL:
9>.
Can someone please help me out?
David Espejo (he/him)
06/05/2023, 12:35 PM
kubectl get po -n flytesnacks-development
and then
kubectl describe po <your-task-pod-name> -n flytesnacks-development
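To confirm whether the SIGKILL came from the kernel OOM killer, the pod status can also be inspected programmatically. The sketch below assumes the pod was dumped with `kubectl get po <your-task-pod-name> -n flytesnacks-development -o json` and loaded into a dict; the helper name `oom_killed_containers`, the container name `primary`, and the sample document are hypothetical and only illustrate the shape of the `status.containerStatuses` field.

```python
from __future__ import annotations

from typing import Any


def oom_killed_containers(pod: dict[str, Any]) -> list[str]:
    """Return names of containers whose current or previous termination reason is OOMKilled."""
    names = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        # A finished container reports under "state"; a restarted one under "lastState".
        for key in ("state", "lastState"):
            terminated = (status.get(key) or {}).get("terminated")
            if terminated and terminated.get("reason") == "OOMKilled":
                names.append(status["name"])
                break
    return names


# Hypothetical, trimmed pod document for illustration only (not taken from the thread).
sample_pod = {
    "status": {
        "containerStatuses": [
            {
                "name": "primary",
                "state": {"terminated": {"reason": "OOMKilled", "exitCode": 137}},
                "lastState": {},
            }
        ]
    }
}

print(oom_killed_containers(sample_pod))
```

An exit code of 137 (128 + 9) alongside the `OOMKilled` reason would match the `died with <Signals.SIGKILL: 9>` message in the task log.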
Samhita Alla
Faisal Anees
06/05/2023, 3:41 PM
@task(requests=Resources(cpu="1", mem="500Mi"), limits=Resources(cpu="2", mem="800Mi"))
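A note on the 16GB question: a container is held to its own memory limit, not to the memory of the node it runs on. If the task was running with the limits in the decorator above, the relevant ceiling would be 800Mi. The sketch below converts Kubernetes-style memory quantities to bytes to make the gap concrete. The helper `mem_to_bytes` is hypothetical, handles only the binary suffixes shown, and treats the "16GB" from the thread as decimal gigabytes (an assumption).

```python
# Binary suffixes used in Kubernetes memory quantities (subset, for illustration).
_BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def mem_to_bytes(quantity: str) -> int:
    """Convert a quantity such as '800Mi' to bytes; a bare number is taken as bytes."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * factor)
    return int(quantity)


limit_bytes = mem_to_bytes("800Mi")   # limit from the decorator above
request_bytes = mem_to_bytes("500Mi")  # request from the decorator above
node_bytes = 16 * 10**9                # "16GB" per node, assumed decimal

print(f"limit:   {limit_bytes} bytes")
print(f"request: {request_bytes} bytes")
print(f"limit is {limit_bytes / node_bytes:.1%} of one node's memory")
```

If the pod status does report an OOM kill, raising `mem` in `limits` (and usually `requests`) is the usual first thing to try.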