strong-plumber-41198
12/11/2023, 5:14 PM
OOMKilled. I’ve tried increasing the flytepropeller.resources.limits.memory and flytepropeller.resources.requests.memory values, but this didn’t seem to have any effect
average-finland-92144
12/11/2023, 9:21 PM
strong-plumber-41198
12/12/2023, 10:36 AM
10Gi and it still gets OOMKilled at the same point every time, here is the full error log from the UI:
[1/1] currentAttempt done. Last Error: USER::Pod failed. No message received from kubernetes.
[fc5d90298b68749979ad-n0-0] terminated with exit code (137). Reason [OOMKilled]. Message:
1880 [================>.............] - ETA: 0s
17801216/26421880 [===================>..........] - ETA: 0s
20733952/26421880 [======================>.......] - ETA: 0s
22904832/26421880 [=========================>....] - ETA: 0s
25673728/26421880 [============================>.] - ETA: 0s
26421880/26421880 [==============================] - 1s 0us/step
Downloading data from <https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz>
5148/5148 [==============================] - 0s 0us/step
Downloading data from <https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz>
8192/4422102 [..............................] - ETA: 0s
49152/4422102 [..............................] - ETA: 7s
81920/4422102 [..............................] - ETA: 9s
327680/4422102 [=>............................] - ETA: 2s
573440/4422102 [==>...........................] - ETA: 1s
851968/4422102 [====>.........................] - ETA: 1s
1474560/4422102 [=========>....................] - ETA: 0s
2121728/4422102 [=============>................] - ETA: 0s
3375104/4422102 [=====================>........] - ETA: 0s
4422102/4422102 [==============================] - 1s 0us/step
.
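For readers following the log above: exit code 137 is the conventional encoding of 128 plus a signal number, and signal 9 is SIGKILL, which is what the kernel sends when a container exceeds its memory limit. A minimal sketch of that arithmetic, assuming a POSIX system; the helper name `describe_exit_code` is hypothetical and not part of Flyte or Kubernetes.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain a container exit code using the 128 + signal-number convention."""
    if code > 128:
        try:
            # Codes above 128 conventionally mean "terminated by signal (code - 128)".
            sig = signal.Signals(code - 128)
        except ValueError:
            return f"exited with status {code} (no matching signal on this platform)"
        return f"killed by {sig.name} (signal {sig.value})"
    return f"exited with status {code}"


if __name__ == "__main__":
    print(describe_exit_code(137))
```

The exit code alone only says the process was killed with SIGKILL; it is the `Reason [OOMKilled]` field in the log that attributes the kill to the memory limit.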
average-finland-92144
12/12/2023, 1:03 PM
strong-plumber-41198
12/12/2023, 1:34 PM
kubectl top node, currently there is a node consuming 24% memory
proud-answer-87162
12/12/2023, 4:03 PM
average-finland-92144
12/12/2023, 4:44 PM
kubectl get pods -n flytesnacks-development
then
kubectl get pod <execution-id-Pod> -n flytesnacks-development -o yaml
and find the spec.resources block
strong-plumber-41198
12/12/2023, 4:47 PM
spec.resources block:
resources:
  limits:
    cpu: "1"
    memory: 1Gi
  requests:
    cpu: "1"
    memory: 1Gi
average-finland-92144
12/12/2023, 8:51 PM
@task(requests=Resources(
    cpu="1",
    mem="4Gi"))
strong-plumber-41198
12/13/2023, 9:32 AM
RPC Failed, with Status: StatusCode.INVALID_ARGUMENT
details: Requested MEMORY default [4Gi] is greater than current limit set in the platform configuration [1Gi]. Please contact Flyte Admins to change these limits or consult the configuration
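The error above boils down to comparing two Kubernetes-style memory quantities: the task's request (4Gi) against the platform limit (1Gi). A rough sketch of that comparison follows. It is not Flyte's actual validation code, it only handles the common binary and decimal suffixes, and the names `parse_memory` and `exceeds_limit` are hypothetical.

```python
from decimal import Decimal

# Suffix multipliers for Kubernetes-style memory quantities (subset only).
_FACTORS = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,  # binary suffixes
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,      # decimal suffixes
}


def parse_memory(quantity: str) -> int:
    """Convert a quantity such as '4Gi' or '512Mi' to a number of bytes."""
    for suffix, factor in _FACTORS.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[: -len(suffix)]) * factor)
    # No recognised suffix: treat the value as plain bytes.
    return int(Decimal(quantity))


def exceeds_limit(requested: str, limit: str) -> bool:
    """Return True when the requested memory is larger than the limit."""
    return parse_memory(requested) > parse_memory(limit)


if __name__ == "__main__":
    # Values taken from the error message above.
    print(exceeds_limit("4Gi", "1Gi"))
```

With the values from the error message the check is true, which is why the request is rejected until the platform-side limit is raised or the task asks for no more than 1Gi.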
strong-plumber-41198
12/13/2023, 9:33 AM
values.yaml with you if that will help troubleshoot this?