# ask-the-community
k
Hi team, when running a workflow that requires data to be pulled from multiple tables, I'm getting this OOM error. What can be done to avoid it?
[1/1] currentAttempt done. Last Error: USER::Pod failed. No message received from kubernetes.
[a5dqkgsjw2gf59jzhpgn-n1-0] terminated with exit code (1). Reason [OOMKilled]. Message: 
tar: Removing leading `/' from member names

Traceback (most recent call last):
  File "/opt/venv/bin/pyflyte-fast-execute", line 8, in <module>
    sys.exit(fast_execute_task_cmd())
  File "/opt/venv/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/opt/venv/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/opt/venv/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/venv/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/opt/venv/lib/python3.8/site-packages/flytekit/bin/entrypoint.py", line 507, in fast_execute_task_cmd
    subprocess.run(cmd, check=True)
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['pyflyte-execute', '--inputs', '<s3://my-s3-bucket/metadata/propeller/flytesnacks-staging-a5dqkgsjw2gf59jzhpgn/n1/data/inputs.pb>', '--output-prefix', '<s3://my-s3-bucket/metadata/propeller/flytesnacks-staging-a5dqkgsjw2gf59jzhpgn/n1/data/0>', '--raw-output-data-prefix', '<s3://my-s3-bucket/test/lh/a5dqkgsjw2gf59jzhpgn-n1-0>', '--checkpoint-path', '<s3://my-s3-bucket/test/lh/a5dqkgsjw2gf59jzhpgn-n1-0/_flytecheckpoints>', '--prev-checkpoint', '""', '--dynamic-addl-distro', '<s3://my-s3-bucket/gp/flytesnacks/staging/NTHIUJ632K3UNCS775EH722XVM======/fast31220425e74d24c7973f254bb8ecf02f.tar.gz>', '--dynamic-dest-dir', '/root', '--resolver', 'flytekit.core.python_auto_container.default_task_resolver', '--', 'task-module', 'get_io_f7', 'task-name', 'get_idv']' died with <Signals.SIGKILL: 9>.
s
Hi @KS Tarun, the OOM (Out Of Memory) error is thrown because the pod does not have enough memory to run the execution. In your code, in the task decorator, can you increase the memory specification (the mem argument) to a higher number? Say 600Mi or something?
@task(…, limits=Resources(mem="600Mi"))
k
What is the maximum possible limit?
@Smriti Satyan The reason I'm asking is that I've already tried increasing it to 1000Mi and I still get the same error. PS: Some tables have more than 750 columns, and there are more than 10 such tables. What memory spec should the workflow have to run this process smoothly?
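One rough way to approach the sizing question is a back-of-the-envelope estimate of how much memory the tables occupy once loaded. The sketch below is only that: the row count is hypothetical (the thread does not give one), and 8 bytes per cell assumes plain 64-bit numeric columns; string columns and intermediate copies usually cost more.

```python
def estimate_table_bytes(rows: int, columns: int, bytes_per_cell: int = 8) -> int:
    """Lower-bound estimate of a table's in-memory size in bytes."""
    return rows * columns * bytes_per_cell


def to_mib(num_bytes: int) -> float:
    """Convert bytes to MiB, the unit behind the 'Mi' suffix."""
    return num_bytes / (1024 ** 2)


ROWS_PER_TABLE = 100_000  # hypothetical, not stated in the thread
COLUMNS = 750             # "more than 750 columns"
TABLES = 10               # "more than 10 such tables"

per_table = estimate_table_bytes(ROWS_PER_TABLE, COLUMNS)
total = per_table * TABLES

print(f"one table:  {to_mib(per_table):.0f} MiB")
print(f"all tables: {to_mib(total):.0f} MiB")
```

Under these assumptions a single table already comes close to 600Mi, so holding all of them in one task at the same time would need several Gi. Loading fewer tables per task, or only the columns that are needed, lowers the estimate accordingly.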
s
That's interesting. @Kevin Su, what would the max memory limit be? I'm unsure of the number.
Max memory in a demo cluster is set to 1Gi
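To see how the values tried in this thread compare with that 1Gi ceiling, here is a minimal sketch that converts Kubernetes-style memory strings to bytes. `parse_mem` is a hypothetical helper, not a flytekit function, and it only handles whole numbers with the binary suffixes Ki, Mi, Gi and Ti.

```python
# Binary suffixes used in Kubernetes memory quantities.
_UNITS = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3, "Ti": 1024 ** 4}


def parse_mem(quantity: str) -> int:
    """Convert a string such as '600Mi' or '1Gi' to a number of bytes."""
    for suffix, factor in _UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: the value is already in bytes


CLUSTER_MAX = "1Gi"  # demo cluster maximum mentioned above

for requested in ("600Mi", "1000Mi", "2Gi"):
    fits = parse_mem(requested) <= parse_mem(CLUSTER_MAX)
    print(f"{requested}: {'within' if fits else 'above'} the {CLUSTER_MAX} maximum")
```

1000Mi is only about 2% below 1Gi (1024Mi), so on a demo cluster there is presumably almost no room left to grow without raising the cluster-side maximum or reducing how much data the task holds in memory.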