Mike Ossareh
05/10/2023, 2:45 PM
Tasks that run with requests.memory=2Gi, limits.memory=2Gi under flytekit 1.4.2 fail under 1.5.0. We bumped these tasks to requests.memory=64Gi, limits.memory=64Gi and they succeed under 1.5.0.
Here are two graphs that illustrate the difference in RAM usage. The k8s request differences are listed on the graphs. Everything else (inputs, etc.) is the same; the only difference is flytekit 1.4.2 vs 1.5.0. What changed?
Ketan (kumare3)
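(For reference on how those pod settings map onto a task definition, a minimal sketch; it mirrors the Resources usage that appears in the repro code later in the thread, and the task itself is hypothetical, not the actual failing task:)
from flytekit import task, Resources

@task(requests=Resources(mem="2Gi"), limits=Resources(mem="2Gi"))
def my_task() -> str:  # hypothetical task; the real failing tasks are not shown in the thread
    return "hello"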
Mike Ossareh
05/10/2023, 3:04 PM
Ketan (kumare3)
Mike Ossareh
05/10/2023, 3:06 PM
Thomas Blom
05/10/2023, 3:07 PM
The downstream task would download the data to the compute node, but we aren't even getting to the downstream task. The setting of the folder on the FlyteDirectory appears to be causing something, presumably the large amount of data, to get loaded into memory.
Ketan (kumare3)
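(As an aside, a minimal sketch of the pattern being described above, assuming a task that stages data into a local folder and returns it as a FlyteDirectory; the path and task name are illustrative. flytekit uploads the folder contents to blob storage when the task returns, which is where the memory growth was being observed:)
from flytekit import task
from flytekit.types.directory import FlyteDirectory

@task
def stage_data() -> FlyteDirectory:
    local_dir = "/tmp/staged"  # illustrative path
    # ... write a large amount of data under local_dir ...
    return FlyteDirectory(path=local_dir)  # upload of local_dir happens after return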
Thomas Blom
05/10/2023, 3:09 PM
Yee
Thomas Blom
05/10/2023, 9:38 PM
>>> import fsspec
>>> fsspec.__version__
'2023.5.0'
I'm not sure how to get the version of s3fs, and I don't know if we are using flytekit-data-fsspec - I don't see this in our dependencies.
Mike Ossareh
05/10/2023, 9:39 PM
root@078cd129a8a3:/app# pip list | grep -E '(fsspec|flytekit|s3fs)'
flytekit 1.4.2
flytekitplugins-pod 1.4.2
fsspec 2023.5.0
Yee
Mike Ossareh
05/10/2023, 9:40 PM
root@078cd129a8a3:/app# pip show s3fs
WARNING: Package(s) not found: s3fs
Yee
which aws? (in the container)
Thomas Blom
05/10/2023, 9:41 PM
root@e28c57ef87a7:/app# pip list | grep -E '(fsspec|flytekit|s3fs)'
flytekit 1.5.0
flytekitplugins-pod 1.5.0
fsspec 2023.5.0
s3fs 2023.5.0
Mike Ossareh
05/10/2023, 9:42 PM
root@078cd129a8a3:/app# which aws
/app/venv/bin/aws
root@078cd129a8a3:/app# aws --version
aws-cli/1.27.132 Python/3.9.16 Linux/5.10.178-162.673.amzn2.x86_64 botocore/1.29.132
Yee
Without the flytekit-data-fsspec plugin, you would default to the aws cli.
Thomas Blom
05/10/2023, 9:45 PM
Yee
import time
import subprocess

from flytekit import task, workflow, Resources
from flytekit.types.directory import FlyteDirectory


@task(requests=Resources(mem="1Gi"), limits=Resources(mem="1Gi"))
def waiter_task(a: int) -> str:
    if a == 0:
        time.sleep(86400)
    else:
        time.sleep(a)
    return "hello world"


@task(requests=Resources(mem="1Gi"), limits=Resources(mem="1Gi"))
def dd_and_upload() -> FlyteDirectory:
    command = ["dd", "if=/dev/random", "of=/root/temp_10GB_file", "bs=1", "count=0", "seek=10G"]
    subprocess.run(command)
    return FlyteDirectory("/root/temp_10GB_file")


@workflow
def waiter(a: int = 0) -> str:
    return waiter_task(a=a)


@workflow
def uploader() -> FlyteDirectory:
    return dd_and_upload()
Thomas Blom
05/10/2023, 10:28 PM
Yee
Thomas Blom
05/10/2023, 10:42 PM
Yee
Thomas Blom
05/10/2023, 10:44 PM
Mike Ossareh
05/11/2023, 12:06 AM
Thomas Blom
05/11/2023, 2:16 AM
Ketan (kumare3)
Thomas Blom
05/11/2023, 2:49 PM
Yee
Thomas Blom
07/11/2023, 5:45 PM
c = Config(folder='', size=1, mem=1, many_files=True)
When folder is blank, it just creates a 'test' folder under current_context().working_directory. The other params mean we'll write/upload 1GB worth of files, the pod will request 1G of memory, and we'll write many smaller files instead of one large one.
Yee
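(A rough sketch of what that test might look like, going only by the description above; the real Config and task code are not shown in the thread, so the names, resource sizes, and file-writing scheme here are guesses:)
import os

from flytekit import current_context, task, Resources
from flytekit.types.directory import FlyteDirectory


@task(requests=Resources(mem="1Gi"), limits=Resources(mem="1Gi"))
def write_and_upload(folder: str = "", size_gb: int = 1, many_files: bool = True) -> FlyteDirectory:
    out_dir = folder or os.path.join(current_context().working_directory, "test")
    os.makedirs(out_dir, exist_ok=True)
    chunk = 1024 * 1024                      # write in 1 MiB chunks
    total_chunks = size_gb * 1024
    n_files = total_chunks if many_files else 1
    per_file = total_chunks // n_files
    for i in range(n_files):
        with open(os.path.join(out_dir, f"part_{i:05d}.bin"), "wb") as f:
            for _ in range(per_file):
                f.write(os.urandom(chunk))
    return FlyteDirectory(path=out_dir)      # flytekit uploads out_dir after the task returns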
import fsspec
target_bucket = "<s3://my-bucket/yt/memtest1>"
container_dir = "/tmp/flyte-ox9aa6ku/sandbox/local_flytekit/e69fd8f684d1e5f02eadd7f427aeb2d8/test"
fs = fsspec.filesystem("s3")
fs.put(container_dir, target_bucket, recursive=True)
Thomas Blom
07/12/2023, 1:40 AM
Yee
Thomas Blom
07/12/2023, 6:08 PM
Yee
Thomas Blom
07/12/2023, 6:10 PM
Yee
Thomas Blom
07/12/2023, 6:13 PM
Yee
Thomas Blom
07/12/2023, 6:16 PM
Yee
import fsspec.config
fsspec.config.conf["gather_batch_size"] = 100
Thomas Blom
07/14/2023, 2:58 PM
Yee
Thomas Blom
10/20/2023, 7:24 PM
Kevin Su
10/20/2023, 7:47 PM
Thomas Blom
10/20/2023, 11:47 PM
import fsspec.config
fsspec.config.conf["gather_batch_size"] = 100 # or whatever, we keep reducing it!
2. Re: "s3fs will try to read all the files in the directory into memory by default" -- I still don't understand (and haven't studied the code) the need to load all files to memory, even in small batch sizes -- this is an odd pattern for a file copy, isn't it? What if you had huge files that exceed memory size? This is what has been so confusing about this issue all along -- the large memory requirement for just copying files.
Ketan (kumare3)
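(On the question above of why a plain copy needs so much memory: this is not the thread's resolution, just a sketch of a bounded-memory alternative that streams each file up one at a time in fixed-size chunks via fsspec's file API; paths are illustrative:)
import os
import shutil

import fsspec

fs = fsspec.filesystem("s3")
src_dir = "/tmp/test"
dst_prefix = "s3://my-bucket/yt/memtest1"

for name in sorted(os.listdir(src_dir)):
    # copy one file at a time, 8 MiB per read, so memory use stays roughly constant
    with open(os.path.join(src_dir, name), "rb") as src, fs.open(f"{dst_prefix}/{name}", "wb") as dst:
        shutil.copyfileobj(src, dst, length=8 * 1024 * 1024)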