# ask-the-community
r
Hello everyone! Is there any way of mounting a data volume in a ContainerTask? I need to provide a directory with an arbitrary number of files, read them, and apply some processing. The way I usually do it is by mounting a volume when running docker run, but now I’m converting it to a Flyte pipeline and have run into some issues. What I’m trying to do is pass it as a FlyteDirectory, but it’s not working as I expected. I’m doing it like this:
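from flytekit import ContainerTask, kwtypes
from flytekit.types.directory import FlyteDirectory
from flytekit.types.file import FlyteFile

ContainerTask(
    name="my_container",
    image="localhost:30000/my_container:latest",
    input_data_dir="/input",
    output_data_dir="/output",
    inputs=kwtypes(
        config_file=FlyteFile,
        my_dir=FlyteDirectory,
    ),
    outputs=kwtypes(out=FlyteDirectory),
    # Each argument is its own list element; a single string would be
    # treated as the name of one executable.
    command=[
        "python",
        "app.py",
        "--input_dir=/input/my_dir",
        "--config=/input/config_file",
    ],
)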
I was expecting the /input directory to contain a dir called my_dir and a file called config_file. However, it seems to be empty. Can someone help me figure out how to mount data? Thanks 🙌
p
Hey friend! You might want to take a look at this little gist I wrote after running into similar issues: https://gist.github.com/pryce-turner/0a67f86febdc812c9a2a9e739c22eeca. That being said, I'm afraid your specific use case is a known issue: https://github.com/flyteorg/flyte/issues/3632. Flyte CoPilot can't copy multi-part blobs to arbitrary containers (at this time). May I ask why you're using a ContainerTask here? It looks like it's just Python code that you're running. I would recommend switching this to a regular @task python task and using a custom container, like so:
from flytekit import task

@task(container_image="my_container:latest")
def my_task():
    pass
r
Hey @Pryce, thanks for the answer! I was using ContainerTask because I was cloning a git repo and using the python version and requirements specific to that repo, but now that you mention it I think I can make it work by using a different image for a python task. I’ll give it a try, thank you for the suggestion. It would still be great to be able to mount a FlyteDirectory to a ContainerTask for non-python applications though. I hope that feature is included in a later release.
p
Certainly! Another workaround I explored was using a convenience task to tar the dir in question and then passing that archive to ContainerTask as a FlyteFile. It's more overhead but will get you there if you absolutely must use an arbitrary container. Lmk if you need any more help!
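For reference, the tar step could look roughly like the sketch below. It uses only the Python standard library; the helper name tar_directory and the .tar.gz format are assumptions for illustration, not something from the gist or from Flyte itself.

```python
import tarfile
from pathlib import Path


def tar_directory(src_dir: str, archive_path: str) -> str:
    """Pack src_dir into a gzip-compressed tar archive at archive_path.

    Hypothetical helper: entries are stored under the directory's own name
    (e.g. "my_dir/a.txt") so the archive unpacks into a single folder.
    """
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps absolute path components out of the archive
        tar.add(src_dir, arcname=Path(src_dir).name)
    return archive_path
```

Inside a regular @task that takes the FlyteDirectory as input, you would presumably call download() on it to get a local path, pass that path to the helper, and return the archive as a FlyteFile. The container then has to unpack the archive itself (for example with tar -xzf) before running its own command.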