Are there any examples of using ContainerTasks with...
# ask-the-community
Are there any examples of using ContainerTasks with FlyteFile and FlyteDirectory? I’m trying to use a ContainerTask that has a FlyteFile as an input and a directory as its output, but I’m struggling to make anything work. Any pointers?
this should work! do you have code you can share to help debug? what error are you getting?
I’m guessing you’ve seen this page? Are you using python in your container task or another language?
So I think the issue is to do with how I’m defining my File inputs and Directory outputs. I’ve got that example you’ve shared working with no issues, however when using a FlyteFile it seems to struggle pulling the file in.
train_task_container = ContainerTask(
This is how the step is defined, so it’s running a Python script but failing because no file is picked up. I’ve attached the copilot downloader output, which seems to show this as well. Again, this works fine when using Python primitive types for inputs and outputs
what does your script look like?
also, can you share the workflow that uses the
Yep, the script is from the pachyderm examples, that works correctly, and the output shows it has found no input CSV, hence no output, and I’ve ssh’ed into the relevant pod, and
is empty
And the workflow:
I’m basically comparing the ShellTask to the ContainerTask, ShellTask is working fine with the same script
so it looks like `--input /var/inputs` might be the issue here: can you use the templating syntax `--inputs {{.inputs.x}}` instead?
instead? Basically the current code points to the
directory whereas
will inject the correct filepath
Thanks for looking at this 🙏
Getting the following error now
Pod failed. No message received from kubernetes.
[flyte-copilot-downloader] terminated with ExitCode 0.
[az6rsxswjlzql79xl2bv-n2-0] terminated with exit code (1). Reason [Error]. Message: 
Traceback (most recent call last):
  File "", line 101, in <module>
  File "", line 79, in main
    print("Datasets: {}".format(input_files))
UnboundLocalError: local variable 'input_files' referenced before assignment
[flyte-copilot-sidecar] terminated with ExitCode 0.
And the command executing in container
python --input <s3://adarga-ds-lab-2-flyte-data/data/2c/a9j9fgb4l89j8h6t9gfr-n0-0/238d789391a50d97f62e7fcf20e9d42c/boston_housing.csv> --target-col MEDV --output /var/outputs/output
So obviously the S3 address is being passed in instead of the local file path
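For anyone hitting the same traceback: the script itself isn't shown in this thread, so the following is only a guess at the pattern behind it. `os.walk` silently yields nothing for a path that doesn't exist locally (an S3 URI, for instance), so a variable assigned only inside the loop never gets bound. The function names and the example URI are hypothetical.

```python
import os


def find_csvs_fragile(input_dir):
    # input_files is bound only inside the loop body.
    for dirpath, _dirs, files in os.walk(input_dir):
        input_files = [os.path.join(dirpath, f) for f in files if f.endswith(".csv")]
    # If os.walk yielded nothing, the name was never bound and this
    # line raises UnboundLocalError.
    return input_files


def find_csvs(input_dir):
    # Fail early with a clear message when the path is not a local directory.
    if not os.path.isdir(input_dir):
        raise FileNotFoundError(f"input directory not found locally: {input_dir!r}")
    input_files = []  # bound even when no CSV is found
    for dirpath, _dirs, files in os.walk(input_dir):
        input_files.extend(os.path.join(dirpath, f) for f in files if f.endswith(".csv"))
    return sorted(input_files)
```

With the second version, passing a remote URI fails with a message that names the bad path instead of an unbound variable.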
@Michael Tinsley did you ever find a solution to this? I see the same behavior and the only workaround I can think of is to have the task read the file from S3 itself, but that obviously defeats the purpose of Flyte managing the glue
No I didn’t… I got a bit sidetracked - I’ve been meaning to open a GH issue, I’m kind of glad someone else has experienced the same tbh
To add, for my use case I can use the ShellTask instead of a ContainerTask, but I was just trying to compare them. For us, it’s a question of which is quicker for migrating old pipelines over to Flyte, but there’s not much in it, just need to add awscli and flytekit deps
yeah makes sense, I'm doing similar exploration where in our case we are migrating lots of existing R code
i'll continue down this path and let you know if i find a solution
I found a solution. For
, for a
input variable configured for the directory `/var/inputs`:
=the file
=the remote file uri
Your original attempt was `--input /var/inputs`, which wasn't right; it should have been `--input /var/inputs/x`
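Here's a rough sketch of how that command could be assembled. It assumes, as described above, that an input variable named `x` ends up at `/var/inputs/x`. `train.py` is a hypothetical script name (the real one isn't shown in this thread) and the helper names are made up for illustration. The resulting list is the kind of value you'd pass as a ContainerTask's `command`.

```python
import posixpath

INPUT_DATA_DIR = "/var/inputs"    # input directory mentioned in the thread
OUTPUT_DATA_DIR = "/var/outputs"  # output directory mentioned in the thread


def local_input_path(var_name, input_dir=INPUT_DATA_DIR):
    # Paths inside the container are POSIX paths, whatever the host OS is.
    return posixpath.join(input_dir, var_name)


def build_command(input_var, target_col, output_var):
    return [
        "python", "train.py",  # hypothetical script name
        "--input", local_input_path(input_var),
        "--target-col", target_col,
        "--output", posixpath.join(OUTPUT_DATA_DIR, output_var),
    ]


command = build_command("x", "MEDV", "output")
```

The point is that the script receives the file's path under the input directory rather than the directory itself.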
Ah great, thanks for working that out - I’ll give it a go today
That works for me when using Python primitives, but anything that needs downloading fails. Looking more into this, it looks like it’s the init container that is failing to download from S3, but it’s returning a 0 exit code, which it probably shouldn’t 🤷
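One way to make that kind of silent failure visible from inside the container is a pre-flight check at the top of the script, so a missing download stops the task with a clear non-zero exit. A minimal sketch with hypothetical helper names, assuming each input is expected as a file or directory named after its variable under the input directory:

```python
import os
import sys


def missing_inputs(input_dir, var_names):
    # Names of expected inputs that are not present under input_dir.
    return [n for n in var_names if not os.path.exists(os.path.join(input_dir, n))]


def require_inputs(input_dir, var_names):
    # Exit with status 1 and a readable message if anything is missing.
    missing = missing_inputs(input_dir, var_names)
    if missing:
        print(f"missing inputs in {input_dir}: {', '.join(missing)}", file=sys.stderr)
        sys.exit(1)
```

Calling `require_inputs("/var/inputs", ["x"])` before any processing would surface the problem in the main container's logs even when the downloader reports success.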
I'm also trying to do the same thing, i.e., get a remote file into a ContainerTask, but still haven't figured out how. It seems the directory
is mounted into
of the docker container, but I can't figure out how to download the file into
with a ContainerTask.
for me they were automatically downloaded into a file named after the variable, so
Hey, that's worked. It's worked for a publicly accessible file; now other people in my company may help figure out how to download internal files on the cloud (but at least the log indicates it tried to download, so that's progress). Thanks!