# ask-the-community
d
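from typing import List

from flytekit import ContainerTask, kwtypes, workflow
from flytekit.types.file import FlyteFile

list_images = ContainerTask(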
    name="list-images",
    input_data_dir="/var/inputs",
    output_data_dir="/var/outputs",
    inputs=kwtypes(images=List[FlyteFile]),
    outputs=kwtypes(result=FlyteFile),
    image="ghcr.io/flyteorg/rawcontainers-shell:v2",
    command=[
        "/bin/sh",
        "-c",
        "ls /var/inputs > /var/outputs/result",
    ],
)

@workflow
def list_images_wf() -> FlyteFile:
    image_paths = [
        "s3://...",
        "s3://..."
    ]
    images = [FlyteFile(path=p) for p in image_paths]
    result = list_images(images=images)
    return result
When I look at the `result` file, it's empty.
d
:/ Yes, I can confirm that it doesn't work... I can think of many workarounds, but obviously you deserve a nice solution.
d
Thanks for taking a look - just wanted to make sure I wasn’t missing something simple
j
I think you need a new task that returns your list of FlyteFile. Usually you do not want any logic or looping code in `@workflow`. Can you create a separate regular task and feed its output into your container task?
d
That doesn't work, unfortunately:
import pandas as pd
from flytekit import task, workflow, ContainerTask, kwtypes
from flytekit.types.file import FlyteFile
from flytekit.types.directory import FlyteDirectory

@task()
def my_task(input_data: FlyteDirectory)->pd.DataFrame:
    print(input_data)
    return pd.DataFrame([{'a':1,'b':2}, {'a':1,'b':-1}])


@task()
def create_files()->list[FlyteFile]:
    m_input = []
    for x in range(5):
        with open(f"{x}.txt", "w") as fh:
            fh.write("")
        m_input.append(FlyteFile(f"{x}.txt"))
    return m_input

@task()
def create_file()->FlyteFile:
    m_input = []
    for x in range(5):
        with open(f"{x}.txt", "w") as fh:
            fh.write("")
        m_input.append(FlyteFile(f"{x}.txt"))
    return m_input[1]

@workflow
def wf():
    square_file = ContainerTask(
        name="square_file",
        input_data_dir="/var/inputs",
        output_data_dir="/var/outputs",
        inputs=kwtypes(val=FlyteFile),
        outputs=kwtypes(out=FlyteDirectory),
        image="alpine",
        environment={"a": "b"},
        command=["sh", "-c", "mkdir /var/outputs/out && ls -la /var/inputs/* | tee /var/outputs/out/stdout"],
    )

    square_files = ContainerTask(
        name="square_files",
        input_data_dir="/var/inputs",
        output_data_dir="/var/outputs",
        inputs=kwtypes(val=list[FlyteFile]),
        outputs=kwtypes(out=FlyteDirectory),
        image="alpine",
        environment={"a": "b"},
        command=["sh", "-c", "mkdir /var/outputs/out && ls -la /var/inputs/* | tee /var/outputs/out/stdout"],
    )

    df = my_task(input_data=square_files(val=create_files()))
    df = my_task(input_data=square_file(val=create_file()))
    print(df)
The `square_files` task gives this stdout:
ls: /var/inputs/*: No such file or directory
d
As a workaround, is it possible to create a persistent volume claim and have a plain Python task download the data to it, then mount that volume to my container? Or something like that?
s
You can also use a dynamic workflow with your container task: https://gist.github.com/pryce-turner/0a67f86febdc812c9a2a9e739c22eeca. I'm not sure if this will resolve your issue, but I recommend taking a look at it.
> As a workaround, is it possible to create a persistent volume claim and have a plain Python task download the data to it
It should be possible with pod templates: https://docs.flyte.org/en/latest/deployment/configuration/general.html#compile-time-podtemplates
d
Thank you @Samhita Alla, I will ask about the pod templates in a separate question. I tried but wasn’t able to get it to work.
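
One of the workarounds hinted at earlier in the thread could be to avoid the `List[FlyteFile]` input altogether: a regular Python task copies the files into a single directory, and the container task receives that one directory instead. The sketch below shows only the plain-Python staging step. The helper name `stage_files` and the index-prefix naming scheme are hypothetical, and whether a `ContainerTask` accepts a `FlyteDirectory` input on a given Flyte version was not confirmed in this thread, so treat that part as an assumption to verify.

```python
import shutil
from pathlib import Path
from typing import Iterable


def stage_files(paths: Iterable[str], dest_dir: str) -> str:
    """Copy every file in `paths` into `dest_dir` and return `dest_dir`.

    Hypothetical helper: each copy is prefixed with its position in the
    input so that files sharing a basename do not overwrite each other.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for index, src in enumerate(paths):
        src_path = Path(src)
        shutil.copy(src_path, dest / f"{index}_{src_path.name}")
    return str(dest)
```

Inside a regular `@task`, the local paths could come from calling `download()` on each `FlyteFile`, and the returned directory could be wrapped as `FlyteDirectory(path=...)` before being passed on. This keeps the looping logic out of the `@workflow` body, as suggested above.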