# ask-the-community
k
Hi! We are trying to create a container as explained in: https://docs.flyte.org/projects/cookbook/en/stable/getting_started/creating_flyte_project.html We use exactly the same Dockerfile as in the documentation. The only difference in our example.py file is that we are trying to output a pandas DataFrame. The code we are trying to run is pasted below:
import pandas as pd
from flytekit import task, workflow

@task()
def my_task() -> pd.DataFrame:
    return pd.DataFrame([{'a': 1, 'b': 2}, {'a': 1, 'b': -1}])

@workflow
def wf():
    df = my_task()
This code works just fine when we run it as `pyflyte run workflows/example.py wf` on the host. The problem is that when we shell into the container and try to run the same code, the temp files don't work as they are supposed to. As I understand it, Flyte should by default create the raw and sandbox temp files in the `/tmp/` directory of the container, but it only creates the `sandbox` folder there; the `raw` folder is instead created in the folder where the script is run, under a `file:/`
folder. So Flyte creates the raw files in one folder and tries to read them from another. Weirdest thing is that this problem did occur before. We have tried with two different computers. How can we configure Flyte inside our container to run properly? We have used different versions of Python and Flyte. Many thanks in advance!
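A stray `file:/` folder in the working directory is the kind of symptom that appears when a `file://` URI is handled as a plain relative path. The sketch below only illustrates that general mechanism with the standard library. It is an assumption about what may be happening, not something confirmed in this thread, and `raw_uri` is a hypothetical value.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical URI; the actual raw-output location is not given in the thread.
raw_uri = "file:///tmp/flyte/raw"

# Treated as a plain path, the scheme turns into a relative directory named "file:".
as_path = PurePosixPath(raw_uri)
print(as_path.is_absolute())  # relative, so it resolves against the current directory
print(as_path.parts[0])       # first path component is the scheme itself

# Parsing the URI first separates the scheme from the intended absolute path.
parsed = urlparse(raw_uri)
local_path = PurePosixPath(parsed.path)
print(parsed.scheme, local_path)
```

If a writer takes the first route and a reader takes the second, the files end up in one folder and are looked up in another.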
d
Is there a specific error you're running into? I'm getting a little lost in your details. I ran the above code and it seems to work fine. Perhaps you're thinking about the `--destination-dir` option of `pyflyte run` (default `/root`)?
s
Hello, thanks for the answer and effort! Coworker of kaurim here. We got the problem fixed: it was a compatibility issue with the fsspec, gcsfs and s3fs libraries. It got fixed by downgrading each of those three libraries by one version.
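For anyone hitting something similar, comparing the installed versions of these libraries on the host and inside the container is a quick first check. The snippet below is a minimal sketch using only the standard library. The helper name `installed_versions` is made up for illustration, and the thread does not say which versions were affected, so none are assumed here.

```python
from importlib.metadata import PackageNotFoundError, version

# Packages named in the thread, plus flytekit itself.
PACKAGES = ("flytekit", "fsspec", "gcsfs", "s3fs")


def installed_versions(packages=PACKAGES):
    """Return a dict mapping each package name to its version, or None if missing."""
    report = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Running it in both environments and diffing the output shows whether the container picked up newer releases than the host.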