Hi all!
Could you please help me: can I use PythonPickledFile between raw containers?
Right now I am trying to use this data type with the method described here
https://docs.flyte.org/projects/cookbook/en/latest/auto/core/containerization/raw_container.html and I am getting
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
when I try to read the file with pickle.load(file) in this code:
with open(f'{input_dir}/dump_file') as file:
    raw_data = pickle.load(file)
The file is saved without errors by the previous stage's raw container:
with open(f'{output_dir}/dump_file', 'wb') as handle:
    pickle.dump(raw_data, handle)
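For what it's worth, the error looks reproducible without Flyte. Below is a minimal sketch, assuming the file reaches the second container byte-for-byte unchanged; the payload and the temporary directory are made up for illustration. It writes a pickle in binary mode, reads it back in text mode to trigger the UnicodeDecodeError, and then reads it again with 'rb'.

```python
import os
import pickle
import tempfile

# Hypothetical payload standing in for the real raw_data.
raw_data = {"generation": [1, 2, 3]}

# Hypothetical stand-in for the container's output/input directory.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "dump_file")

# Same as the first stage: write the pickle in binary mode.
with open(path, "wb") as handle:
    pickle.dump(raw_data, handle)

# Same as the failing read: text mode tries to decode the bytes as UTF-8.
# Pickle protocol 2 and later starts with byte 0x80, which is not valid UTF-8.
text_mode_error = None
try:
    with open(path, encoding="utf-8") as file:
        pickle.load(file)
except UnicodeDecodeError as exc:
    text_mode_error = exc

# Binary mode hands the raw bytes to pickle unchanged.
with open(path, "rb") as file:
    loaded = pickle.load(file)

print(type(text_mode_error).__name__)
print(loaded == raw_data)
```

If this holds inside the container too, the read side would only need open(f'{input_dir}/dump_file', 'rb'). That is an assumption on my part, not something I have confirmed against Flyte itself.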
The tasks are defined like this:
stage1 = ContainerTask(
    name="stage1",
    output_data_dir="/var/outputs",
    outputs=kwtypes(generation=PythonPickledFile),
    image="registry.project.com/flyte/stage1:0.0.1",
    command=[
        "python",
        "stage1.py",
        "/var/outputs",
    ],
)
stage2 = ContainerTask(
    name="stage2",
    input_data_dir="/var/inputs",
    output_data_dir="/var/outputs",
    inputs=kwtypes(dump_file=PythonPickledFile),
    outputs=kwtypes(stage2_out=str),
    image="registry.project.com/flyte/stage2:0.0.1",
    command=[
        "python",
        "stage2.py",
        "/var/inputs",
        "/var/outputs",
    ],
)
@workflow
def my_wf() -> str:
    generation = stage1()
    return stage2(dump_file=generation)
Can I pass PythonPickledFile like that?