rich-garden-69988
06/08/2023, 9:06 PM
from flytekit import task
from flytekit.types.file import FlyteFile


@task
def create_map_inputs(
    paths: list[str],
    spec: dict,
    gcs_output_dir: str,
) -> list[dict]:
    map_inputs = []
    for path in paths:
        map_inputs.append(
            {
                "bam_file": FlyteFile(path),
                "spec": spec,
                "gcs_output_dir": gcs_output_dir,
            }
        )
    return map_inputs


create_map_inputs(paths=["test.bam"], spec={}, gcs_output_dir="")
The task above produces the inputs for a map task. Running it raises the following error:
raise TypeError(
TypeError: Failed to convert return value for var o0 for function create_map_inputs with error <class 'TypeError'>: Object of type FlyteFile is not JSON serializable
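The wording of this error matches what Python's json module raises for a value that is not JSON-native, which suggests the untyped dict is being JSON-encoded. A minimal sketch that reproduces the same kind of failure with the standard library alone; FakeFlyteFile is a hypothetical stand-in, not the real flytekit class:

```python
import json


class FakeFlyteFile:
    """Hypothetical stand-in for FlyteFile: an arbitrary object wrapping a path."""

    def __init__(self, path: str) -> None:
        self.path = path


def is_json_serializable(value: object) -> bool:
    """Return True if json.dumps accepts the value, False if it raises TypeError."""
    try:
        json.dumps(value)
    except TypeError:
        return False
    return True


# A dict holding an arbitrary object fails, like the task's return value.
with_object = {"bam_file": FakeFlyteFile("test.bam"), "spec": {}, "gcs_output_dir": ""}
# The same dict with a plain string path is accepted.
with_string = {"bam_file": "test.bam", "spec": {}, "gcs_output_dir": ""}

print(is_json_serializable(with_object))
print(is_json_serializable(with_string))
```

Under this reading, any value that json.dumps rejects would trigger the same TypeError, not only FlyteFile.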
Ideally this would return list[dataclass], but dataclass does not work for my use case (see issue), and tuple does not work either ("Type of Generic List type is not supported, Transformer for type <class 'tuple'> is restricted currently"). Nothing else I've tried works. I have found a way around this, but it's incredibly hacky. Any thoughts?
rich-garden-69988
06/08/2023, 9:07 PM
dict[str, Any] kind of works, but then everything is pickled, and the file path produced by pickling is non-deterministic, which breaks caching. It is also suboptimal from a UX perspective: I want to be able to see the inputs going into each map task, and when they are pickled only the path is shown.
rich-garden-69988
06/08/2023, 9:10 PM
The workaround is making the bam_file key a string and, in the downstream task, invoking the FlyteFile type transformer manually so that the value behaves like an actual FlyteFile, with something like this:
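from typing import TypeVar

from flytekit.core.context_manager import FlyteContextManager
from flytekit.types.file import FlyteFile
from flytekit.types.file.file import FlyteFilePathTransformer

# Assumed definition; the original message does not show how FlyteFileType is declared.
FlyteFileType = TypeVar("FlyteFileType", bound=FlyteFile)


def convert_string_to_flyte_file_type(
    string: str, file_type: type[FlyteFileType] = FlyteFile
) -> FlyteFileType:
    """Converts string to FlyteFile type (or child) so that it acts like FlyteFile.

    This allows the custom type transformers of FlyteFile to get run, which enables
    downloading.
    """
    ctx = FlyteContextManager.current_context()
    flyte_file = file_type(string)
    transformer = FlyteFilePathTransformer()
    # Round-trip through a literal so the transformer builds the returned object.
    lv = transformer.to_literal(
        ctx, python_val=flyte_file, python_type=file_type, expected=None
    )
    return transformer.to_python_value(ctx, lv, file_type)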
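The upstream half of this workaround can be sketched as follows, assuming the map inputs carry the BAM path as a plain string. The @task decorator is left off so the snippet runs without flytekit; build_map_inputs and the gs://bucket/out value are hypothetical names used only for illustration:

```python
import json


def build_map_inputs(paths: list[str], spec: dict, gcs_output_dir: str) -> list[dict]:
    """Build one input dict per path, keeping bam_file as a plain string."""
    return [
        {"bam_file": path, "spec": spec, "gcs_output_dir": gcs_output_dir}
        for path in paths
    ]


map_inputs = build_map_inputs(
    ["a.bam", "b.bam"], spec={"threads": 2}, gcs_output_dir="gs://bucket/out"
)

# Every value is JSON-native, so each item can be encoded without pickling,
# and sort_keys yields the same encoding on every run.
encoded = [json.dumps(item, sort_keys=True) for item in map_inputs]
print(encoded[0])
```

The downstream task would then pass item["bam_file"] through convert_string_to_flyte_file_type before reading the file.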
rich-garden-69988
06/08/2023, 9:12 PM
glamorous-carpet-83516
06/08/2023, 9:42 PM