Hi all, I have a task where I'd like to return a h...
# ask-the-community
g
Hi all, I have a task where I'd like to return a heterogeneous dict. It doesn't have to be a dict, it can be anything: so far I've tried tuple, namedtuple, dataclass, and dict, and unfortunately haven't been able to get any of them to work.
from flytekit import task
from flytekit.types.file import FlyteFile


@task
def create_map_inputs(
    paths: list[str],
    spec: dict,
    gcs_output_dir: str,
) -> list[dict]:
    map_inputs = []
    for path in paths:
        map_inputs.append(
            {
                "bam_file": FlyteFile(path),
                "spec": spec,
                "gcs_output_dir": gcs_output_dir,
            }
        )
    return map_inputs


create_map_inputs(paths=["test.bam"], spec={}, gcs_output_dir="")
The task looks like the above, and its output will be used as input to a map task. Running it gives the following error:
raise TypeError(
TypeError: Failed to convert return value for var o0 for function create_map_inputs with error <class 'TypeError'>: Object of type FlyteFile is not JSON serializable
Ideally this would return list[dataclass], but I've found that dataclass does not work for my use case (see issue), tuple does not work ("Type of Generic List type is not supported, Transformer for type <class 'tuple'> is restricted currently"), and nothing else I've tried works either. I have found a way around this, but it's incredibly hacky. Any thoughts?
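The "not JSON serializable" message above can be reproduced without Flyte. The sketch below assumes, as the error text suggests, that an untyped dict return value is passed through a JSON encoder; `StandInFile` and `try_dump` are hypothetical names used only for illustration and are not part of flytekit.

```python
import json


class StandInFile:
    """Hypothetical stand-in for FlyteFile: a plain object wrapping a path."""

    def __init__(self, path: str) -> None:
        self.path = path


def try_dump(value: dict) -> str:
    """Return the JSON text, or the TypeError message if encoding fails."""
    try:
        return json.dumps(value)
    except TypeError as exc:
        return f"TypeError: {exc}"


# A dict holding only JSON-native values (str, dict, ...) encodes fine.
ok = try_dump({"bam_file": "test.bam", "spec": {}, "gcs_output_dir": ""})

# The same dict with an arbitrary object as a value cannot be encoded.
failed = try_dump(
    {"bam_file": StandInFile("test.bam"), "spec": {}, "gcs_output_dir": ""}
)

print(ok)
print(failed)
```

If that assumption holds, any value in the dict that the JSON encoder does not know about would trigger the same failure, not only FlyteFile.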
If I type the dictionary as dict[str, Any], this kind of works, but then everything is pickled, and the file path produced by pickling is non-deterministic, which breaks caching. It's also suboptimal from a UX perspective: I want to be able to see the inputs going into each map task, and if they're pickled, only the path is shown.
My "hack" involves making the bam_file key a string and, in the downstream task, invoking the FlyteFile type transformer manually so that it behaves like an actual FlyteFile, with something like this:
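from typing import TypeVar

from flytekit.core.context_manager import FlyteContextManager
from flytekit.types.file import FlyteFile
from flytekit.types.file.file import FlyteFilePathTransformer

# Assumed definition; the original snippet does not show it.
FlyteFileType = TypeVar("FlyteFileType", bound=FlyteFile)


def convert_string_to_flyte_file_type(
    string: str, file_type: type[FlyteFileType] = FlyteFile
) -> FlyteFileType:
    """Converts string to FlyteFile type (or child) so that it acts like FlyteFile.

    This allows the custom type transformers of FlyteFile to get run, which enables
    downloading.
    """
    ctx = FlyteContextManager.current_context()
    flyte_file = file_type(string)
    transformer = FlyteFilePathTransformer()
    # Round-trip through a literal so the transformer's to_python_value logic runs.
    lv = transformer.to_literal(
        ctx, python_val=flyte_file, python_type=file_type, expected=None
    )
    transformed_flyte_file = transformer.to_python_value(ctx, lv, file_type)
    return transformed_flyte_file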
In reality, what I'm dealing with are children of FlyteFile that I've created with specific downloading/uploading functionality that differs from FlyteFile's, so being able to work with the actual registered type transformers would be ideal...
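The overall shape of the workaround (keep the file as a string upstream, rebuild the typed object downstream) can be sketched without Flyte. Everything here is a hypothetical stand-in: `StandInFile`, `StandInBamFile`, `make_map_input`, and `rehydrate_file` are illustrative names, and the real hack would call the registered type transformer where this sketch simply calls the constructor.

```python
import json
from dataclasses import dataclass


@dataclass
class StandInFile:
    """Hypothetical stand-in for FlyteFile; not the real class."""

    path: str


class StandInBamFile(StandInFile):
    """Hypothetical child type, standing in for a custom FlyteFile subclass."""


def make_map_input(path: str, spec: dict, gcs_output_dir: str) -> dict:
    # Upstream: keep the file as a plain string so the dict stays JSON-friendly.
    return {"bam_file": path, "spec": spec, "gcs_output_dir": gcs_output_dir}


def rehydrate_file(map_input: dict, file_type: type = StandInFile) -> StandInFile:
    # Downstream: rebuild the typed object from the string. In the real hack,
    # this is the point where the type transformer would be invoked.
    return file_type(map_input["bam_file"])


map_input = make_map_input("test.bam", {}, "")
# Simulate the dict crossing a JSON boundary between the two tasks.
round_tripped = json.loads(json.dumps(map_input))
bam = rehydrate_file(round_tripped, StandInBamFile)
print(type(bam).__name__, bam.path)
```

Passing the target type explicitly is what lets a child class be reconstructed, at the cost of the downstream task having to know which type each string is meant to become.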
k
@Greg Gydush I think only your PR can fix it. Let me add a pickle transformer first.