Am I correct in understanding that Flyte does not support cu Flyte #flyte-support

melodic-magician-71351

02/21/2023, 4:09 PM

Am I correct in understanding that Flyte does not support custom encoders and decoders for dataclasses_json?

@dataclass_json
@dataclass
class MyDataClass:
    foo: Foo = field(metadata=config(encoder=encoder, decoder=decoder))

Is the only alternative to define and register a custom TypeTransformer? What's the logic behind requiring something comparatively cumbersome?

melodic-magician-71351

02/21/2023, 4:22 PM

I ask because I find myself writing the following pattern a lot, which the above is essentially sugar for.

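@dataclass
class MyDataClass:
    foo_str: str

    def __post_init__(self):
        # Rebuild the rich object from its serialized string form.
        self.foo = decoder(self.foo_str)

    @classmethod
    def from_foo(cls, foo: Foo):
        # Alternate constructor: accept the rich object, store its string form.
        return cls(foo_str=encoder(foo))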
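
For anyone who wants to run the pattern above, here is a minimal self-contained sketch. `Foo`, `encoder`, and `decoder` are hypothetical stand-ins (the thread never defines them); `Fraction` is used only because it round-trips cleanly through `str`. The point is that the only declared field is a plain `str`, while the rich object is rebuilt in `__post_init__`.

```python
from dataclasses import asdict, dataclass
from fractions import Fraction

# Hypothetical stand-ins: the thread does not define Foo, encoder, or decoder.
Foo = Fraction


def encoder(foo: Foo) -> str:
    return str(foo)


def decoder(foo_str: str) -> Foo:
    return Fraction(foo_str)


@dataclass
class MyDataClass:
    foo_str: str  # the only declared (and therefore serialized) field

    def __post_init__(self):
        # `foo` is a plain attribute, not a dataclass field.
        self.foo = decoder(self.foo_str)

    @classmethod
    def from_foo(cls, foo: Foo) -> "MyDataClass":
        return cls(foo_str=encoder(foo))


obj = MyDataClass.from_foo(Fraction(3, 4))
serialized = asdict(obj)  # contains foo_str only
```

Because `foo` is not a declared field, `asdict` (and field-based serializers in general) only ever sees `foo_str`.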

tall-lock-23197

02/22/2023, 5:08 AM

The field type, in this case Foo, has to be a valid Flyte type. If that isn't the case, I believe the data gets pickled. @glamorous-carpet-83516, could you please confirm?

glamorous-carpet-83516

02/22/2023, 5:55 AM

No, we can't use pickle, because it's not a dataclass or a Python primitive type. I think we can support custom encoders/decoders here; we'd just need to update the dataclass transformer. Would you mind creating a ticket and sharing your encoder/decoder code?
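
To make the idea concrete, here is a rough standard-library sketch of what honoring per-field encoders and decoders could look like. It is not Flyte's dataclass transformer and not dataclasses_json's implementation: `to_jsonable`, `from_jsonable`, and the flat `"encoder"`/`"decoder"` metadata keys are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field, fields
from decimal import Decimal
from typing import Any, Dict, Type, TypeVar

T = TypeVar("T")


def to_jsonable(obj: Any) -> Dict[str, Any]:
    """Convert a dataclass instance to a dict, applying per-field encoders."""
    out = {}
    for f in fields(obj):
        value = getattr(obj, f.name)
        encode = f.metadata.get("encoder")  # hypothetical metadata key
        out[f.name] = encode(value) if encode else value
    return out


def from_jsonable(cls: Type[T], data: Dict[str, Any]) -> T:
    """Rebuild a dataclass instance from a dict, applying per-field decoders."""
    kwargs = {}
    for f in fields(cls):
        decode = f.metadata.get("decoder")  # hypothetical metadata key
        raw = data[f.name]
        kwargs[f.name] = decode(raw) if decode else raw
    return cls(**kwargs)


@dataclass
class Example:
    name: str
    value: Decimal = field(metadata={"encoder": str, "decoder": Decimal})


original = Example(name="z", value=Decimal("1.5"))
payload = to_jsonable(original)
restored = from_jsonable(Example, payload)
```

Fields without metadata pass through unchanged, so only types that need special handling carry an encoder/decoder pair.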

glamorous-carpet-83516

02/22/2023, 5:56 AM

[flyte-core]

user

02/22/2023, 5:56 AM

⭐ Create a new Flyte Core Feature issue: https://github.com/flyteorg/flyte/issues/new?assignees=&labels=enhancement%2Cuntriaged&template=feature_request.yaml&title=%5BCore+feature%5D+

melodic-magician-71351

02/22/2023, 8:59 AM

@glamorous-carpet-83516 Done: https://github.com/flyteorg/flyte/issues/3359. I think this is super straightforward.

melodic-magician-71351

02/22/2023, 9:01 AM

AFAIK the default encoder and decoder (without additional metadata) assume serialization to a string. If I understand correctly, you can also use mm_field to specify an alternate intermediate marshmallow type in the schema, which then needs to match the return type of your encoder and the argument type of your decoder.

melodic-magician-71351

02/22/2023, 9:01 AM

But I haven't tried that

melodic-magician-71351

02/22/2023, 9:13 AM

If you're asking what I personally use this for, I like to use it for serializing type and function objects. So for example:

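import importlib
import inspect
from dataclasses import dataclass, field
from typing import Callable, Dict, Type

from dataclasses_json import config, dataclass_json

# PreTrainedModel, PreprocessorFactory, PyTorchCheckpoint, and noop are
# assumed to be imported or defined elsewhere in the project.


def import_from_str(obj_qual_name: str):
    # Split "package.module.name" into the module path and the attribute name.
    module_str, obj_name = obj_qual_name.rsplit(sep='.', maxsplit=1)
    module = importlib.import_module(module_str)
    return getattr(module, obj_name)


def full_name_from_obj(obj) -> str:
    # Build the dotted path that import_from_str can resolve again.
    return f'{inspect.getmodule(obj).__name__}.{obj.__qualname__}'


# Field metadata that serializes a type or function as its dotted import path.
TYPE_SERIALIZER = config(
    encoder=full_name_from_obj,
    decoder=import_from_str
)


@dataclass_json
@dataclass
class ModelSpec:
    model: Type[PreTrainedModel]
    preprocessor_factory: Type[PreprocessorFactory] = field(metadata=TYPE_SERIALIZER)
    # Moved up: fields without defaults must precede fields with defaults.
    model_weights: PyTorchCheckpoint
    training_args: Dict = field(default_factory=dict)
    model_args: Dict = field(default_factory=dict)
    weight_preprocessing: Callable = field(metadata=TYPE_SERIALIZER, default=noop)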
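
As a quick check of the two helpers on their own, the following sketch round-trips a class and a function from the standard library. The helper bodies mirror the ones above; `OrderedDict` and `json.dumps` are just convenient examples, not anything from the original project.

```python
import importlib
import inspect
import json
from collections import OrderedDict


def import_from_str(obj_qual_name: str):
    # Split "package.module.name" into the module path and the attribute name.
    module_str, obj_name = obj_qual_name.rsplit(sep='.', maxsplit=1)
    module = importlib.import_module(module_str)
    return getattr(module, obj_name)


def full_name_from_obj(obj) -> str:
    return f'{inspect.getmodule(obj).__name__}.{obj.__qualname__}'


class_name = full_name_from_obj(OrderedDict)
func_name = full_name_from_obj(json.dumps)

# Decoding the dotted path yields the very same object, not a copy.
same_class = import_from_str(class_name) is OrderedDict
same_func = import_from_str(func_name) is json.dumps
```

One caveat with this approach: objects whose `__qualname__` itself contains a dot (nested classes, methods) would not round-trip as written, because `rsplit` would treat the outer class as part of the module path.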

glamorous-carpet-83516

02/22/2023, 3:26 PM

Thanks for the clarification, I'll take a look. Contributions are welcome too.

melodic-magician-71351

02/22/2023, 3:26 PM

Yeah, I may try to contribute if I can find time, since someone snagged the other feature I was looking for 🙂

🙏 2

glamorous-carpet-83516

02/22/2023, 3:56 PM

Thanks! Let me know if there's anything I can help with.
