Is it possible to register a `TypeTransformer` for...
# flytekit
r
Is it possible to register a `TypeTransformer` for one of our own dataclasses that has its own to/from_json implementation instead of using `@dataclass_json`? Curious if it'll check for an appropriate transformer before the dataclass_json code path
y
should be yeah
r
This worked great, we actually have our own custom codegen tool that generates pydantic dataclasses from openapi specifications, and we just added a `TypeTransformer` for those. Now we can codegen all the classes needed for transport between stages & they can mirror our model serving API spec!
Hm, I might have spoken too soon -- serialization worked and I didn't check back in on the workflow, the workflow wound up failing at runtime w/ this error:
Failed to get data from /tmp/flyte-mab8jbye/raw/50b3523f07b8ba107ebdcdd288ec4f44 to /tmp/flytet8shuc6s/local_flytekit/92fbf638a67c41ecbeedde42c6ece98d (recursive=False).

Original exception: [Errno 2] No such file or directory: '/tmp/flyte-mab8jbye/raw/50b3523f07b8ba107ebdcdd288ec4f44'
I tested a variant that adapts the `NumpyArray` transformer as well as the example from the docs here, https://docs.flyte.org/projects/cookbook/en/latest/auto/core/extend_flyte/custom_types.html#advanced-custom-types. Will pick this back up tmrw but curious if you've seen a similar error before/have any suggestions
This is the implementation
class FeatureClassTransformer(type_engine.TypeTransformer[Features]):
    """
    FlyteKit TypeTransformer for Features. In Flyte, dataclasses typically
    require using dataclass_json for transport unless a TypeTransformer
    is explicitly specified.
    """

    _TYPE_INFO = flytekit.BlobType(
        format="Features",
        dimensionality=flytekit.BlobType.BlobDimensionality.SINGLE,
    )

    def __init__(self):
        super().__init__(name="Features", t=Features)

    def get_literal_type(self, t: Type[Features]) -> types.LiteralType:
        return flytekit.LiteralType(blob=self._TYPE_INFO)

    def to_literal(
        self,
        ctx: context_manager.FlyteContext,
        python_val: Features,
        python_type: Type[Features],
        expected: types.LiteralType,
    ) -> flytekit.Literal:
        local_file = ctx.file_access.get_random_local_path() + ".json"
        with open(local_file, "w") as f:
            json.dump(dataclasses.asdict(python_val), f)
        remote_file = ctx.file_access.get_random_remote_path()
        ctx.file_access.upload(local_file, remote_file)
        return flytekit.Literal(
            scalar=flytekit.Scalar(
                blob=flytekit.Blob(
                    uri=remote_file,
                    metadata=flytekit.BlobMetadata(type=self._TYPE_INFO),
                )
            )
        )

    def to_python_value(
        self,
        ctx: context_manager.FlyteContext,
        lv: literals.Literal,
        expected_python_type: Type[Features],
    ) -> Features:
        local_file = ctx.file_access.get_random_local_path()
        ctx.file_access.download(remote_path=lv.scalar.blob.uri, local_path=local_file)
        with open(local_file, "r") as f:
            return Features.from_json(json.load(f))
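One thing that can be checked without Flyte at all is whether the write and read halves of the transformer agree on the file format. In the snippet above, `to_literal` writes `dataclasses.asdict(python_val)` while `to_python_value` passes an already-parsed dict to `Features.from_json`. Whether that is consistent depends on what `from_json` accepts, which isn't shown here. A minimal stdlib-only sketch of a symmetric round trip, assuming a hypothetical `Features` with two made-up fields and a `from_json` that takes a JSON string (this does not by itself explain the "No such file or directory" error):

```python
import dataclasses
import json
import os
import tempfile


@dataclasses.dataclass
class Features:
    # Hypothetical stand-in: the real generated class and its fields
    # are not shown in this thread.
    name: str
    values: list

    def to_json(self) -> str:
        # Assumption: to_json returns a JSON string.
        return json.dumps(dataclasses.asdict(self))

    @classmethod
    def from_json(cls, raw: str) -> "Features":
        # Assumption: from_json takes a JSON string, not a parsed dict.
        return cls(**json.loads(raw))


def write_features(value: Features, path: str) -> None:
    # Mirrors the file-writing half of to_literal, using the class's own to_json.
    with open(path, "w") as f:
        f.write(value.to_json())


def read_features(path: str) -> Features:
    # Mirrors the file-reading half of to_python_value, using the matching from_json.
    with open(path, "r") as f:
        return Features.from_json(f.read())


def round_trip(value: Features) -> Features:
    # Write to a temporary file and read it back, with no Flyte involved.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "features.json")
        write_features(value, path)
        return read_features(path)
```

If `round_trip(x) == x` holds for a few representative objects, the serialization logic can be ruled out and the remaining suspect is the upload/download path.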
s
@Rahul Mehta, would you mind sharing your code snippet with us?
r
Sure, I can put together a minimal repro tmrw or day after.