#1735 Use mashumaro to serialize/deserialize dataclass
Pull request opened by hhcs9527

TL;DR
• Use mashumaro to serialize/deserialize dataclass
• Larger tasks see about a 1.5x speedup in serialize/deserialize processing time, using the benchmark workflow (provided below)
• performance testing result sheet

Type
☐ Bug Fix
☑︎ Feature
☐ Plugin

Are all requirements met?
☑︎ Code completed
☐ Smoke tested
☐ Unit tests added
☐ Code documentation added
☐ Any pending items have an associated Issue

Complete description
1. Update setup.py.
2. Extend DataClassTransformer to support DataClassJSONMixin for faster serialization/deserialization.
3. Complete the code changes for serializing/deserializing dataclasses.
4. Compare old and new performance using the following benchmark workflow:
from dataclasses import dataclass
from typing import List

# Needed only when switching the inputs to the dataclass_json variant of this benchmark.
from dataclasses_json import dataclass_json
from flytekit import Resources, task, workflow
from mashumaro.mixins.json import DataClassJSONMixin

@dataclass
class CurrencyPosition(DataClassJSONMixin):
    currency: str
    balance: float

@dataclass
class StockPosition(DataClassJSONMixin):
    ticker: str
    name: str
    balance: int

@dataclass
class OP(DataClassJSONMixin):
    currencies: List[CurrencyPosition]
    stocks: List[StockPosition]

@task(requests=Resources(mem="1Gi", cpu="1"), limits=Resources(mem="1Gi", cpu="1"), disable_deck=False)
def t1(obj: OP) -> OP:
    return obj

@task(requests=Resources(mem="1Gi", cpu="1"), limits=Resources(mem="1Gi", cpu="1"), disable_deck=False)
def t2(obj: OP) -> OP:
    return obj

@task(requests=Resources(mem="1Gi", cpu="1"), limits=Resources(mem="1Gi", cpu="1"), disable_deck=False)
def t3(obj: OP) -> OP:
    return obj

@workflow
def wf() -> OP:
    my_op = OP(
        currencies=[
            CurrencyPosition("USD", 238.67),
            CurrencyPosition("EUR", 361.84),
        ],
        stocks=[
            StockPosition("AAPL", "Apple", 10),
            StockPosition("AMZN", "Amazon", 10),
        ]
    )

    res = t1(obj=my_op)
    res = t2(obj=res)
    res = t3(obj=res)

    return res
Performance Compare
• We compare three configurations, running each one 3 times and taking the average time of to_literal and to_python_value.
• The result shows that DataClassJSONMixin inputs are 1.5x faster than the previous version of DataClassTransformer.
1. gcp_dataclass_json_new: tests the dataclass_json version of the workflow (by simply changing the input) with the new version of DataClassTransformer
2. gcp_dataclassJSONMixin: tests the DataClassJSONMixin version of the workflow (by simply changing the input) with the new version of DataClassTransformer
3. gcp_dataclass_json_old: tests the dataclass_json version of the workflow (by simply changing the input) with the older version of DataClassTransformer

Compare to_literal

image

Compare to_python_value

image
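
The measurement procedure described above (run each configuration 3 times, then average) could be reproduced with a small harness like the one below. This is a hedged sketch, not the script used for the charts: `average_seconds`, `serialize`, and `deserialize` are hypothetical names, and stdlib `json` stands in for the transformer's two conversion directions so the example runs without flytekit.

```python
import json
import time
from dataclasses import asdict, dataclass
from statistics import mean
from typing import Callable, List


def average_seconds(fn: Callable[[], object], runs: int = 3) -> float:
    """Call fn `runs` times and return the mean wall-clock duration in seconds."""
    durations: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return mean(durations)


@dataclass
class StockPosition:
    ticker: str
    name: str
    balance: int


position = StockPosition("AAPL", "Apple", 10)


# Assumption: these two functions are stand-ins for the serialize and
# deserialize steps being timed; swap in the real calls to benchmark them.
def serialize() -> str:
    return json.dumps(asdict(position))


encoded = serialize()


def deserialize() -> StockPosition:
    return StockPosition(**json.loads(encoded))


serialize_avg = average_seconds(serialize)
deserialize_avg = average_seconds(deserialize)
print(f"serialize: {serialize_avg:.6f}s, deserialize: {deserialize_avg:.6f}s")
```

With only three runs the average is sensitive to outliers, so a single slow run (for example, a cold start) can noticeably shift the result.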

Tracking Issue
flyteorg/flyte#3364

flyteorg/flytekit
All checks have passed (2/2 successful checks)