Joe Kelly
11/06/2023, 9:01 PM
We have a class using the `@dataclass_json` and `@dataclass` decorators and extending `DataClassJsonMixin`:
from dataclasses import dataclass

from dataclasses_json import DataClassJsonMixin, dataclass_json


@dataclass_json
@dataclass
class Document(DataClassJsonMixin):
    url: str
    id: int
    flag: bool

    def __init__(self, url: str, id: int, flag: bool):
        self.url = url
        self.id = id
        self.flag = flag
and we have a task that takes in a `list[Document]` and has `cache` set to `True`.
Currently, as expected, if I run that task on a `Document` and then run it on another `Document` with the same properties, it will read from the cache for the second run of the task. What I'd like to know is: is there any way to specify a different hashing method? In my case, really all I care about is the `url` property: if the `id` value is different but `url` is still the same, I would like to read from the cache rather than rerunning my task.
I know the ability to set annotations for a specific hashing function is mentioned here, but if I try to add those annotations to this class I get the error `Flytekit does not currently have support for FlyteAnnotations applied to Dataclass.Type`
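For illustration, here is a minimal sketch of the kind of url-only hash function described above, using only the standard library. The name `hash_documents_by_url` and the plain `Document` dataclass are hypothetical, and the length-prefixing scheme is just one reasonable choice. As far as I know, a function of this shape (value in, string out) is what Flytekit's `HashMethod` annotation expects, but this sketch does not call Flytekit and does not get around the error quoted above.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Document:
    # Simplified stand-in for the class in the question (no JSON mixin).
    url: str
    id: int
    flag: bool


def hash_documents_by_url(docs: list[Document]) -> str:
    """Return a digest that depends only on each document's url, in order."""
    digest = hashlib.sha256()
    for doc in docs:
        encoded = doc.url.encode("utf-8")
        # Length-prefix each url so ["ab", "c"] and ["a", "bc"] hash differently.
        digest.update(len(encoded).to_bytes(8, "big"))
        digest.update(encoded)
    return digest.hexdigest()
```

With a hash like this, two lists whose documents differ only in `id` or `flag` produce the same digest, which is the cache behavior being asked for; changing any `url` (or the order of the list) changes the digest.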
Samhita Alla
Joe Kelly
11/07/2023, 4:51 PM
I ended up writing a custom `TypeTransformer` for my class, which allowed me to set annotations for a given hashing function as desired. I had to explicitly write out my transformer functions, but for a simple class like this that wasn't too much effort. It would still be nice to have the ability to set a specific hashing function on a `dataclass_json` class, though.
Samhita Alla
You don't need to use `dataclass_json` anymore. Here's an example: https://flyte.org/blog/flyte-1-10-monorepo-new-agents-eager-workflows-and-more#mashumaro-to-serializedeserialize-dataclasses
Joe Kelly
11/08/2023, 8:32 PM
Even with `mashumaro`'s `DataClassJSONMixin`, I'm still not able to set an annotation for a specific hashing function, so I'm still forced to write my own `TypeTransformer` if I want my hash to take into account only some subset of the object's properties.