Hi all, I have a question on controlling how Flyte determines an object's hash for caching. We are using a class with the
@dataclass_json @dataclass
decorators and extending
class Document(DataClassJsonMixin):
    url: str
    id: int
    flag: bool

    def __init__(self, url: str, id: int, flag: bool):
        self.url = url
        self.id = id
        self.flag = flag
and we have a task that takes in a `list[Document]` and has
set to
. Currently, as expected, if I run that task on a
and then run it on another
with the same properties, it will read from the cache for the second run of the task. What I'd like to know is: is there any way to specify a different hashing method (e.g., in my case really all I care about is the
property -- if the
value is different but
is still the same, I would like to read from the cache rather than rerunning my task)? I know the ability to set annotations for a specific hashing function is mentioned here, but if I try to add those annotations to this class I get the error
Flytekit does not currently have support for FlyteAnnotations applied to Dataclass.Type
You're looking at v0.3 docs. Here's the latest: https://docs.flyte.org/projects/cookbook/en/latest/auto_examples/development_lifecycle/task_cache.html#caching-of-non-flyte-offloaded-objects. And regarding custom hashing methods, @Eduardo Apolinario (eapolinario), how should that be implemented?
Ah, thanks, I must have gotten to the old docs through Google somehow. Curious about the answer on the custom hashing methods!
I believe I found a solution here -- I set up and registered a
for my class, which allowed me to set annotations for a given hashing function as desired. I had to explicitly write out my transformer functions, but for a simple class like this that wasn't too much effort -- would still be nice to have the ability to set a specific hashing function on a
Understood. We've recently integrated mashumaro, and you don't need to use
anymore. Here's an example: https://flyte.org/blog/flyte-1-10-monorepo-new-agents-eager-workflows-and-more#mashumaro-to-serializedeserialize-dataclasses
Got it; however, it seems even with this new
, I'm still not able to set an annotation for a specific hashing function (so I'm still forced to write my own TypeTransformer if I want to only have my hash take into account some subset of the object's properties)
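For anyone reading later, here is a minimal sketch of the kind of hash function discussed in this thread: one that takes only a subset of an object's properties into account. It is plain Python and does not touch Flyte at all. The choice of `url` as the only field that matters is an assumption for illustration, and `hash_documents` is a hypothetical name. In flytekit, a function like this is normally attached to a type with `Annotated[..., HashMethod(fn)]`; as reported above, that annotation was rejected on the dataclass type, so the wiring into a custom TypeTransformer is left out here.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass
class Document:
    url: str
    id: int
    flag: bool


def hash_documents(docs: List[Document]) -> str:
    """Hash a list of documents using only `url` (an assumed choice of field)."""
    digest = hashlib.sha256()
    for doc in docs:
        encoded = doc.url.encode("utf-8")
        # Length-prefix each value so ["ab", "c"] and ["a", "bc"] hash differently.
        digest.update(len(encoded).to_bytes(8, "big"))
        digest.update(encoded)
    return digest.hexdigest()


# Documents that differ only in `id` and `flag` produce the same hash.
first = [Document(url="https://example.com/a", id=1, flag=True)]
second = [Document(url="https://example.com/a", id=2, flag=False)]
same_hash = hash_documents(first) == hash_documents(second)
```

With a hash like this, two inputs that differ only in the ignored fields would map to the same cache key, which is the behavior asked for in the original question. The order of the list still affects the result; sort by `url` first if order should not matter.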