faint-monitor-96441
09/20/2022, 12:40 PM
faint-monitor-96441
09/20/2022, 12:52 PM
HashMethod, would this work?
freezing-airport-6809
broad-monitor-993
09/20/2022, 2:16 PM
Annotated types to compute a hash of incoming data so you don’t have to manually pass an md5 hash to the task. See here:
https://docs.flyte.org/projects/cookbook/en/latest/auto/core/flyte_basics/task_cache.html#caching-of-non-flyte-offloaded-objects
faint-monitor-96441
09/21/2022, 7:38 AM
str type. It doesn’t work though, perhaps because str is a primitive?
faint-monitor-96441
09/21/2022, 8:02 AM
import hashlib

from flytekit import HashMethod, task, workflow
from typing_extensions import Annotated

@task
def hash_dataset_function(dataset_name: str) -> str:
    # Hash the DVC pointer file so the cache key tracks the dataset contents.
    with open(f"data/dataset/{dataset_name}.dvc", "rb") as f:
        return hashlib.md5(f.read()).hexdigest()

@task
def get_dataset_name(process: str) -> Annotated[str, HashMethod(hash_dataset_function)]:
    return process

@task(cache=True, cache_version="1.0")
def cached_task(dataset_name: str) -> float:
    ...

@workflow
def wf(process: str):
    dataset_name = get_dataset_name(process=process)
    always_cached = cached_task(dataset_name=dataset_name)
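A minimal, flytekit-free sketch of the idea behind the snippet above: key the cache on an md5 of the dataset's pointer file rather than on the dataset name, so that editing the file invalidates the cached result even though the name is unchanged. `file_md5`, `ContentKeyedCache`, `expensive`, and the temporary `my_dataset.dvc` file are hypothetical names for illustration only; this is not how Flyte implements caching.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable, Dict


def file_md5(path: Path) -> str:
    """Return the md5 hex digest of a file's bytes."""
    return hashlib.md5(path.read_bytes()).hexdigest()


class ContentKeyedCache:
    """Memoize a function of a file path, keyed by the file's content hash."""

    def __init__(self, fn: Callable[[Path], float]) -> None:
        self.fn = fn
        self.results: Dict[str, float] = {}
        self.calls = 0  # number of times fn actually ran

    def __call__(self, path: Path) -> float:
        key = file_md5(path)
        if key not in self.results:
            self.calls += 1
            self.results[key] = self.fn(path)
        return self.results[key]


def expensive(path: Path) -> float:
    # Stand-in for real work: just report the file size in bytes.
    return float(len(path.read_bytes()))


cached = ContentKeyedCache(expensive)

with tempfile.TemporaryDirectory() as tmp:
    pointer = Path(tmp) / "my_dataset.dvc"  # hypothetical pointer file
    pointer.write_bytes(b"md5: aaa\n")
    first = cached(pointer)   # miss: runs expensive()
    second = cached(pointer)  # hit: same content, same key
    pointer.write_bytes(b"md5: bbbb\n")
    third = cached(pointer)   # miss: content changed, name did not
```

Keying on the content hash instead of the path is the design choice that matters here: the path stays `my_dataset.dvc` throughout, yet the third call recomputes because the bytes differ.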
faint-monitor-96441
09/21/2022, 9:44 AM
faint-monitor-96441
09/21/2022, 9:52 AM
@dataclass @dataclass_json which has the md5 checksum. If the other approach is supposed to work, let me know.
freezing-airport-6809
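The workaround described above might look roughly like the following, assuming the dataclass carries the dataset name plus an md5 checksum so that a cached task's input changes whenever the file does. `DatasetRef` and `make_ref` are hypothetical names; the sketch uses only the standard library's `dataclass` and leaves out `@dataclass_json` and the Flyte decorators.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRef:
    """Dataset name plus the md5 of its pointer file contents (hypothetical)."""
    name: str
    md5: str


def make_ref(name: str, pointer_bytes: bytes) -> DatasetRef:
    # The checksum travels with the name, so anything that compares
    # inputs by value sees a different input when the file changes.
    return DatasetRef(name=name, md5=hashlib.md5(pointer_bytes).hexdigest())


old = make_ref("my_dataset", b"version 1")
same = make_ref("my_dataset", b"version 1")
new = make_ref("my_dataset", b"version 2")
```

Here `old` and `same` compare equal, while `new` differs from both even though all three share the same `name`.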
thankful-minister-83577
high-accountant-32689
09/23/2022, 12:17 AM
high-accountant-32689
09/23/2022, 9:42 PM
hash_dataset_function with @task). You can test it out by installing flytekit from master or waiting for the next release (which should happen about 1 week from now).