# ask-the-community
hello, does anyone know why i am seeing the `cache was disabled for this task` for a simple python task, when I have the cache enabled:
@task(cache=True, cache_version="1.0")
def simple_python_task(name: str):
    print(f"Hello {name}")
I see this error in datacatalog:
  "json": {},
  "level": "warning",
  "msg": "Dataset does not exist key: {Project:flytetester Name:flyte_task-simple_python_task Domain:development Version:1.0-Y1uUT6Xg-GKw-c0Pw UUID:}, err missing entity of type Dataset with identifier project:\"flytetester\" name:\"flyte_task-.simple_python_task\" domain:\"development\" version:\"1.0-Y1uUT6Xg-GKw-c0Pw\" ",
  "ts": "2023-02-17T19:01:01Z"
The cache is not being written at all.. would appreciate any pointers on how to debug
Looks like a task must have an output to be cached?
That is correct.
Thanks! another question, does "fast registration" have any impact on the cache? like technically a code change in the fast registration could also invalidate the cache
or do we rely on cache_version being updated for that
We rely on the `cache_version` currently. There have been various discussions about using a hash of the function contents to define caching, so that changes in the function could be detected automatically. But frequently enough, users want to fix a minor bug in the function without invalidating previously cached data.
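To make the trade-off above concrete, here is a rough sketch in plain Python. It is not how Flyte or datacatalog actually builds cache keys; `version_key`, `source_key`, and the `greet` sources are hypothetical names invented for illustration. It only shows why a key derived from a user-controlled version survives a minor bug fix, while a key derived from the function contents does not.

```python
import hashlib


def version_key(task_name: str, cache_version: str, inputs: dict) -> str:
    """Hypothetical key: depends on the user-supplied version and the inputs."""
    payload = repr((task_name, cache_version, sorted(inputs.items())))
    return hashlib.sha256(payload.encode()).hexdigest()


def source_key(task_name: str, source: str, inputs: dict) -> str:
    """Hypothetical key: depends on the function source text and the inputs."""
    payload = repr((task_name, source, sorted(inputs.items())))
    return hashlib.sha256(payload.encode()).hexdigest()


# Two revisions of the same (made-up) task body; the second is a minor fix.
SOURCE_V1 = 'def greet(name):\n    return "Hello " + name\n'
SOURCE_V2 = 'def greet(name):\n    return "Hello, " + name\n'

inputs = {"name": "flyte"}

# Version-based key: unchanged until the user bumps the version string.
same_with_version = (
    version_key("greet", "1.0", inputs) == version_key("greet", "1.0", inputs)
)

# Source-based key: any edit to the body produces a different key.
same_with_source = (
    source_key("greet", SOURCE_V1, inputs) == source_key("greet", SOURCE_V2, inputs)
)

print(same_with_version, same_with_source)
```

With the version-based key, bumping `"1.0"` to `"1.1"` is the explicit way to invalidate earlier results; with the source-based key, every edit invalidates them whether the user wants that or not.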
that makes sense..! i was actually hoping for that to be the case :)
determining that automatically is going to be a complex problem, and the user probably has more context on what invalidates the cache
agreed, a community member also recently contributed to enable cache overwrites and cache deletions for a workflow execution. So I think we have most of the scenarios covered 😅
we were also going to look into building something for explicitly purging cache.. but seems like we already have something like that available
take a look at https://github.com/flyteorg/flyte/issues/2867. the cache delete functionality is not fully merged yet, but coming very soon!
There's no auto-purging, but there should be an endpoint on admin to delete the cache for a workflow or node execution.
also some kind of cache invalidation could be useful.. like validating whether the output location actually exists in the underlying storage
Yeah, that could be powerful. There have been a few discussions about cache expirations too - nothing being worked on yet but may be nice to get on the roadmap.
I was wondering if we should throw some kind of warning during compilation if the cache directive has no effect because the task has no output
that sounds like a great idea to me! i think it would have to come from flytekit. cc @Yee @Eduardo Apolinario (eapolinario) thoughts on warning if a user defines a task as cacheable with no outputs (if there are no outputs propeller disables caching).
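A minimal sketch of what such a warning could look like, assuming a simplified stand-in decorator. This is not flytekit's implementation: the `task` decorator and `has_outputs` helper below are hypothetical and only mimic the `cache` / `cache_version` arguments from the snippet earlier in the thread. The idea is to inspect the return annotation at decoration time and warn when caching is requested but no output is declared.

```python
import inspect
import warnings


def has_outputs(fn) -> bool:
    """Return True if fn declares a return annotation other than None."""
    ret = inspect.signature(fn).return_annotation
    return (
        ret is not inspect.Signature.empty
        and ret is not None
        and ret is not type(None)
    )


def task(cache: bool = False, cache_version: str = ""):
    """Hypothetical stand-in for a task decorator; only performs the check."""

    def wrap(fn):
        if cache and not has_outputs(fn):
            warnings.warn(
                f"Task {fn.__name__!r} sets cache=True but declares no outputs; "
                "caching will have no effect.",
                UserWarning,
                stacklevel=2,
            )
        return fn

    return wrap


# Mirrors the task from the original question: cache enabled, no return type.
@task(cache=True, cache_version="1.0")
def simple_python_task(name: str):
    print(f"Hello {name}")


# Declaring an output avoids the warning.
@task(cache=True, cache_version="1.0")
def greeting_task(name: str) -> str:
    return f"Hello {name}"
```

Doing the check at decoration time means the user sees the warning when the module is loaded, well before anything reaches propeller.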
yeah that's something we can add for sure.
i can send a pr
oh beautiful
thank you!