# flyte-support
s
Hello, everyone. I'm curious if anyone has encountered a similar use case:
1. I run a pipeline locally, which fills the cache.
2. I then run the same pipeline in the cloud or on another machine, expecting it to utilize the cache from step 1.

From what I understand, Flyte uses `diskcache`, which relies on SQLite. However, SQLite isn't ideal for simultaneous access from multiple machines and requires additional workarounds. Here are some thoughts I've come up with:
1. Place the SQLite database on an NFS share (not recommended by SQLite developers, but it might work).
2. Implement PostgreSQL support for `diskcache`.
3. Replace `diskcache` with a caching solution that supports PostgreSQL or another database.

None of these solutions are perfect, so I would appreciate any suggestions for a better approach. Thank you!
h
If you really want to, the cache service has an API. You can replace diskcache with a version that uses gRPC to record results in the remote cache (the files will need to be pushed to remote storage, though), and then runs in the cloud will automatically leverage the cache.
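A minimal sketch of that idea, under stated assumptions: the local cache is wrapped so that every write is also recorded in a shared remote store, and local misses fall back to the remote one. Every name here (`CacheBackend`, `InMemoryBackend`, `WriteThroughCache`, `cache_key`) is hypothetical and is not part of Flyte or diskcache. The in-memory backend only stands in for what would be a gRPC client to the cache service plus an upload of the output files.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Dict, Optional, Protocol


class CacheBackend(Protocol):
    """Hypothetical interface; a real remote backend would wrap a gRPC client."""

    def get(self, key: str) -> Optional[bytes]: ...

    def put(self, key: str, value: bytes) -> None: ...


@dataclass
class InMemoryBackend:
    """Stand-in for either the local store or the remote cache service."""

    store: Dict[str, bytes] = field(default_factory=dict)

    def get(self, key: str) -> Optional[bytes]:
        return self.store.get(key)

    def put(self, key: str, value: bytes) -> None:
        self.store[key] = value


@dataclass
class WriteThroughCache:
    """Reads local first, then remote; writes go to both backends."""

    local: CacheBackend
    remote: CacheBackend

    def get(self, key: str) -> Optional[bytes]:
        value = self.local.get(key)
        if value is None:
            value = self.remote.get(key)
            if value is not None:
                # Keep a local copy so the next lookup skips the remote call.
                self.local.put(key, value)
        return value

    def put(self, key: str, value: bytes) -> None:
        self.local.put(key, value)
        self.remote.put(key, value)


def cache_key(task_name: str, cache_version: str, inputs: dict) -> str:
    """Assumed key scheme: a hash of task name, cache version, and inputs."""
    payload = json.dumps([task_name, cache_version, inputs], sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# One shared remote store, two machines with separate local caches.
remote = InMemoryBackend()
laptop = WriteThroughCache(local=InMemoryBackend(), remote=remote)
cloud = WriteThroughCache(local=InMemoryBackend(), remote=remote)

key = cache_key("train_model", "1.0", {"epochs": 3})
laptop.put(key, b"serialized-output")
cloud_result = cloud.get(key)  # served from the shared remote store
```

With this shape, the second machine needs no access to the first machine's SQLite file. It only needs to compute the same key and reach the same remote store.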
s
Thank you, Haytham