# runllm-flyte-trial
# Question

Hello, everyone. I'm curious whether anyone has encountered a similar use case:

1. I run a pipeline locally, which fills the cache.
2. I then run the same pipeline in the cloud or on another machine, expecting it to reuse the cache from step 1.

From what I understand, Flyte uses `diskcache`, which relies on SQLite. However, SQLite isn't well suited to simultaneous access from multiple machines and requires additional workarounds. Here are some options I've considered:

1. Place the SQLite database on an NFS share (not recommended by the SQLite developers, but it might work).
2. Implement PostgreSQL support for `diskcache`.
3. Replace `diskcache` with a caching solution that supports PostgreSQL or another database.

None of these solutions is perfect, so I would appreciate any suggestions for a better approach. Thank you!

# Answer

Flyte's local caching uses `diskcache`, which stores results in a local SQLite database, making it unsuitable for sharing across multiple machines or cloud environments. According to Flyte's documentation, local caching is designed for single-machine use; remote caching is handled differently, typically through cloud object storage such as S3 or GCS.

To share a cache across environments, consider using a remote storage solution for caching: configuring Flyte to use a cloud-based storage backend would allow multiple machines to access the same cache. Implementing PostgreSQL support for `diskcache`, or replacing it with a solution that supports distributed caching, could be complex and may require significant changes to Flyte's caching mechanism.

Would you like more details on configuring remote caching or potential alternatives?

Sources:
- Flyte's caching documentation

Slack Link: https://.slack.com/archives/CP2HDHKE1/p1729261046.022879