I would like to specify cache version in a file | Flyte #flyte-support

jolly-nail-18749

02/19/2024, 9:22 AM

I would like to specify the cache version in a file, similarly to how docker images are specified in config.yaml in the last example on this page: https://docs.flyte.org/en/latest/flytesnacks/examples/customizing_dependencies/multi_images.html. Is this possible? I tried setting the cache version to the docker version like this:

from flytekit import task

# Only container_image is template-expanded; cache_version stays a literal string.
my_config = {
    "cache": True,
    "container_image": "{{.image.my_image.fqn}}:{{.image.my_image.version}}",
    "cache_version": "{{.image.my_image.version}}",
}

@task(**my_config)
def my_task(...):
    ...

but it doesn't work because only the container_image field is evaluated. The cache_version field is simply parsed as a string. Is there a way to propagate the cache version from the config.yaml file?

thankful-minister-83577

02/21/2024, 3:14 AM

i’d avoid a config file if you can

thankful-minister-83577

02/21/2024, 3:14 AM

and i think it makes more sense to not tie it to the docker version tbh.

thankful-minister-83577

02/21/2024, 3:14 AM

since your code is likely to change faster than the docker image.

thankful-minister-83577

02/21/2024, 3:15 AM

what’s the objective here?

thankful-minister-83577

02/21/2024, 3:17 AM

i believe it’s possible to add this in, since the image version is known pretty early in the compilation cycle, but what are you saying? every time the image changes you want to invalidate cache?

jolly-nail-18749

02/21/2024, 8:43 AM

I want to have the option to invalidate and thereby regenerate the cache without manually changing the cache_version string in the code (for nightly runs, integration tests, etc.). Preferably, it shouldn't be tied to the docker version, but it would be nice to be able to specify it in a similar manner.

thankful-minister-83577

02/21/2024, 6:43 PM

that doesn’t exist at the pyflyte layer.

thankful-minister-83577

02/21/2024, 6:43 PM

it just feels a bit weird to add it as a core feature i feel.

thankful-minister-83577

02/21/2024, 6:44 PM

a possible workaround would be to set the cache version to a variable somewhere controlled by an environment variable.

thankful-minister-83577

02/21/2024, 6:44 PM

this will give you the control you want without introducing anything too heavy-handed
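
A minimal sketch of the workaround suggested above, assuming the cache version is read from an environment variable when the task module is imported. The variable name MY_CACHE_VERSION and the helper resolve_cache_version are hypothetical and not part of flytekit. Since the cache version is part of the task definition, the variable most likely needs to be set in the environment where the task is registered or launched (for example, the CI job), not inside the task container.

```python
import os

# Hypothetical variable name; any name works as long as the CI job sets it.
CACHE_VERSION_ENV_VAR = "MY_CACHE_VERSION"


def resolve_cache_version(default: str = "1.0") -> str:
    """Return the cache version from the environment, or the default if unset or blank."""
    value = os.environ.get(CACHE_VERSION_ENV_VAR, "").strip()
    return value or default


# Evaluated once, when the module is imported.
CACHE_VERSION = resolve_cache_version()

# Usage with flytekit (commented out so the sketch runs without flytekit installed):
#
# from flytekit import task
#
# @task(cache=True, cache_version=CACHE_VERSION)
# def my_task(x: int) -> int:
#     return x + 1
```

A nightly job could then export MY_CACHE_VERSION with a date-based value before registering or running, while local runs fall back to the default.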

jolly-nail-18749

02/22/2024, 11:45 AM

I think the workaround you suggest will work, and I fully agree that the cache version shouldn't be tied to the docker version. But I still think it's desirable to enable convenient injection of the cache version into a run. Caches will be invalidated every now and then, and being able to trigger automatic cache regeneration is critical when sharing a cache in a larger team, imo.

thankful-minister-83577

02/22/2024, 5:16 PM

you can already re-run a cache with the same version.

thankful-minister-83577

02/22/2024, 5:16 PM

you just can’t change the version string.
