1. There is `flytekitplugins-envd` for envd-build...
# ask-the-community
d
1. There is `flytekitplugins-envd` for envd-building; is there an equivalent for Dockerfile building?
a. Specifically, while envd seems nice, we have lots of preexisting Dockerfiles that we would need to translate to envd format, or the Dockerfiles are so complicated that we would not be able to recreate them with the flyte-envd API.
y
k
You should be able to convert a Dockerfile to ImageSpec in most cases. Would you mind sharing one of your Dockerfiles?
d
Re: Yichen -> No, I mean on the ImageSpec side. Ideally we would like to leverage remote builds to build images on the fly (with a cache) to enable a quicker development cycle.
Re: Kevin -> I can't share exactly what we use, but...
• (1) We sometimes bake important files that are necessary for our simulations into our images.
◦ Think ML models, but also more complicated than that.
◦ Another example: some programs build a database on first run or have a long init step, which is something you would want to do at image build, not at container start. I don't see a good way to do that with envd.
• Regarding a fix for (1): the envd plugin doesn't expose an io.copy interface that would solve this.
• (2) envd seems to only support "--find-links" through a requirements.txt file. This means you can't install jax via ->
pip install --upgrade "jax[cuda12_local]" -f <https://storage.googleapis.com/jax-releases/jax_cuda_releases.html>
◦ Caveat: I tried to test this with envd, but the envd demo fails for me with:
FATA[2023-10-04T06:20:27-04:00] failed to start the envd environment: failed to get the graph from the image: failed to get runtime graph label from image: envd-quick-start:dev
• (3) Although this seems like a nice Python-based way to define environments, one of the biggest problems we have with biology-related software (we are a biotech company) is that new software coming out of academia almost always has very unique environment requirements. Maybe a Perl script is randomly called in the middle of a Python script, maybe some hardcoded file path exists; it goes on and on (we've seen it all).
◦ That is to say, we already solve this with Docker. Our scientists already spend too much time battling this stuff, and I cannot see us translating the 100+ Dockerfiles we have into this Flyte-specific code and learning the unique quirks of envd.
We would like to use remote builds specifically, because some Docker images cannot be built on a Mac, and some images end up being quite large, so pushing them over home connections can be very slow.
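For point (2), here is a minimal sketch of the requirements.txt route mentioned above. pip's requirements file format accepts a `--find-links` option line, so the jax wheel index can be declared there instead of on the command line. The helper names `render_requirements` and `write_requirements` are hypothetical and exist only for illustration. Whether envd or ImageSpec then picks the file up as expected is assumed from the thread, not verified here. Note that `--upgrade` is an install-time flag and cannot be expressed inside the file.

```python
import tempfile
from pathlib import Path

# URL copied from the pip command quoted in the thread.
JAX_FIND_LINKS = "https://storage.googleapis.com/jax-releases/jax_cuda_releases.html"


def render_requirements(packages, find_links=()):
    """Return requirements.txt text (hypothetical helper).

    Emits one `--find-links <url>` option line per URL, followed by
    one requirement specifier per line.
    """
    lines = [f"--find-links {url}" for url in find_links]
    lines.extend(packages)
    return "\n".join(lines) + "\n"


def write_requirements(path, packages, find_links=()):
    """Write the rendered text to `path` and return it as a Path (hypothetical helper)."""
    path = Path(path)
    path.write_text(render_requirements(packages, find_links))
    return path


if __name__ == "__main__":
    # Write into a throwaway directory so the example has no side effects
    # on the current project.
    target = Path(tempfile.mkdtemp()) / "requirements.txt"
    write_requirements(target, ["jax[cuda12_local]"], [JAX_FIND_LINKS])
    print(target.read_text(), end="")
```

The resulting file could then be installed with `pip install --upgrade -r requirements.txt`, or handed to whatever image-build tool accepts a requirements file.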