# ecosystem-unionml
Hi, unionml community! I'm trying to run the MNIST: Digits Classification Tutorial in my local Flyte cluster, but I'm running into issues. Specifically, I hope to train, run batch inference, and serve the model via FastAPI within my local Flyte Sandbox. I followed these steps: 1. I set up a local sandbox environment by running
flytectl demo start
, 2. I ensured my
is pointed at my local environment. 3. I started a unionml project by running
unionml init mnist
and navigating into the directory. 4. I ran
unionml deploy app:model
to deploy a Model object named "model" in the app.py file within the current working directory. 5. I ran
unionml train app:model
to train the model. When I deploy the model, I see these logs:
[unionml] deploying app:model
2022-09-04 19:09:54,211 unionml INFO Building docker container in flyte demo cluster.
2022-09-04 19:09:54,862 unionml INFO unable to prepare context: unable to evaluate symlinks in Dockerfile path: lstat /root/Dockerfile: no such file or directory
2022-09-04 19:09:54,895 unionml INFO Deploying workflow digits_classifier.train
2022-09-04 19:09:55,230 unionml INFO Deploying workflow digits_classifier.predict
2022-09-04 19:09:55,294 unionml INFO Deploying workflow digits_classifier.predict_from_features
When I train the model, I see that a workflow is triggered in the sandbox Flyte Console, but the workflow fails on the
node. The logs for that task are:
[1/1] currentAttempt done. Last Error: USER::containers with unready status: [ffa8038a9524b4920a61-n0-0]|Back-off pulling image "unionml:digits-classifier-a31c8bb4d7f235f31b25a3449e3dcc0c1f82e9be"
It looks like I'm getting an error when pulling the image, but I'm not sure what I'm doing wrong 😅. Could you advise on how I can get this working?
So the image is built locally but is not available to the demo Kubernetes cluster
A solution is to use the --source flag to mount the source into the demo environment, so that the container built inside the demo cluster is accessible
Will share more details later unless cc @Samhita Alla / @Kevin Su can help
yeah, you can build the image inside the container
flytectl demo start --source .  # run this in the mnist directory
flytectl demo exec -- docker build -t mnist:v1 .
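For reference, the suggestions in this thread can be strung together into one loop. This is only a rough sketch, not a confirmed UnionML workflow: it reuses the exact commands quoted above, the `run` and `sandbox_workflow` helper names and the `DRY_RUN` switch are made up for illustration, and the manual `docker build` step may be redundant, since the deploy logs show `unionml deploy` attempting the build inside the demo cluster itself.

```shell
#!/bin/sh
# Sketch only: helper names and DRY_RUN are illustrative assumptions.
set -eu

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

sandbox_workflow() {
  # Mount the current directory (the mnist project) into the sandbox,
  # so the Dockerfile is visible from inside the demo cluster.
  run flytectl demo start --source .
  # Build the image inside the sandbox; the mnist:v1 tag is the one quoted
  # in the thread. Possibly unnecessary if unionml deploy builds it.
  run flytectl demo exec -- docker build -t mnist:v1 .
  # Register the workflows, then trigger training.
  run unionml deploy app:model
  run unionml train app:model
}
```

Calling `DRY_RUN=1 sandbox_workflow` from the mnist directory prints the four commands without running anything, which is a cheap way to check the order before touching the cluster.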
@Kevin Su, I don’t think the
docker build …
command is required cause it’s automatically handled by
Question: is there some way I can shorten iteration cycles when writing the code? The flow I was using to make changes was 1. Modify the app.py module. 2. Commit changes to git. 3.
unionml deploy app:model
Step three takes 5-10 minutes, even for very small code changes. Is there some way I can shorten this to <1 minute? I would assume that Docker would cache pre-built and unchanged layers, but it looks like that's not happening.
Yes, this can use fast registration, like pyflyte run
Have you tried that, I think unionml can do the same
@Niels Bantilan
@Ryan Delgado sorry i was away from the computer. So basically Flyte has a fast mode - called Fast register. this essentially skips building the image and simply moves the code artifact directly into the container. UnionML can do the same. If this is a blocker let me know. I do not think this is hard to do. cc @Eduardo Apolinario (eapolinario) / @Niels Bantilan
if you want to check out the experience in Flyte - here is the link.
hi @Ryan Delgado, yes so if you follow the Deploying to Flyte guide, you’ll see instructions on how to build the docker image from within the local flyte cluster
Cool. I'll take a look and reply if I have other questions!
also, re: fast registration, I made an issue to track this: https://github.com/unionai-oss/unionml/issues/163
also one quick question @Ryan Delgado: for the 5-10 minutes re-deployment, was that on a
flytectl demo
cluster, or a production cluster?
this is weird, that rebuilding the container took this long
I'm happy to provide any diagnostic logs if it's helpful.
Ohh ya
Would love to jump on a call to help debug
It should not be reinstalling any dependencies, so rebuilding the container is just simply copying the code
Fast registration is still faster especially if you’re working with the remote Flyte backend
I'm free tomorrow after 3 PM ET
Sure - also we are working on the fast register for unionml
Thank you for raising it
You're so welcome! If you want, I can send a Zoom, just DM me your email
Should be good after 3