Hi! I saw on the website that experiment tracking and data lineage are on the way. Is there anywhere this progress can be tracked? I’m working at a company that’s just switching from determined.ai to Flyte for orchestration. However, this means we are now missing three important pieces: experiment tracking, a model registry, and data versioning. I'm wondering if union.ml could be a good fit, if it’s not too far away on the horizon?
Hi, unionml community! I'm trying to run the MNIST: Digits Classification Tutorial in my local Flyte cluster, but I'm running into issues. Specifically, I hope to train, run batch inference, and serve the model via FastAPI within my local Flyte Sandbox. I followed these steps:
1. I set up a local sandbox environment by running
flytectl demo start
2. I ensured my
is pointed at my local environment.
3. I started a unionml project by running
unionml init mnist
and navigating into the directory.
4. I ran
unionml deploy app:model
to deploy a Model object named "model" in the app.py file within the current working directory.
5. I ran
unionml train app:model
to train the model.
When I deploy the model, I see these logs:
[unionml] deploying app:model
2022-09-04 19:09:54,211 unionml INFO Building docker container in flyte demo cluster.
2022-09-04 19:09:54,862 unionml INFO unable to prepare context: unable to evaluate symlinks in Dockerfile path: lstat /root/Dockerfile: no such file or directory
2022-09-04 19:09:54,895 unionml INFO Deploying workflow digits_classifier.train
2022-09-04 19:09:55,230 unionml INFO Deploying workflow digits_classifier.predict
2022-09-04 19:09:55,294 unionml INFO Deploying workflow digits_classifier.predict_from_features
When I train the model, I see that a workflow is triggered in the sandbox Flyte Console, but the workflow fails on the
node. The logs for that task are:
[1/1] currentAttempt done. Last Error: USER::containers with unready status: [ffa8038a9524b4920a61-n0-0]|Back-off pulling image "unionml:digits-classifier-a31c8bb4d7f235f31b25a3449e3dcc0c1f82e9be"
It looks like I'm getting an error pulling an image, but I'm not sure what I'm doing wrong 😅. Could you advise on how I can get this working?
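In case it helps anyone reading along, here is a rough sketch of how the two logs above can be read together. It only uses the log lines already pasted in this message. The helper names (`find_build_failures`, `find_unpullable_image`) and the marker strings are made up for illustration and are not part of unionml or Flyte. The link between the failed build and the image pull back-off is an assumption, not something the logs confirm.

```python
import re
from typing import List, Optional

# Log lines copied verbatim from the deploy output and the failed task above.
DEPLOY_LOG = """\
2022-09-04 19:09:54,211 unionml INFO Building docker container in flyte demo cluster.
2022-09-04 19:09:54,862 unionml INFO unable to prepare context: unable to evaluate symlinks in Dockerfile path: lstat /root/Dockerfile: no such file or directory
2022-09-04 19:09:54,895 unionml INFO Deploying workflow digits_classifier.train
2022-09-04 19:09:55,230 unionml INFO Deploying workflow digits_classifier.predict
2022-09-04 19:09:55,294 unionml INFO Deploying workflow digits_classifier.predict_from_features
"""

TASK_ERROR = (
    "USER::containers with unready status: [ffa8038a9524b4920a61-n0-0]|"
    'Back-off pulling image "unionml:digits-classifier-a31c8bb4d7f235f31b25a3449e3dcc0c1f82e9be"'
)


def find_build_failures(deploy_log: str) -> List[str]:
    """Return deploy log lines that look like a failed Docker build.

    The marker strings are a heuristic chosen from the log above (hypothetical).
    """
    markers = ("unable to prepare context", "no such file or directory")
    return [
        line
        for line in deploy_log.splitlines()
        if any(marker in line for marker in markers)
    ]


def find_unpullable_image(task_error: str) -> Optional[str]:
    """Extract the image name from a 'Back-off pulling image' message, if present."""
    match = re.search(r'Back-off pulling image "([^"]+)"', task_error)
    return match.group(1) if match else None


failures = find_build_failures(DEPLOY_LOG)
image = find_unpullable_image(TASK_ERROR)

if failures and image:
    # Assumption: the pull fails because the build step never produced the image.
    print(f"Image {image!r} could not be pulled; possible cause in the deploy log:")
    for line in failures:
        print("  " + line)
```

If that reading is right, the thing to look at first would be the build step during deploy (the missing `/root/Dockerfile`) rather than the train step.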
2 weeks ago
When I execute a Flyte workflow, do the nodes in that workflow run in separate Kubernetes pods? If so, how can I adjust the amount of CPU/memory I allocate to a particular Kubernetes pod?
2 weeks ago
<!here> hey all! 👋 I just created this proposal for a