Hey guys how do i add new packages to the project in flyte i Flyte #flyte-support

full-market-79972

09/01/2023, 9:56 AM

Hey guys, how do I add new packages to a project in Flyte when it is running in an AWS deployment? For example, I created a project and ran a workflow using the pyflyte command, and it ran properly. Now I want to add a new library, let's say opencv, to the task and workflow. Where do I install the opencv library? I installed it with pip, but when I ran the workflow it said package not found. Then I figured out the project is running inside a Docker image. How does a user add the library to the image, or is there any other way to fix this problem?

tall-lock-23197

09/01/2023, 11:00 AM

You can use image spec to specify custom dependencies. https://docs.flyte.org/projects/cookbook/en/latest/auto_examples/image_spec/image_spec.html

tall-lock-23197

09/01/2023, 11:01 AM

Or you can write a Dockerfile, build an image, and pass it to the relevant pyflyte command.

tall-lock-23197

09/01/2023, 11:01 AM

https://docs.flyte.org/projects/cookbook/en/latest/getting_started/package_register.html#custom-dependencies

full-market-79972

09/01/2023, 12:25 PM

When I use image spec and run
    pyflyte run --remote --image imagespec.yaml new.py standard_scale_workflow --values '[1.0, 2.0, 3.0, 4.0, 5.0]'
I get the error below:
    Failed with Unknown Exception <class 'Exception'> Reason: Builder envd is not registered.
    Builder envd is not registered.
Can you please let me know how to resolve it?

tall-lock-23197

09/01/2023, 1:25 PM

You need to pip install flytekitplugins-envd.

full-market-79972

09/01/2023, 6:23 PM

Yes, I installed it, but the error still persists.

tall-lock-23197

09/03/2023, 10:23 AM

You shouldn't be seeing that error if you install the plugin. You can also specify imagespec in the Python file itself. Can you try that way?
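
One way to narrow down a persistent `Builder envd is not registered` error is to confirm that the plugin is installed in the same Python environment that provides the `pyflyte` command. A minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of flytekit:

```python
import sys
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Run this with the same interpreter that provides the pyflyte command.
    print("interpreter:", sys.executable)
    for name in ("flytekit", "flytekitplugins-envd"):
        print(name, "->", installed_version(name) or "NOT INSTALLED")
```

If the plugin shows as not installed here while `pip` reports it as installed, the two commands are probably pointing at different environments.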

full-market-79972

09/04/2023, 6:53 AM

    cv2_image_spec = ImageSpec(
        base_image="cr.flyte.org/flyteorg/flytekit:py3.10-1.9.0",
        packages=["opencv-python"],
        env={"Debug": "True"},
    )
    if cv2_image_spec.is_container():
        import cv2
    @task
    def mean(values: List[float]) -> float:
        print(cv2.version)
        return sum(values) / len(values)
This is my code, and I am running the command below:
    pyflyte run --remote new.py standard_scale_workflow --values '[1.0, 2.0, 3.0, 4.0, 5.0]'
I get the error given below:
    Failed with Unknown Exception <class 'ModuleNotFoundError'> Reason: No module named 'cv2'
    No module named 'cv2'
Please help me out.

full-market-79972

09/04/2023, 7:18 AM

Or can you tell me what exactly needs to be added in imagespec.yaml?
    # imageSpec.yaml
    python_version: 3.10
    registry: pingsutw
    packages:
      - sklearn
      - opencv-python
    env:
      Debug: "True"
What do I put in place of registry? Should I leave it the same?

tall-lock-23197

09/04/2023, 11:45 AM

Regarding No module named 'cv2', can you install opencv in your local environment?

full-market-79972

09/05/2023, 6:54 AM

Even with opencv installed in my local env, the module is still not found when I run pyflyte run --remote.

full-market-79972

09/05/2023, 9:30 AM

I tried creating a new conda env and installed flytekitplugins-envd. Now when I run
    pyflyte run --remote --image imagespec.yaml new.py standard_scale_workflow --values '[1.0, 2.0, 3.0, 4.0, 5.0]'
it builds a Docker image, but I get this error:
    Failed with Unknown Exception <class 'Exception'> Reason: failed to run command envd build --path /tmp/flyte-s5m5ghk_/sandbox/local_flytekit/e3c3143d14bce47e61050188b2791fa7 --platform linux/amd64 --output type=image,name=pingsutw/flytekit:t93nYMZ9tvO68GDt0g1xRg..,push=true with error b'time="2023-09-05T09:25:44Z" level=fatal msg="failed to create the builder: failed to create buildkit client: failed to bootstrap the buildkitd: failed to create container: Error response from daemon: invalid mount config for type \\"bind\\": bind source path does not exist: /home/ngupta/.config/envd"\n'
    failed to run command envd build --path /tmp/flyte-s5m5ghk_/sandbox/local_flytekit/e3c3143d14bce47e61050188b2791fa7 --platform linux/amd64 --output type=image,name=pingsutw/flytekit:t93nYMZ9tvO68GDt0g1xRg..,push=true with error b'time="2023-09-05T09:25:44Z" level=fatal msg="failed to create the builder: failed to create buildkit client: failed to bootstrap the buildkitd: failed to create container: Error response from daemon: invalid mount config for type \\"bind\\": bind source path does not exist: /home/ngupta/.config/envd"\n'
I guess if you help me out with this, my work will be done.
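
The error says the Docker bind-mount source `/home/ngupta/.config/envd` does not exist. A hedged workaround sketch, assuming (this is not confirmed anywhere in the thread) that creating the missing directory before rerunning the build is enough; `ensure_dir` is a hypothetical helper:

```python
from pathlib import Path


def ensure_dir(path: Path) -> bool:
    """Create the directory if it is missing. Return True if it was created."""
    if path.is_dir():
        return False
    path.mkdir(parents=True, exist_ok=True)
    return True


if __name__ == "__main__":
    # Assumption: envd expects its config directory at ~/.config/envd,
    # as the path in the error message suggests.
    envd_config = Path.home() / ".config" / "envd"
    created = ensure_dir(envd_config)
    print(f"{envd_config}: {'created' if created else 'already present'}")
```

If the build still fails after the directory exists, the Docker daemon may be running on a different host or filesystem than the one where the path was created.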

tall-lock-23197

09/05/2023, 1:03 PM

This looks more like a Docker-related issue to me.

tall-lock-23197

09/05/2023, 1:03 PM

https://github.com/docker/docs/issues/4709

full-market-79972

09/06/2023, 11:31 AM

Can you provide any links to tutorials where I can run actual ML pipelines (downloading the data, preprocessing, training and evaluating)? I would also like to know about the data types for TF models, torch models, etc. in task return values.

tall-lock-23197

09/06/2023, 11:42 AM

https://docs.flyte.org/projects/cookbook/en/latest/ml_training.html -- here are a couple of them. You should be able to return PyTorch modules: https://docs.flyte.org/projects/cookbook/en/latest/auto_examples/data_types_and_io/pytorch_types.html. You should also be able to return TensorFlow models: https://github.com/flyteorg/flytekit/blob/82b409bc91377d3ae14f909c819ab3885f2d3a1d/tests/flytekit/unit/extras/tensorflow/model/test_model.py

full-market-79972

09/07/2023, 6:32 AM

Any tutorial link I can get?

tall-lock-23197

09/07/2023, 6:41 AM

I just shared the link above.

full-market-79972

09/07/2023, 9:50 AM

thanks

full-market-79972

09/07/2023, 9:51 AM

    [1/1] currentAttempt done. Last Error: USER::Pod failed. No message received from kubernetes.
    [f6d65c22f831647dca1e-n0-0] terminated with exit code (247). Reason [OOMKilled]. Message:
    tar: Removing leading `/' from member names
Do you have any idea why I get this message while running a task? The task downloads the MNIST data, and it is failing.

full-market-79972

09/07/2023, 10:09 AM

I know it is because of the resource limit, but I have 900GB of memory. How do I set the memory for a project?

tall-lock-23197

09/07/2023, 10:27 AM

Here's how you can set it: https://flyte-org.slack.com/archives/CP2HDHKE1/p1687451749194059?thread_ts=1687422709.004429&cid=CP2HDHKE1

full-market-79972

09/07/2023, 10:49 AM

    [1/1] currentAttempt done. Last Error: UNKNOWN::Outputs not generated by task execution
I am getting the above error and don't know why. The output is a tuple of numpy arrays.

tall-lock-23197

09/07/2023, 1:13 PM

Oh. Can you share the code?

full-market-79972

09/07/2023, 1:32 PM

Never mind, I fixed it, thanks. Is there any way I can mount a PVC to a task? I have a task that creates a model, and I want to save it so that I can access it in another task. I am on AWS and running Flyte inside it.

full-market-79972

09/07/2023, 2:17 PM

If you could let me know how a PVC can be attached to Flyte tasks, it would be really helpful.

tall-lock-23197

09/08/2023, 10:16 AM

You can use pod template or the pod plugin.

full-market-79972

09/13/2023, 11:13 AM

Hey Samitha, may I know how to pass a private image when running a pyflyte run command? Below is the command I am running right now, which uses a public image:
    pyflyte run --remote --project flytetester --domain development --image dkubex123/my_flyte_image:latest mnist.py mnist_workflow
If dkubex123/my_flyte_image:latest were private, how would I run the command? How do I pass the registry username and password, or is there any other option?

tall-lock-23197

09/13/2023, 12:35 PM

You need to configure image pull secrets: https://docs.flyte.org/projects/cookbook/en/latest/auto_examples/development_lifecycle/private_images.html#private-images.

full-market-79972

09/14/2023, 5:04 AM

Once I create a secret, where do I configure it? I mean, on any specific pod?

tall-lock-23197

09/14/2023, 5:16 AM

You will need to configure the secrets in the backend as mentioned in the guide. Aren't you able to?

full-market-79972

09/14/2023, 5:21 AM

The guide does not mention where to configure it. Can you guide me, please?

tall-lock-23197

09/14/2023, 5:30 AM

https://docs.flyte.org/projects/cookbook/en/latest/auto_examples/development_lifecycle/private_images.html#configure-imagepullsecrets You will need to add imagepullsecrets to the default or a custom service account and use the same while triggering an execution. Or you can add them to the pod template. Here's an example: https://flyte-org.slack.com/archives/CP2HDHKE1/p1687941857393889 (this doesn't have imagepullsecrets though)

full-market-79972

09/14/2023, 5:52 AM

    db-pass                       Opaque               1   12d
    flyte-admin-secrets           Opaque               4   12d
    flyte-pod-webhook             Opaque               3   12d
    flyte-secret-auth             Opaque               1   12d
    sh.helm.release.v1.flyte.v1   helm.sh/release.v1   1   12d
These are the secrets in the flyte namespace. Do you think adding the Docker Hub user and secret to any one of them will get the work done?

tall-lock-23197

09/14/2023, 5:52 AM

What secrets do you have in the flytesnacks namespace?

full-market-79972

09/14/2023, 12:30 PM

I figured it out. I have another doubt: I configured CloudWatch logs in AWS, and when a task runs successfully and I click the logs option in the Flyte UI, I get redirected to the AWS CloudWatch console, but I get the errors below:
• There was an error getting log events.
• The specified log stream does not exist.
I tried multiple setups and the issue still persists. Please help.

tall-lock-23197

09/14/2023, 1:39 PM

Did you double check the template URI?

full-market-79972

09/15/2023, 4:39 AM

What do you want me to check exactly? It asks me to provide the log group name:
    userSettings:
      accountNumber:
      accountRegion:
      dbPassword:
      rdsHost:
      bucketName:
      logGroup:
In logGroup I give the AWS CloudWatch log group name:
    task_logs:
      plugins:
        logs:
          kubernetes-enabled: false
          # -- One option is to enable cloudwatch logging for EKS, update the region and log group accordingly
          # You can even disable this
          cloudwatch-enabled: true
          # -- region where logs are hosted
          cloudwatch-region: "{{ .Values.userSettings.accountRegion }}"
          # -- cloudwatch log-group
          cloudwatch-log-group: "{{ .Values.userSettings.logGroup }}"

tall-lock-23197

09/15/2023, 7:35 AM

I meant this: https://docs.flyte.org/projects/cookbook/en/latest/auto_examples/productionizing/configure_logging_links.html#how-to-configure
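
A "log stream does not exist" error usually means the stream name in the generated link differs from the name your log shipper actually writes. The sketch below renders a template URI locally so the result can be compared with the stream names shown in CloudWatch. It assumes `{{ .podName }}`-style placeholders as described in the linked guide; the template string, region, log group and values are all hypothetical:

```python
import re
from typing import Dict

# Hypothetical template; compare it with the one in your own configuration.
TEMPLATE = (
    "https://console.aws.amazon.com/cloudwatch/home?region=us-west-2"
    "#logEventViewer:group=/my/log-group;"
    "stream=var.log.containers."
    "{{ .podName }}_{{ .namespace }}_{{ .containerName }}-{{ .containerId }}.log"
)


def render(template: str, values: Dict[str, str]) -> str:
    """Replace every {{ .name }} placeholder with the matching entry in values."""

    def substitute(match: "re.Match[str]") -> str:
        return values[match.group(1)]

    return re.sub(r"\{\{\s*\.(\w+)\s*\}\}", substitute, template)


if __name__ == "__main__":
    # Example values; take the real ones from `kubectl describe pod`.
    print(
        render(
            TEMPLATE,
            {
                "podName": "f6d65c22f831647dca1e-n0-0",
                "namespace": "flytesnacks-development",
                "containerName": "f6d65c22f831647dca1e-n0-0",
                "containerId": "0123abcd",
            },
        )
    )
```

If the rendered stream name does not match any stream in the log group, the template URI (or the log shipper's naming scheme) is the thing to adjust.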

full-market-79972

09/15/2023, 11:31 AM

Samitha, is there any way I can change the task limits in the configmap dynamically from the task?
    task_resource_defaults.yaml: |
      task_resources:
        defaults:
          cpu: 1000m
          memory: 15Gi
          storage: 15Gi
        limits:
          cpu: 2
          gpu: 1
          memory: 1Gi
          storage: 20Gi
I want to set the memory here inside the configmap dynamically.

tall-lock-23197

09/15/2023, 1:52 PM

I don't think that's possible. But you can set limits at the task level.

full-market-79972

09/19/2023, 1:36 PM

Hi, how do I specify a package from an image for a workflow? I have created a file and I am using it for tasks, but how do I do it for workflows?

tall-lock-23197

09/20/2023, 5:27 AM

Workflow is a DSL. What do you want to do exactly?

full-market-79972

09/27/2023, 7:23 AM

Hey Samitha (@tall-lock-23197), any idea why I am getting the error below when I run pyflyte inside an AWS cluster?
    RPC Failed, with Status: StatusCode.INTERNAL
    details: failed to create a signed url. Error: WebIdentityErr: failed to retrieve credentials caused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity status code: 403, request id: 2c17015d-85be-4570-8e30-a8be2b75be3f
    Debug string UNKNOWN:Error received from peer ipv410.100.178.20881 {created_time:"2023-09-27T07:20:19.126215328+00:00", grpc_status:13, grpc_message:"failed to create a signed url. Error: WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: 2c17015d-85be-4570-8e30-a8be2b75be3f"}

tall-lock-23197

09/27/2023, 8:34 AM

Looks like an AWS permissions issue

full-market-79972

09/27/2023, 10:14 AM

do you think it can be an issue with s3 bucket?

tall-lock-23197

09/27/2023, 11:01 AM

Yeah, I believe so.

tall-lock-23197

09/27/2023, 11:01 AM

It'll have to do with the roles you assigned.

full-market-79972

09/28/2023, 10:45 AM

@tall-lock-23197, how do we actually view the output of a workflow? Inside the project and the workflow I can see the tasks and their outputs, but my workflow also returns some values. How do I see those returned values in the Flyte console?
    @workflow
    def optimize_model():
        best_accuracy = optimize_hyp()
        best_params = {"n_estimators": 10, "max_depth": 5, "min_samples_split": 0.2}  # Set to 0 as it's not used in this example
        final_model_accuracy = train_model(n_estimators=best_params["n_estimators"], max_depth=best_params["max_depth"], min_samples_split=best_params["min_samples_split"])
        return best_params, best_accuracy, final_model_accuracy

tall-lock-23197

09/28/2023, 10:54 AM

You can view the outputs of a workflow in the "View inputs and outputs" link present in the navigation bar.

full-market-79972

09/29/2023, 9:43 AM

I get UNKNOWN status for the tasks, and I am unable to debug the issue. Can you please let me know when and why tasks go into UNKNOWN status? @tall-lock-23197

tall-lock-23197

09/29/2023, 11:23 AM

Have you checked the propeller and admin logs?

full-market-79972

09/29/2023, 12:53 PM

Yes, it happens when I enable ray in plugins.

tall-lock-23197

09/29/2023, 1:05 PM

Are head and worker nodes spinning up?

full-market-79972

10/02/2023, 7:14 AM

No, they are not, but I am using plain Ray instead of the Flyte Ray plugin. Anyhow, how do I make sure a task inside a workflow starts only after another task has finished? I want the output of one task to go into another. Is there any way I can do that?

full-market-79972

10/02/2023, 7:15 AM

Thanks

full-market-79972

10/02/2023, 8:56 AM

I fixed it, thanks. But how do I convert Promise(node:n1.o0) to str? A task returns a string, and I can see it in the console, but when I call the task in the workflow it returns Promise(node:n1.o0). Below is the code; can you help me fix it? This is the workflow, and run_id is what gives me the promise instead of a str:
    @workflow
    def optimize_model() -> Tuple[float, str, List[int]]:
        best_accuracy, best_params = optimize_hyp()
        # Train the best model with best hyperparameters
        run_id = train_best_model(best_params=best_params)
        print(run_id)
        # Specify the MLflow model URI
        model_uri = f"runs:/a4519e1f7e00433886646a2bfb51600f/best_random_forest_model"
        # Sample data for inference (for the Iris dataset)
        data = [[5.1, 3.5, 1.4, 0.2]]
        # Perform inference using the ray_inference task
        inference_results = ray_inference(model_uri=model_uri, data=data)
        return best_accuracy, run_id, inference_results
Below is the task code for run_id:
    @task(requests=Resources(cpu="2", mem="1Gi"))
    def train_best_model(best_params: Dict[str, Any]) -> str:
        import mlflow
        import mlflow.sklearn
        from sklearn.datasets import load_iris
        from sklearn.ensemble import RandomForestClassifier
        import numpy as np
        iris = load_iris()
        X, y = iris.data, iris.target
        # Initialize and train a Random Forest Classifier with the best hyperparameters
        clf = RandomForestClassifier(n_estimators=best_params["n_estimators"], max_depth=best_params["max_depth"], min_samples_split=best_params["min_samples_split"], random_state=42)
        accuracy = model_accuracy(n_estimators=best_params["n_estimators"], max_depth=best_params["max_depth"], min_samples_split=best_params["min_samples_split"])
        # Fit the model
        clf.fit(X, y)
        # Log the model to MLflow
        with mlflow.start_run(run_name="BestRandomForestModel") as run:
            mlflow.sklearn.log_model(clf, "best_random_forest_model")
            mlflow.log_metric("accuracy", accuracy)
            run_id = mlflow.active_run().info.run_id
        return str(run_id)

tall-lock-23197

10/02/2023, 8:57 AM

Task outputs in a workflow are promises. You need to send them to another task to materialize the promises.

full-market-79972

10/02/2023, 9:22 AM

Is there any example of how we can do that? Can you refer me to a link?

full-market-79972

10/02/2023, 10:12 AM

I fixed it thanks

full-market-79972

10/02/2023, 3:26 PM

Hey @tall-lock-23197, is there any way we can add resource values for the task section in flyte-binary? I see that in flyte-core we can configure task resources like storage, memory, CPU and GPU.

tall-lock-23197

10/03/2023, 4:57 AM

You can add a task_resources section here:
    task_resources:
      defaults:
        cpu: 1
        memory: 4Gi
        storage: 5Gi
      limits:
        cpu: 16
        memory: 16Gi
        storage: 20Gi

full-market-79972

10/05/2023, 7:28 AM

thanks

full-market-79972

10/12/2023, 4:47 PM

Hey @tall-lock-23197, for the code below I get an error in Flyte:
    import flytekit
    from flytekit import task, workflow, Resources
    from typing import List, Tuple
    @task(requests=Resources(gpu="1", cpu="2", mem="1Gi"), container_image="822795565729.dkr.ecr.us-west-2.amazonaws.com/prime-analysis:prime-analysis-reksi-flyte_pipeline-0.0.10")
    def sleep():
        import time
        time.sleep(3600)
    @workflow
    def toy_workflow():
        sleep()
I run it using
    pyflyte run --remote --project cpa-test toy_pipeline.py toy_workflow
The error is
    /opt/nvidia/nvidia_entrypoint.sh: line 67: exec: pyflyte-fast-execute: not found
I installed flyte and related packages. pyflyte-fast-execute is in PATH, but it still fails. I am not sure how to debug it; the pod fails to start. Can you please help me?

full-market-79972

10/12/2023, 5:05 PM

Could it be because of the GPUs?

tall-lock-23197

10/13/2023, 7:17 AM

This error usually crops up when the architecture the image is built on isn't the same as the architecture that a container is spun up on. Can you use buildkit to specify the architecture while building your Docker image? You could also use image spec to simplify this process.
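
To check for an architecture mismatch, a small standard-library sketch that reports the current machine as a Docker platform string. The helper `docker_platform` and its alias table are written for this example and cover only the two common 64-bit architectures:

```python
import platform


def docker_platform(machine: str) -> str:
    """Map a machine name (as from platform.machine()) to a Docker platform string."""
    aliases = {
        "x86_64": "linux/amd64",
        "amd64": "linux/amd64",
        "aarch64": "linux/arm64",
        "arm64": "linux/arm64",
    }
    return aliases.get(machine.lower(), f"unknown ({machine})")


if __name__ == "__main__":
    # Run once on the build machine and once inside the task container,
    # then compare the two results.
    print(docker_platform(platform.machine()))
```

If the two results differ, building with an explicit platform (for example `docker buildx build --platform linux/amd64 ...`) is one way to make the image match the cluster nodes.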
