# ask-the-community
n
Hey guys, how do I add new packages to a Flyte project that is running in an AWS deployment? For example, I created a project and ran a workflow using the pyflyte command, and it ran properly. Now I want to add a new library, let's say opencv, to the task and workflow. Where do I install the opencv library? I installed it with pip, but when I ran the workflow it said the package was not found. Then I figured out the project is running inside a Docker image. How does a user add the library to the image, or is there any other way to fix this problem?
s
Or you can also write a Dockerfile, build an image, and pass it to pyflyte run with the --image flag.
n
When I use image spec and run

    pyflyte run --remote --image imagespec.yaml new.py standard_scale_workflow --values '[1.0, 2.0, 3.0, 4.0, 5.0]'

I get the below error:

    Failed with Unknown Exception <class 'Exception'> Reason: Builder envd is not registered.
    Builder envd is not registered.

Can you please let me know how to resolve it?
s
You need to pip install flytekitplugins-envd.
n
Yes, I installed it, but the error persists.
s
You shouldn't be seeing that error if you install the plugin. You can also specify imagespec in the Python file itself. Can you try that way?
n
    cv2_image_spec = ImageSpec(
        base_image="cr.flyte.org/flyteorg/flytekit:py3.10-1.9.0",
        packages=["opencv-python"],
        env={"Debug": "True"},
    )

    if cv2_image_spec.is_container():
        import cv2

    @task
    def mean(values: List[float]) -> float:
        print(cv2.version)
        return sum(values) / len(values)

This is my code, and I am running the below command:

    pyflyte run --remote new.py standard_scale_workflow --values '[1.0, 2.0, 3.0, 4.0, 5.0]'

I get the error given below:

    Failed with Unknown Exception <class 'ModuleNotFoundError'> Reason: No module named 'cv2'
    No module named 'cv2'

Please help me out.
Or can you tell me what exactly needs to be added in imagespec.yaml?

    # imageSpec.yaml
    python_version: 3.10
    registry: pingsutw
    packages:
      - sklearn
      - opencv-python
    env:
      Debug: "True"

What do I put in place of registry? Should I leave it the same?
s
Regarding No module named 'cv2', can you install opencv in your local environment?
n
If I install opencv in my local env, the module is still not found when I do pyflyte run --remote.
I tried creating a new conda env and installed flytekitplugins-envd. Then when I run

    pyflyte run --remote --image imagespec.yaml new.py standard_scale_workflow --values '[1.0, 2.0, 3.0, 4.0, 5.0]'

it builds a Docker image, but I get this error:

    Failed with Unknown Exception <class 'Exception'> Reason: failed to run command envd build --path /tmp/flyte-s5m5ghk_/sandbox/local_flytekit/e3c3143d14bce47e61050188b2791fa7 --platform linux/amd64 --output type=image,name=pingsutw/flytekit:t93nYMZ9tvO68GDt0g1xRg..,push=true with error b'time="2023-09-05T092544Z" level=fatal msg="failed to create the builder: failed to create buildkit client: failed to bootstrap the buildkitd: failed to create container: Error response from daemon: invalid mount config for type \\"bind\\": bind source path does not exist: /home/ngupta/.config/envd"\n'
    failed to run command envd build --path /tmp/flyte-s5m5ghk_/sandbox/local_flytekit/e3c3143d14bce47e61050188b2791fa7 --platform linux/amd64 --output type=image,name=pingsutw/flytekit:t93nYMZ9tvO68GDt0g1xRg..,push=true with error b'time="2023-09-05T092544Z" level=fatal msg="failed to create the builder: failed to create buildkit client: failed to bootstrap the buildkitd: failed to create container: Error response from daemon: invalid mount config for type \\"bind\\": bind source path does not exist: /home/ngupta/.config/envd"\n'

I guess if you help me out with this, my work will be done.
s
This looks more like a Docker-related issue to me.
n
Can you provide any links to tutorials where I can run actual ML pipelines, like downloading the data, preprocessing, training, and evaluating? I would also like to know about the data types for TF models, Torch models, etc. in task return values.
n
Any tutorial link I can get?
s
I just shared the link above.
n
thanks
    [1/1] currentAttempt done. Last Error: USER::Pod failed. No message received from kubernetes.
    [f6d65c22f831647dca1e-n0-0] terminated with exit code (247). Reason [OOMKilled]. Message:
    tar: Removing leading `/' from member names

Do you have any idea why I get this message while running a task? The task is to download MNIST data, and it is failing.
I know it is because of the resource limit, but I have 900GB of memory. How do I set the memory for a project?
n
    [1/1] currentAttempt done. Last Error: UNKNOWN::Outputs not generated by task execution

I am getting the above error and don't know why. The output is a tuple of numpy arrays.
s
Oh. Can you share the code?
n
Never mind, I fixed it, thanks. Is there any way I can mount a PVC to a task? I have a task that creates a model, and I want to save it so that I can access it in another task. I am on AWS and running Flyte inside it.
If you could let me know how a PVC can be attached to Flyte tasks, it would be really helpful.
s
You can use a pod template or the pod plugin.
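For reference, a pod template that mounts an existing PVC boils down to a `volumes` entry plus a matching `volumeMounts` entry on the task container. The sketch below only builds that pod spec fragment as a plain Python dict to show the shape. The claim name `model-store`, the volume name, and the mount path `/mnt/models` are hypothetical, and the container name `primary` assumes Flyte's pod template convention for the task's main container. How the spec gets attached to a task (a cluster-level PodTemplate or flytekit's pod template support) is not shown here.

```python
import json
from typing import Any, Dict


def pvc_pod_spec(
    claim_name: str, mount_path: str, volume_name: str = "model-volume"
) -> Dict[str, Any]:
    """Return a Kubernetes pod spec fragment that mounts an existing PVC."""
    return {
        "containers": [
            {
                # Assumption: "primary" targets the task's main container.
                "name": "primary",
                "volumeMounts": [{"name": volume_name, "mountPath": mount_path}],
            }
        ],
        "volumes": [
            {
                # The volume name must match the volumeMounts entry above.
                "name": volume_name,
                "persistentVolumeClaim": {"claimName": claim_name},
            }
        ],
    }


# Hypothetical claim name and mount path, for illustration only.
spec = pvc_pod_spec(claim_name="model-store", mount_path="/mnt/models")
print(json.dumps(spec, indent=2))
```

Two tasks that mount the same claim at the same path could then hand files to each other through that directory, provided the PVC's access mode allows it.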
n
Hey Samhita, may I know how to pass a private image when running a pyflyte run command? Below is the command I am running right now, which uses a public image:

    pyflyte run --remote --project flytetester --domain development --image dkubex123/my_flyte_image:latest mnist.py mnist_workflow

If dkubex123/my_flyte_image:latest were private, how would I run the command? How do I pass the registry username and password, or is there any other option?
n
Once I create a secret, where do I configure it? I mean, on any specific pod?
s
You will need to configure the secrets in the backend as mentioned in the guide. Aren't you able to?
n
The guide does not mention where to configure it. Can you guide me, please?
s
https://docs.flyte.org/projects/cookbook/en/latest/auto_examples/development_lifecycle/private_images.html#configure-imagepullsecrets You will need to add imagepullsecrets to the default or a custom service account and use the same while triggering an execution. Or you can add them to the pod template. Here's an example: https://flyte-org.slack.com/archives/CP2HDHKE1/p1687941857393889 (this doesn't have imagepullsecrets though)
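As a rough sketch of the service account option, the snippet below builds the patch that adds an image pull secret to a service account and prints the matching kubectl command. Nothing is executed. It assumes a docker-registry secret named `regcred` (hypothetical) already exists in the namespace where the task pods run, that this namespace follows the `<project>-<domain>` pattern (so `flytetester-development` for the project and domain mentioned earlier), and that executions use the `default` service account. Adjust all three to your setup.

```python
import json
import shlex
from typing import List


def pull_secret_patch(secret_names: List[str]) -> str:
    """Build a JSON patch body that sets imagePullSecrets on a service account."""
    return json.dumps({"imagePullSecrets": [{"name": n} for n in secret_names]})


def kubectl_patch_command(
    namespace: str, service_account: str, secret_names: List[str]
) -> str:
    """Render the kubectl invocation as a string; it is only printed, not run."""
    args = [
        "kubectl", "patch", "serviceaccount", service_account,
        "-n", namespace,
        "-p", pull_secret_patch(secret_names),
    ]
    # Quote each argument so the JSON body survives a copy-paste into a shell.
    return " ".join(shlex.quote(a) for a in args)


# Hypothetical namespace and secret name, for illustration only.
command = kubectl_patch_command("flytetester-development", "default", ["regcred"])
print(command)
```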
n
    db-pass                       Opaque               1   12d
    flyte-admin-secrets           Opaque               4   12d
    flyte-pod-webhook             Opaque               3   12d
    flyte-secret-auth             Opaque               1   12d
    sh.helm.release.v1.flyte.v1   helm.sh/release.v1   1   12d

These are the secrets in the flyte namespace. Do you think adding the Docker Hub username and secret to any one of them would get the work done?
s
What secrets do you have in the flytesnacks namespace?
n
I figured it out. I have another question: I set up CloudWatch logs in AWS, and when a task runs successfully and I click the logs option in the Flyte UI, I get redirected to the AWS CloudWatch console but get the below error:
• There was an error getting log events.
• The specified log stream does not exist.
I tried multiple setups and the issue still persists. Please help.
s
Did you double check the template URI?
n
What do you want me to check exactly? It asks me to provide the log group name:

    userSettings:
      accountNumber:
      accountRegion:
      dbPassword:
      rdsHost:
      bucketName:
      logGroup:

For logGroup, I give the AWS CloudWatch log group name.

    task_logs:
      plugins:
        logs:
          kubernetes-enabled: false
          # -- One option is to enable cloudwatch logging for EKS, update the region and log group accordingly
          # You can even disable this
          cloudwatch-enabled: true
          # -- region where logs are hosted
          cloudwatch-region: "{{ .Values.userSettings.accountRegion }}"
          # -- cloudwatch log-group
          cloudwatch-log-group: "{{ .Values.userSettings.logGroup }}"
n
Samhita, is there any way I can change the task limits in the configmap dynamically from the task?

    task_resource_defaults.yaml: |
      task_resources:
        defaults:
          cpu: 1000m
          memory: 15Gi
          storage: 15Gi
        limits:
          cpu: 2
          gpu: 1
          memory: 1Gi
          storage: 20Gi

I want to set the memory here inside the configmap dynamically.
s
I don't think that's possible. But you can set limits at the task level.
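One thing a quick script can catch in a config like the one quoted in the question: the default memory request (15Gi) is larger than the memory limit (1Gi). The sketch below parses the quantities and lists every resource whose default exceeds its limit. It is not Flyte code; `parse_quantity` and `defaults_over_limit` are hypothetical helpers, and only the binary suffixes (Ki, Mi, Gi, Ti) and the `m` (milli) suffix are handled.

```python
from decimal import Decimal
from typing import Dict, List

# Binary suffixes used by Kubernetes-style resource quantities.
_BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_quantity(quantity: str) -> Decimal:
    """Convert a quantity such as '15Gi', '1000m', or '2' to a plain number."""
    text = str(quantity).strip()
    for suffix, factor in _BINARY_SUFFIXES.items():
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * factor
    if text.endswith("m"):  # milli-units, e.g. 1000m CPU == 1 CPU
        return Decimal(text[:-1]) / 1000
    return Decimal(text)


def defaults_over_limit(defaults: Dict[str, str], limits: Dict[str, str]) -> List[str]:
    """Return the resource names whose default request exceeds the limit."""
    return sorted(
        name
        for name, value in defaults.items()
        if name in limits and parse_quantity(value) > parse_quantity(limits[name])
    )


# Values copied from the configmap quoted in the question.
defaults = {"cpu": "1000m", "memory": "15Gi", "storage": "15Gi"}
limits = {"cpu": "2", "gpu": "1", "memory": "1Gi", "storage": "20Gi"}

violations = defaults_over_limit(defaults, limits)
print(violations)
```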
n
Hi, how do I specify a package from an image for a workflow? I have created a file and I am using it for tasks, but how do I do it for workflows?
s
Workflow is a DSL. What do you want to do exactly?
n
Hey @Samhita Alla, any idea why I am getting the below error when I run pyflyte inside an AWS cluster?

    RPC Failed, with Status: StatusCode.INTERNAL
    details: failed to create a signed url. Error: WebIdentityErr: failed to retrieve credentials caused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity status code: 403, request id: 2c17015d-85be-4570-8e30-a8be2b75be3f
    Debug string UNKNOWN:Error received from peer ipv410.100.178.20881 {created_time:"2023-09-27T072019.126215328+00:00", grpc_status:13, grpc_message:"failed to create a signed url. Error: WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: 2c17015d-85be-4570-8e30-a8be2b75be3f"}
s
Looks like an AWS permissions issue
n
Do you think it could be an issue with the S3 bucket?
s
Yeah, I believe so.
It'll have to do with the roles you assigned.
n
@Samhita Alla, how do we actually view the output of a workflow? Inside the project and the workflow I can see the tasks and their outputs, but my workflow returns some values. How do I see those from the console?

    @workflow
    def optimize_model():
        best_accuracy = optimize_hyp()
        best_params = {"n_estimators": 10, "max_depth": 5, "min_samples_split": 0.2}
        # Set to 0 as it's not used in this example
        final_model_accuracy = train_model(
            n_estimators=best_params["n_estimators"],
            max_depth=best_params["max_depth"],
            min_samples_split=best_params["min_samples_split"],
        )
        return best_params, best_accuracy, final_model_accuracy

How do I see the return parameters in the Flyte console?
s
You can view the outputs of a workflow in the "View inputs and outputs" link present in the navigation bar.
n
I get UNKNOWN status for the tasks and I am unable to debug the issue. Can you please let me know when and why tasks go into UNKNOWN status? @Samhita Alla
s
Have you checked the propeller and admin logs?
n
Yes, it happens when I enable ray in plugins.
s
Are head and worker nodes spinning up?
n
No, they are not, but I am using plain Ray instead of the Flyte Ray plugin. Anyhow, how do I make sure a task inside a workflow starts only after another task has executed? I want the output of one task to go into another. Is there any way I can do that?
Thanks
I fixed it, thanks. But how do I convert Promise(node:n1.o0) to str? I mean, a task is returning a string and I can see it in the console, but when I call it in the workflow it returns Promise(node:n1.o0). Below is the code, can you help me fix it?

    @workflow
    def optimize_model() -> Tuple[float, str, List[int]]:
        best_accuracy, best_params = optimize_hyp()
        # Train the best model with best hyperparameters
        run_id = train_best_model(best_params=best_params)
        print(run_id)
        # Specify the MLflow model URI
        model_uri = f"runs:/a4519e1f7e00433886646a2bfb51600f/best_random_forest_model"
        # Sample data for inference (for the Iris dataset)
        data = [[5.1, 3.5, 1.4, 0.2]]
        # Perform inference using the ray_inference task
        inference_results = ray_inference(model_uri=model_uri, data=data)
        return best_accuracy, run_id, inference_results

This is the workflow, and run_id is what gives me the promise instead of a str. Below is the task code for run_id:

    @task(requests=Resources(cpu="2", mem="1Gi"))
    def train_best_model(best_params: Dict[str, Any]) -> str:
        import mlflow
        import mlflow.sklearn
        from sklearn.datasets import load_iris
        from sklearn.ensemble import RandomForestClassifier
        import numpy as np

        iris = load_iris()
        X, y = iris.data, iris.target
        # Initialize and train a Random Forest Classifier with the best hyperparameters
        clf = RandomForestClassifier(
            n_estimators=best_params["n_estimators"],
            max_depth=best_params["max_depth"],
            min_samples_split=best_params["min_samples_split"],
            random_state=42,
        )
        accuracy = model_accuracy(
            n_estimators=best_params["n_estimators"],
            max_depth=best_params["max_depth"],
            min_samples_split=best_params["min_samples_split"],
        )
        # Fit the model
        clf.fit(X, y)
        # Log the model to MLflow
        with mlflow.start_run(run_name="BestRandomForestModel") as run:
            mlflow.sklearn.log_model(clf, "best_random_forest_model")
            mlflow.log_metric("accuracy", accuracy)
            run_id = mlflow.active_run().info.run_id
        return str(run_id)
s
Task outputs in a workflow are promises. You need to send them to another task to materialize the promises.
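To make that concrete, here is a toy model in plain Python (not flytekit) of why `print(run_id)` inside a workflow shows a placeholder. Calling a task while the workflow is being built only records a node and hands back a promise. The real value exists once the graph runs, and a downstream task that takes the promise as input receives the resolved value. `ToyGraph`, `train`, and `describe` are made-up names for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass(frozen=True, repr=False)
class Promise:
    """Placeholder for a node output that does not exist until the graph runs."""
    node: str

    def __repr__(self) -> str:
        return f"Promise(node:{self.node}.o0)"


@dataclass
class ToyGraph:
    """Records task calls as nodes, then executes them in insertion order."""
    nodes: List[Tuple[str, Callable[..., Any], Tuple[Any, ...]]] = field(default_factory=list)

    def add(self, fn: Callable[..., Any], *args: Any) -> Promise:
        node_id = f"n{len(self.nodes)}"
        self.nodes.append((node_id, fn, args))
        return Promise(node_id)  # nothing has executed yet

    def run(self, wanted: Promise) -> Any:
        results: Dict[str, Any] = {}
        for node_id, fn, args in self.nodes:
            # Swap each promise for the value its producer already returned.
            resolved = [results[a.node] if isinstance(a, Promise) else a for a in args]
            results[node_id] = fn(*resolved)
        return results[wanted.node]


def train() -> str:
    return "run-123"


def describe(run_id: str) -> str:
    return f"model from {run_id}"


graph = ToyGraph()
run_id = graph.add(train)              # a Promise, not a str
summary = graph.add(describe, run_id)  # describe will receive the real str
placeholder_text = repr(run_id)
final = graph.run(summary)
print(placeholder_text, "->", final)
```

Passing `run_id` into `describe` is also what orders the two nodes: the consumer cannot run until the producer has finished.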
n
Any example of how we can do that? Can you refer me to a link?
I fixed it thanks
Hey @Samhita Alla, is there any way we can add resource values for tasks in flyte-binary? I see that in flyte-core we can configure task resources like storage and memory as well as CPU and GPU.
s
You can add a task_resources section here:

    task_resources:
      defaults:
        cpu: 1
        memory: 4Gi
        storage: 5Gi
      limits:
        cpu: 16
        memory: 16Gi
        storage: 20Gi
n
thanks
Hey @Samhita Alla, for the below code in Flyte I get an error:

    import flytekit
    from flytekit import task, workflow, Resources
    from typing import List, Tuple

    @task(requests=Resources(gpu="1", cpu="2", mem="1Gi"), container_image="822795565729.dkr.ecr.us-west-2.amazonaws.com/prime-analysis:prime-analysis-reksi-flyte_pipeline-0.0.10")
    def sleep():
        import time
        time.sleep(3600)

    @workflow
    def toy_workflow():
        sleep()

I run it using

    pyflyte run --remote --project cpa-test toy_pipeline.py toy_workflow

and the error is

    /opt/nvidia/nvidia_entrypoint.sh: line 67: exec: pyflyte-fast-execute: not found

I installed flyte and related packages. pyflyte-fast-execute is in PATH, but it still fails. I am not sure how to debug it; the pod fails to start. Can you please help me?
Is it something to do with the GPUs?
s
This error usually crops up when the architecture the image is built on isn't the same as the architecture that a container is spun up on. Can you use buildkit to specify the architecture while building your Docker image? You could also use image spec to simplify this process.