acoustic-carpenter-78188
03/21/2023, 10:20 PM
then() closure, an error is reported server-side.
When this is run locally, this error is not reported and the output is successfully generated.
The error message when run on-cloud is the following:
failed at Node[n3-n0]. BindingResolutionError: Error binding Var [wf].[dataset], caused by: failed at Node[n0]. CausedByError: Failed to GetPrevious data from outputDir [<s3://union-oc-production-demo/metadata/propeller/zeryx-demo-development-a9vk7kbf8ptkdcdwqrq6/n0/data/0/outputs.pb>], caused by: path:<s3://union-oc-production-demo/metadata/propeller/zeryx-demo-development-a9vk7kbf8ptkdcdwqrq6/n0/data/0/outputs.pb>: not found
The node graph diagram displays the incorrect sequence of operations:
[image: node graph diagram]
pyflyte run workflows/example.py train_mnist_model --n_epoch 1
7. Then run the following, pointing to your remote cluster of choice: pyflyte --config ~/.uctl/config.yaml run -p your_project -d development --image zeryx1211/mnist_gpu:latest workflows/example.py train_mnist_model --n_epoch 1
8. View the failed workflow and execution
9. Rerun the same with working_example from the gist, repeating steps 6-8
Screenshots
No response
Are you sure this issue hasn't been raised already?
☑︎ Yes
Have you read the Code of Conduct?
☑︎ Yes
flyteorg/flyte
acoustic-carpenter-78188
03/24/2023, 6:04 PM