#ask-the-community

Gabriel Molina

05/16/2023, 3:13 PM
Hi! This is such a great project. I'm having a problem when I add "from flytekitplugins.spark import Spark" to my *.py file. Running locally with 'pyflyte run *.py my_wf' works fine, but when I run it remotely the process completes and the workflow then appears as failed, with the following message in the web UI:
[1/1] currentAttempt done. Last Error: USER::trap>", line 688, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 883, in exec_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "/root/flyte.py", line 5, in <module>
from flytekitplugins.spark import Spark
ModuleNotFoundError: No module named 'flytekitplugins.spark'
Traceback (most recent call last):
File "/usr/local/bin/pyflyte-fast-execute", line 8, in <module>
sys.exit(fast_execute_task_cmd()) ...

Ketan (kumare3)

05/16/2023, 4:08 PM
You need to install flytekitplugins-spark in the container.
But for this, join today's community sync. Great work by @Kevin Su and @Evan Sadler makes this effortless.
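One way to confirm whether the plugin is really present in a given environment (the local virtualenv versus the task container) is to check importability from inside that environment. This is a minimal sketch using only the Python standard library; the helper names `is_importable` and `missing_modules` are hypothetical and are not part of flytekit.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if `module_name` can be located in this interpreter."""
    try:
        # For a dotted name, find_spec imports the parent package first and
        # raises ModuleNotFoundError when the parent itself is missing.
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        return False


def missing_modules(module_names):
    """Return the subset of `module_names` that cannot be imported here."""
    return [name for name in module_names if not is_importable(name)]


if __name__ == "__main__":
    # Module names taken from the thread; run this inside the environment
    # you want to inspect.
    for name in ["flytekit", "flytekitplugins.spark"]:
        print(name, "OK" if is_importable(name) else "MISSING")
```

Running the same script locally and inside the container image should show where the two environments differ, which matches the symptom here: the import works locally but fails remotely.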

Gabriel Molina

05/16/2023, 4:17 PM
Great! I have it installed in my local Python env. How can I install it in the container?
@Kevin Su, @Evan Sadler can you please help me?

Kevin Su

05/16/2023, 5:42 PM
@Gabriel Molina you have to write a Dockerfile to build an image. By the way, you could try using ImageSpec. Here is an example that builds an image for a Spark task with ImageSpec: https://github.com/flyteorg/flytekit/pull/1616

Gabriel Molina

05/16/2023, 11:17 PM
Thanks @Kevin Su, I tried ImageSpec and it works with run, but not with run --remote.

Kevin Su

05/16/2023, 11:26 PM
You have to install the envd plugin as well: pip install flytekitplugins-envd==1.6.1

Ketan (kumare3)

05/17/2023, 1:13 AM
I think we should include envd by default, and then allow it to be overridden?

Samhita Alla

05/17/2023, 4:28 AM
+1 to including it in the flytekit library.

Gabriel Molina

05/17/2023, 2:55 PM
@Kevin Su I installed it successfully and got this: 'failed to build the imageSpec' - authorization failed

Samhita Alla

05/18/2023, 3:58 AM
Have you specified the registry? I think you should authenticate to dockerhub from the CLI. https://docs.docker.com/docker-hub/access-tokens/
@Kevin Su, is dockerhub the only supported registry?
@Gabriel Molina, changing the registry name to your username should work!
Kevin, is spark working on the demo cluster?

Gabriel Molina

05/18/2023, 1:18 PM
@Samhita Alla I'm now able to run the Spark example shared by Kevin with the --remote flag, but I got the following error in the web UI:
[1/1] currentAttempt done. Last Error: USER:core.py:760 in invoke
❱ 760 return __callback(*args, **kwargs)
/usr/local/lib/python3.10/site-packages/flytekit/bin/entrypoint.py:508 in fast_execute_task_cmd
❱ 508 subprocess.run(cmd, check=True)
/usr/local/lib/python3.10/subprocess.py:526 in run
❱ 526 raise CalledProcessError(retcode, process.args,
CalledProcessError: Command '['pyflyte-execute', '--inputs', 's3://my-s3-bucket/metadata/propeller/flytesnacks-development-f91eee73c35a9498e9c8/n0/data/inputs.pb', '--output-prefix', 's3://my-s3-bucket/metadata/propeller/flytesnacks-development-f91eee73c35a9498e9c8/n0/data/0', '--raw-output-data-prefix', 's3://my-s3-bucket/data/dv/f91eee73c35a9498e9c8-n0-0', '--checkpoint-path', 's3://my-s3-bucket/data/dv/f91eee73c35a9498e9c8-n0-0/_flytecheckpoints', '--prev-checkpoint', '""', '--dynamic-addl-distro', 's3://my-s3-bucket/flytesnacks/development/EFP2R3DRITTXTRCHKYZCLZWAZU======/script_mode.tar.gz', '--dynamic-dest-dir', '/root', '--resolver', 'flytekit.core.python_auto_container.default_task_resolver', '--', 'task-module', 'flyte3', 'task-name', 'hello_spark']' returned non-zero exit status 1.
Additionally, if I run "pyflyte run --remote ..." more than once, it doesn't work (the error is attached at the end of this comment), but it works again if I rename the workflow in the .py file. Were you able to reproduce the sample code shared by Kevin?
RPC Failed, with Status: StatusCode.INVALID_ARGUMENT
details: launch plan with different structure already exists with id resource_type:LAUNCH_PLAN project:"flytesnacks" domain:"development" name:"flyte3.my_wf" version:"jMuCe4OGZ9LuvDqx089w-Q=="
Debug string UNKNOWN:Error received from peer {grpc_message:"launch plan with different structure already exists with id resource_type:LAUNCH_PLAN project:\"flytesnacks\" domain:\"development\" name:\"flyte3.my_wf\" version:\"jMuCe4OGZ9LuvDqx089w-Q==\" ", grpc_status:3, created_time:"2023-05-18T091413.318343238-04:00"}

Ketan (kumare3)

05/18/2023, 1:33 PM
This is weird. It is not understanding that something has changed. cc @Kevin Su / @Eduardo Apolinario (eapolinario), this is unexpected. @Gabriel Molina, is this after you started using the new ImageSpec?

Gabriel Molina

05/18/2023, 2:11 PM
@Ketan (kumare3), yes, I'm using spark_image = ImageSpec(registry="<mydockerhubuser>") and running it with flytekit==1.6.1 and flytekitplugins-envd==1.6.1.

Samhita Alla

05/18/2023, 3:34 PM
"im able to run the spark example shared by kevin now with --remote flag! but got the following error in the web UI:"
Would you be able to fetch the k8s log? https://docs.flyte.org/en/latest/community/troubleshoot.html#troubleshoot
kubectl get pods -n flytesnacks-development
kubectl logs <pod-name> -n flytesnacks-development

Gabriel Molina

05/18/2023, 3:54 PM
@Samhita Alla Sure! This is the output:
tar: Removing leading `/' from member names

╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /usr/local/bin/pyflyte-execute:8 in <module>                                 │
│                                                                              │
│ ❱ 8 │   sys.exit(execute_task_cmd())                                         │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/click/core.py:1130 in __call__       │
│                                                                              │
│ ❱ 1130 │   │   return self.main(*args, **kwargs)                             │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/click/core.py:1055 in main           │
│                                                                              │
│ ❱ 1055 │   │   │   │   │   rv = self.invoke(ctx)                             │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/click/core.py:1404 in invoke         │
│                                                                              │
│ ❱ 1404 │   │   │   return ctx.invoke(self.callback, **ctx.params)            │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/click/core.py:760 in invoke          │
│                                                                              │
│ ❱  760 │   │   │   │   return __callback(*args, **kwargs)                    │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/flytekit/bin/entrypoint.py:471 in    │
│ execute_task_cmd                                                             │
│                                                                              │
│ ❱ 471 │   _execute_task(                                                     │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/flytekit/exceptions/scopes.py:160 in │
│ system_entry_point                                                           │
│                                                                              │
│ ❱ 160 │   │   │   │   return wrapped(*args, **kwargs)                        │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/flytekit/bin/entrypoint.py:347 in    │
│ _execute_task                                                                │
│                                                                              │
│ ❱ 347 │   │   _task_def = resolver_obj.load_task(loader_args=resolver_args)  │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/flytekit/core/utils.py:295 in        │
│ wrapper                                                                      │
│                                                                              │
│ ❱ 295 │   │   │   │   return func(*args, **kwargs)                           │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/flytekit/core/python_auto_container. │
│ py:235 in load_task                                                          │
│                                                                              │
│ ❱ 235 │   │   task_module = importlib.import_module(name=task_module)  # typ │
│                                                                              │
│ /usr/local/lib/python3.10/importlib/__init__.py:126 in import_module         │
│                                                                              │
│ ❱ 126 │   return _bootstrap._gcd_import(name[level:], package, level)        │
│ in _gcd_import:1050                                                          │
│ in _find_and_load:1027                                                       │
│ in _find_and_load_unlocked:1006                                              │
│ in _load_unlocked:688                                                        │
│ in exec_module:883                                                           │
│ in _call_with_frames_removed:241                                             │
│                                                                              │
│ /root/flyte3.py:7 in <module>                                                │
│                                                                              │
│ ❱  7 from flytekitplugins.spark import Spark                                 │
╰──────────────────────────────────────────────────────────────────────────────╯
ModuleNotFoundError: No module named 'flytekitplugins.spark'
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /usr/local/bin/pyflyte-fast-execute:8 in <module>                            │
│                                                                              │
│ ❱ 8 │   sys.exit(fast_execute_task_cmd())                                    │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/click/core.py:1130 in __call__       │
│                                                                              │
│ ❱ 1130 │   │   return self.main(*args, **kwargs)                             │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/click/core.py:1055 in main           │
│                                                                              │
│ ❱ 1055 │   │   │   │   │   rv = self.invoke(ctx)                             │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/click/core.py:1404 in invoke         │
│                                                                              │
│ ❱ 1404 │   │   │   return ctx.invoke(self.callback, **ctx.params)            │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/click/core.py:760 in invoke          │
│                                                                              │
│ ❱  760 │   │   │   │   return __callback(*args, **kwargs)                    │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/flytekit/bin/entrypoint.py:508 in    │
│ fast_execute_task_cmd                                                        │
│                                                                              │
│ ❱ 508 │   subprocess.run(cmd, check=True)                                    │
│                                                                              │
│ /usr/local/lib/python3.10/subprocess.py:526 in run                           │
│                                                                              │
│ ❱  526 │   │   │   raise CalledProcessError(retcode, process.args,           │
╰──────────────────────────────────────────────────────────────────────────────╯
CalledProcessError: Command '['pyflyte-execute', '--inputs',
's3://my-s3-bucket/metadata/propeller/flytesnacks-development-f52f3d40c3cf849a895a/n0/data/inputs.pb',
'--output-prefix',
's3://my-s3-bucket/metadata/propeller/flytesnacks-development-f52f3d40c3cf849a895a/n0/data/0',
'--raw-output-data-prefix',
's3://my-s3-bucket/data/rr/f52f3d40c3cf849a895a-n0-0', '--checkpoint-path',
's3://my-s3-bucket/data/rr/f52f3d40c3cf849a895a-n0-0/_flytecheckpoints',
'--prev-checkpoint', '""', '--dynamic-addl-distro',
's3://my-s3-bucket/flytesnacks/development/U67AW4X3WQLPZXY2HBRFM7GDWY======/script_mode.tar.gz',
'--dynamic-dest-dir', '/root', '--resolver',
'flytekit.core.python_auto_container.default_task_resolver', '--',
'task-module', 'flyte3', 'task-name', 'hello_spark']' returned non-zero exit
status 1.
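The pod log explains why the web UI only showed a `CalledProcessError`: as the traceback indicates, `pyflyte-fast-execute` launches `pyflyte-execute` through `subprocess.run(cmd, check=True)`, so the real cause (the `ModuleNotFoundError`) is raised in the child process and the parent only reports the non-zero exit status. The sketch below reproduces that pattern with the standard library alone. The child command, the helper names `run_child` and `diagnose`, and the module name are stand-ins for illustration, not flytekit code.

```python
import subprocess
import sys


def run_child(module_name: str) -> subprocess.CompletedProcess:
    """Start a child interpreter that only tries to import `module_name`."""
    cmd = [sys.executable, "-c", f"import {module_name}"]
    # check=True turns a non-zero exit status into CalledProcessError.
    # stderr is captured here only so the sketch can inspect it; without
    # capturing, the child's traceback goes straight to the process log.
    return subprocess.run(cmd, check=True, capture_output=True, text=True)


def diagnose(module_name: str):
    """Return (exit status, child stderr) for the import attempt."""
    try:
        run_child(module_name)
    except subprocess.CalledProcessError as err:
        # The parent sees only "returned non-zero exit status"; the root
        # cause is in the child's stderr.
        return err.returncode, err.stderr
    return 0, ""


# Hypothetical module name that is assumed not to be installed anywhere.
returncode, child_stderr = diagnose("module_that_is_not_installed_anywhere")
print("exit status:", returncode)
print("root cause:", child_stderr.strip().splitlines()[-1])
```

When an execution fails with this wrapper error, the line to look for in the pod log is the last exception printed before the second traceback, which here is the missing `flytekitplugins.spark` import.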

Kevin Su

05/18/2023, 5:48 PM
Sorry, actually, you can't use fast-register if a Spark task is in the workflow, because the Spark workers won't download the code. Try using
pyflyte register …
instead.

Ketan (kumare3)

05/19/2023, 4:31 AM
@Kevin Su why is that?
Shouldn't it download the code?

Kevin Su

05/19/2023, 4:58 AM
When I use fast-register, only the driver downloads the code.