# ask-the-community
When trying to register a workflow using the Spark plugin, I get this error:
Traceback (most recent call last):
  File "/usr/local/bin/pyflyte", line 5, in <module>
    from flytekit.clis.sdk_in_container.pyflyte import main
  File "/usr/local/lib/python3.11/site-packages/flytekit/__init__.py", line 305, in <module>
  File "/usr/local/lib/python3.11/site-packages/flytekit/__init__.py", line 301, in load_implicit_plugins
  File "/usr/local/lib/python3.11/importlib/metadata/__init__.py", line 202, in load
    module = import_module(match.group('module'))
  File "/usr/local/lib/python3.11/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/usr/local/lib/python3.11/site-packages/flytekitplugins/spark/__init__.py", line 20, in <module>
    from .agent import DatabricksAgent
  File "/usr/local/lib/python3.11/site-packages/flytekitplugins/spark/agent.py", line 12, in <module>
    from flytekit.extend.backend.base_agent import AgentBase, AgentRegistry, convert_to_flyte_state, get_agent_secret
ImportError: cannot import name 'convert_to_flyte_state' from 'flytekit.extend.backend.base_agent' (/usr/local/lib/python3.11/site-packages/flytekit/extend/backend/base_agent.py)
Looking at the underlying code, it looks like those imports are missing. I'm using the following versions in my project:
flytekit = "1.10.7"
flytekitplugins-async-fsspec = "1.10.7"
flytekitplugins-duckdb = "1.10.7"
flytekitplugins-deck-standard = "1.10.7"
flytekitplugins-polars = "1.10.7"
flytekitplugins-spark  = "1.10.7"
flytekitplugins-pod = "1.10.7"
Is there something I'm missing?
could you try flytekit==1.11.0 and flytekitplugins-spark==1.11.0?
That's actually what I had prior to downgrading to 1.10.7. Same issue:
File "/usr/local/lib/python3.11/site-packages/flytekitplugins/spark/agent.py", line 12, in <module>
    from flytekit.extend.backend.base_agent import AgentBase, AgentRegistry, convert_to_flyte_state, get_agent_secret
hmm, but we already removed that import from the Spark agent in flytekit==1.11.0: https://github.com/flyteorg/flytekit/pull/2123
I can try 1.11.0 again. Maybe my poetry lock file wasn't updated, let me take a look.
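A stale lock file can leave flytekit and the Spark plugin on different releases inside the image, which would explain an import error like the one above. A minimal sketch for checking what is actually installed, using only the standard library. The helper names `installed_versions` and `versions_differ` are made up for illustration, and the assumption that both packages should share one version number is based on the pins used in this thread:

```python
from importlib import metadata

# Packages from this thread whose releases are expected to line up.
PACKAGES = ["flytekit", "flytekitplugins-spark"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def versions_differ(versions):
    """Return True if the installed packages do not all share one version."""
    found = {v for v in versions.values() if v is not None}
    return len(found) > 1


if __name__ == "__main__":
    versions = installed_versions(PACKAGES)
    for name, version in versions.items():
        print(f"{name}: {version or 'not installed'}")
    if versions_differ(versions):
        print("Versions differ; the lock file or image may be stale.")
```

Running this inside the same container image that runs pyflyte shows the versions that were really resolved, which may differ from what the project file pins.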
@Kevin Su if we change it like that, we have to change the minimum version pin, right?
@Kevin Su going back to 1.11.0 fixed the issue. But I haven't been able to tell what is causing this error:
[1/1] currentAttempt done. Last Error: USER::The node was low on resource: ephemeral-storage. Threshold quantity: 2146223340, available: 1752248Ki. 
[flytesnacks-dev] terminated with ExitCode 0.
[primary] terminated with exit code (1). Reason [Error]. Message:
I mean, I get what it's saying, but I'm not sure what ephemeral-storage means in the context of the plugin or Flyte. Any thoughts?
Is this related to the size of the image?
This is a known issue in Flyte 1.11.0. We accidentally set the default ephemeralStorage to 20 MB: https://github.com/flyteorg/flyte/pull/4929/files#diff-33b4463f6057591a533425d1f947752711a81da1952ff745ed9fae049e155995L181
You can fix that by updating the default.
Gotcha, okay - so do I need to change this in my task decorator, or is this at the Kubernetes cluster config level?
wait, sorry. I was wrong. 1752248Ki is more than 20MB.
Could you try increasing the storage in the task decorator?
It might be related to the size of the image.
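To make the numbers in the error message easier to compare, here is a small sketch that converts both quantities to bytes. The `parse_quantity` helper is made up for illustration and only handles plain integers and binary suffixes such as `Ki`; it is not a full parser for Kubernetes quantities.

```python
# Binary (power-of-two) suffixes used by Kubernetes resource quantities.
BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}
GIB = 1024**3


def parse_quantity(quantity: str) -> int:
    """Convert a quantity such as '1752248Ki' or '2146223340' to bytes."""
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: the value is already in bytes


# Values copied from the error message in this thread.
threshold = parse_quantity("2146223340")
available = parse_quantity("1752248Ki")

print(f"threshold: {threshold / GIB:.2f} GiB")
print(f"available: {available / GIB:.2f} GiB")
print("below threshold:", available < threshold)
```

The available space works out to roughly 1.67 GiB against a threshold of about 2 GiB, so the node as a whole had dropped below its threshold. That is far more than 20 MB, which fits the correction above that the 20 MB default is probably not the cause here.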
Gotcha, so would that be in the spark_conf property or in the resources property?
            "spark.driver.memory": "5000M",
            "spark.executor.memory": "5000M",
            "spark.executor.cores": "2",
            "spark.executor.instances": "2",
            "spark.driver.cores": "2",
            #"spark.jars": "https://storage.googleapis.com/hadoop-lib/gcs/gcs-connector-hadoop3-latest.jar",
resources property
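As a rough sketch of what that could look like, the snippet below sets ephemeral storage through the `requests` and `limits` arguments of the task decorator, next to the `spark_conf` values quoted above. The task name, its body, and the `4Gi`/`8Gi` values are placeholders rather than recommendations, and whether these settings reach the Spark driver and executor pods may depend on the flytekit and plugin versions in use.

```python
from flytekit import Resources, task
from flytekitplugins.spark import Spark


@task(
    task_config=Spark(
        # Spark settings from the thread; these do not set ephemeral storage.
        spark_conf={
            "spark.driver.memory": "5000M",
            "spark.executor.memory": "5000M",
            "spark.executor.cores": "2",
            "spark.executor.instances": "2",
            "spark.driver.cores": "2",
        },
    ),
    # Placeholder values (assumption): size these to the image and scratch data.
    requests=Resources(ephemeral_storage="4Gi"),
    limits=Resources(ephemeral_storage="8Gi"),
)
def my_spark_task() -> int:
    # Hypothetical task body, for illustration only.
    return 0
```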
Okay, let me give that a try! Thanks so much for your help!
Well, I got past that issue, but now I've been stuck on this odd error:
[1/1] currentAttempt done. Last Error: USER::Pod failed. No message received from kubernetes.
[flytesnacks-dev] terminated with ExitCode 0.
[primary] terminated with exit code (1). Reason [Error]. Message: 
Any thoughts on this?
Are you able to share the code snippet?