# flytekit
a
Hey Flyte, I am on Flyte backend system `v1.0.0` with `flytekit==0.26.0` and Spark tasks are fine. I just moved up (last night) to `flytekit==1.0.1` with `flytekitplugins-spark==1.0.1` (with the same backend system) and now Spark tasks are broken with the error `can't open file '/usr/bin/entrypoint.py': [Errno 2] No such file or directory`. I took a look and the old (good) tasks have the following set:
```json
"mainApplicationFile": {
    "stringValue": "local:///usr/local/lib/python3.8/dist-packages/flytekit/bin/entrypoint.py"
}
```
and that file exists and looks fine. However, my new (broken) Spark task shows the following:
```json
"mainApplicationFile": {
    "stringValue": "local:///usr/bin/entrypoint.py"
},
"executorPath": {
    "stringValue": "/usr/bin/python3.8"
}
```
but I have no `/usr/bin/entrypoint.py` in the container (which explains the error message). This seems 99% like a bug; can you take a look? I do have a `v1.0.1` Flyte backend system up and running... I'll try the same thing there.
Note: in the container I set the environment variables:
```
PYSPARK_DRIVER_PYTHON=python3.8
PYSPARK_PYTHON=python3.8
```
since I want to use `python3.8` (and not the `python3` installed in my Bionic container). It doesn't seem like these env variables would affect the issue with `mainApplicationFile`, however.
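For context, a minimal sketch of how that container setup looks as a Dockerfile; the base image and install steps are my assumptions, not the actual Dockerfile from this thread:
```dockerfile
# Sketch only: base image and install commands are assumed.
FROM ubuntu:18.04
# Bionic's default /usr/bin/python3 is 3.6.x; install 3.8 alongside it.
RUN apt-get update && apt-get install -y python3.8
# Point PySpark's driver and executors at python3.8 instead of python3.
ENV PYSPARK_DRIVER_PYTHON=python3.8
ENV PYSPARK_PYTHON=python3.8
```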
Same problem with Flyte backend components from `v1.0.1`. All of these have been registered with `flytectl register files`. I'll also switch over to `flyte-cli register files` and check if it makes any difference.
Still broken with `flyte-cli register files` also.
TL;DR: Spark tasks are broken in `flytekit==1.0.1` due to an error that sets the wrong `mainApplicationFile` path. They were working for me back in `flytekit==0.26.0`, so something broke them after that.
Flyte team - could you Slack me here in the Flyte Slack workspace when you have a fix? I'll get a notification.
k
Spark tasks are not broken, as we do have tests.
It's some setting that probably got changed in 1.0.
You have to pass the Python interpreter path.
This was the case at Woven Planet; did something change?
I will share the CLI param tomorrow.
a
Cool, good to know it is just some kind of configuration change. Let me know what param to set...
I am just using `python3.8` installed at `/usr/bin/python3.8`. FYI, since there is also a separate (3.6.x) Python 3 at `/usr/bin/python3`, I have declared the
```
PYSPARK_DRIVER_PYTHON=python3.8
PYSPARK_PYTHON=python3.8
```
environment variables in my container.
a
@Ketan (kumare3) I have found the issue. Back in `0.26.0`, which is working for us, the path to the Flyte `entrypoint.py` file was formed explicitly and correctly here. For me on `0.26.0` the value of `flytekit_virtualenv_root` is `'/usr/local/lib/python3.8/dist-packages/flytekit'`, which is determined correctly in the code from:
```python
import os
import flytekit

# Derive the flytekit package directory from its import location.
flytekit_install_loc = os.path.abspath(flytekit.__file__)
ctx.obj[CTX_FLYTEKIT_VIRTUALENV_ROOT] = os.path.dirname(flytekit_install_loc)
```
However, in the latest code the explicit entrypoint path determination is dropped. It is now calculated here as a default based on the Python interpreter path, which is just `/usr/bin/python3.8` in my container. Whoever made the change clearly assumed people would only be running `pyflyte` commands from a venv and not with the system interpreter.
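To illustrate the failure mode, here is a small sketch; this is my reconstruction of the new default logic, not the actual flytekit code:
```python
import os
import sys

# Reconstruction of the new default: assume entrypoint.py sits next to
# the running Python interpreter (true inside a venv, false otherwise).
interpreter = sys.executable                      # /usr/bin/python3.8 here
bin_dir = os.path.dirname(interpreter)            # /usr/bin
entrypoint = os.path.join(bin_dir, "entrypoint.py")

# In a venv this yields .../venv/bin/entrypoint.py, which exists; with
# the system interpreter it yields /usr/bin/entrypoint.py, which doesn't.
print(entrypoint, os.path.exists(entrypoint))
```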
I think these things should really try to look up the location of the `pyflyte` executable being called and assume that `entrypoint.py` sits next to it (not sure if this is possible in Python; see the sketch below). If that is not possible, at least we know that the previous method, which set the entrypoint by finding the directory of the `flytekit` package and appending `/bin/entrypoint.py`, does work robustly.
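It is possible; here is a hedged sketch of both lookups combined. The function name and fallback order are my own suggestion, not flytekit's actual API:
```python
import os
import shutil
import flytekit

def find_entrypoint() -> str:
    """Locate entrypoint.py next to the pyflyte script if possible,
    otherwise fall back to the flytekit package's own bin/ directory."""
    pyflyte = shutil.which("pyflyte")
    if pyflyte is not None:
        candidate = os.path.join(os.path.dirname(pyflyte), "entrypoint.py")
        if os.path.exists(candidate):
            return candidate
    # The 0.26.0-style derivation, e.g.
    # /usr/local/lib/python3.8/dist-packages/flytekit/bin/entrypoint.py
    pkg_dir = os.path.dirname(os.path.abspath(flytekit.__file__))
    return os.path.join(pkg_dir, "bin", "entrypoint.py")

print(find_entrypoint())
```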
This will surely come up again: someone will run `/usr/bin/python3 -m pip install flytekit flytekitplugins-spark` in their container to try Flyte, and that will fail.
I'll use `pyflyte register --in-container-virtualenv-root /usr/local/lib/python3.8/dist-packages/flytekit` as a workaround for now. I think that should get Spark tasks working for me again. Thanks for pointing that out! Lol that Miguel wrote it. I looked through our Git history and we had actually stopped using that flag a while back, but we'll use it again.
It turned out that `--in-container-virtualenv-root` can't be used as a workaround when I'm actually using `/usr/bin/python3.8` to install things in my container. The right workaround for the issue is just `ln -s /usr/local/bin/entrypoint.py /usr/bin/entrypoint.py` in the container. This fixes it; my Spark jobs are working again.
This ^^^^ issue will definitely come up again; allocate some memory to remember this thread when it does!
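For anyone who lands on this thread later, the workaround as a Dockerfile line (a sketch; the rest of the image is assumed):
```dockerfile
# flytekit installed with the system python3.8 puts entrypoint.py in
# /usr/local/bin, but the generated Spark task looks for it next to the
# interpreter in /usr/bin. Symlink it into place.
RUN ln -s /usr/local/bin/entrypoint.py /usr/bin/entrypoint.py
```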
k
@Alex Bain sorry for the trouble.
Can you tell us what you are doing?
Why isn't the venv `/usr/bin`?
m
@Alex Bain I actually think I did the same symlink solution for the Bazel rules. Can't remember where I used the `--in-container-virtualenv-root` stuff off the top of my head.