a
Hi #CP2HDHKE1, We are testing the new flytekit (1.14.6) with the Databricks plugin, and entrypoint.py is failing with this error on the Databricks side:
{"asctime": "2025-02-14 17:05:58,888", "name": "flytekit", "levelname": "ERROR", "message": "Trace:\n\n    Traceback (most recent call last):\n      File \"/databricks/python/lib/python3.10/site-packages/flytekit/bin/entrypoint.py\", line 179, in _dispatch_execute\n        outputs = task_def.dispatch_execute(ctx, idl_input_literals)\n      File \"/databricks/python/lib/python3.10/site-packages/flytekit/core/base_task.py\", line 728, in dispatch_execute\n        new_user_params = self.pre_execute(ctx.user_space_params)\n      File \"/databricks/python/lib/python3.10/site-packages/flytekitplugins/spark/task.py\", line 209, in pre_execute\n        shutil.make_archive(file_name, file_format, os.getcwd())\n      File \"/usr/lib/python3.10/shutil.py\", line 1124, in make_archive\n        filename = func(base_name, base_dir, **kwargs)\n      File \"/usr/lib/python3.10/shutil.py\", line 1009, in _make_zipfile\n        zf.write(path, arcname)\n      File \"/usr/lib/python3.10/zipfile.py\", line 1754, in write\n        zinfo = ZipInfo.from_file(filename, arcname,\n      File \"/usr/lib/python3.10/zipfile.py\", line 523, in from_file\n        zinfo = cls(arcname, date_time)\n      File \"/usr/lib/python3.10/zipfile.py\", line 366, in __init__\n        raise ValueError('ZIP does not support timestamps before 1980')\n    ValueError: ZIP does not support timestamps before 1980\n\nMessage:\n\n    ValueError: ZIP does not support timestamps before 1980"}
{"asctime": "2025-02-14 17:05:58,891", "name": "flytekit", "levelname": "ERROR", "message": "!! End Error Captured by Flyte !!"}
Obviously, passing strict_timestamps=False to the zipfile.ZipFile call would do the trick, but as I understand it, flytekitplugins/spark relies on shutil.make_archive, which still does not support the strict_timestamps param (see this open PR). I have also seen this open Flyte issue: https://github.com/flyteorg/flyte/issues/4711 (that's about removing datetime metadata from files); resolving it would probably solve the problem too. Anyway, all these issues have been open for a while. Do you have any recommendations on how we can use fast registration with flytekit 1.14.6 and Spark?
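For reference, a minimal sketch of what strict_timestamps=False looks like when the archive is built with zipfile directly instead of through shutil.make_archive. This is not what flytekitplugins-spark does; zip_directory is a hypothetical helper, and the only library feature it relies on is the strict_timestamps argument of zipfile.ZipFile (available since Python 3.8).

```python
import os
import tempfile
import zipfile


def zip_directory(src_dir: str, zip_path: str) -> None:
    """Zip every file under src_dir (hypothetical helper, not flytekit code)."""
    # strict_timestamps=False clamps mtimes before 1980 to 1980-01-01
    # instead of raising ValueError.
    with zipfile.ZipFile(
        zip_path, "w", zipfile.ZIP_DEFLATED, strict_timestamps=False
    ) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                full_path = os.path.join(root, name)
                # Store paths relative to src_dir inside the archive.
                zf.write(full_path, os.path.relpath(full_path, src_dir))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as out:
        sample = os.path.join(src, "task.py")
        with open(sample, "w") as fh:
            fh.write("print('hello')\n")
        # Give the file an mtime well before 1980 to reproduce the condition.
        os.utime(sample, (0, 0))
        zip_directory(src, os.path.join(out, "code.zip"))
```

The archive is written outside src_dir so that the walk does not pick up the zip file itself.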
c
@glamorous-carpet-83516, have you seen this happen in the past?
g
could you use flytekit 1.15.0 to register the spark task instead? we add timestamp to the tar file in the 1.15. https://github.com/flyteorg/flytekit/commit/f394bc95b94798e856649a2b07e2d87528ebd2cb#diff-96cd06ed1aa01292b76a1ee83[…]2785efe5898d44699141ee0R72-R74
a
@worried-pager-82302 As long as it is backwards compatible, there should be no problem. We are on flyte 1.14.1, are you aware of any breaking changes that prevent flytekit 1.15.0 from being used with flyte 1.14.1?
@worried-pager-82302 hmm, just checked the release notes and found that the above change was already part of v1.13.6: https://github.com/flyteorg/flytekit/releases/tag/v1.13.6 Am I right? If so, I'm confused. I assume it should also work with flytekit 1.14.6. Am I missing something here?
It seems we found the root cause: we are in the UTC+1 timezone, so the timestamp in the .tar.gz will be 1979-12-31 23:00 UTC. That's why we get the "ZIP does not support timestamps before 1980" error.
It seems to be an edge case for UTC+ timezones. Changing
tar_info.mtime = datetime(1980, 1, 1).timestamp()
to
tar_info.mtime = datetime(1980, 1, 1, tzinfo=timezone.utc).timestamp()
would probably fix the issue.
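To illustrate the difference, here is a small sketch. A naive datetime(1980, 1, 1) is interpreted in the local zone of the machine that runs it, so the example mimics a UTC+1 registration host with an explicit tzinfo rather than changing the process timezone. The names utc_plus_one and as_utc are made up for this example and do not come from flytekit.

```python
from datetime import datetime, timedelta, timezone

# Stand-in for a registration host in UTC+1: a naive datetime(1980, 1, 1)
# on such a host resolves to the same instant as this aware datetime.
utc_plus_one = timezone(timedelta(hours=1))

local_midnight = datetime(1980, 1, 1, tzinfo=utc_plus_one).timestamp()
utc_midnight = datetime(1980, 1, 1, tzinfo=timezone.utc).timestamp()


def as_utc(ts: float) -> datetime:
    """Render an epoch timestamp as an aware UTC datetime."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)


if __name__ == "__main__":
    print(as_utc(local_midnight))  # 1979-12-31 23:00:00+00:00
    print(as_utc(utc_midnight))    # 1980-01-01 00:00:00+00:00
```

The first value falls one hour before the earliest date ZIP can store, which matches the error seen when the files are re-zipped on a host running in UTC.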
@worried-pager-82302 pls let me know what you think
cc @full-toddler-5766
cc @careful-holiday-56196
g
ah, thanks for catching this. we can update it, and backport to 1.14
a
thanks a lot!
Hi @glamorous-carpet-83516, thanks for the quick turnaround. We tested flytekit 1.14.7, and the timestamp issue is resolved now. However, after this fix, entrypoint.py on Databricks randomly gets stuck in an infinite loop. We tried to localize the issue and found that it was introduced in flytekit == 1.13.6; there are no issues in flytekit == 1.13.5. It randomly gets stuck in ZipFile.write calls.
Any idea on that? Let me share the stack trace of flytekit calls.
g
does it work if you use non-fast register?
pyflyte register --non-fast ...
is your workflow code already inside your image?
f
Hi @glamorous-carpet-83516, We analyzed this further and it first breaks with this commit: https://github.com/flyteorg/flytekit/commit/f394bc95b94798e856649a2b07e2d87528ebd2cb
g
flytekit copies the code to the executors in fast-register mode. if you disable fast-register mode, does it work for you?
pyflyte register --non-fast …
a
@glamorous-carpet-83516 it works when we use non-fast register, however our users rely heavily on fast registration to speed up their development workflow. Unless fast register is deprecated, we need to figure out what went wrong here.