Théo LACOUR 02/09/2023, 2:26 PM
When we modify the code of a Spark task, and then use
pyflyte --pkgs path.to.code package --image my_image --force --fast
to package the code and then upload and run the new version of the workflow, we notice that the behavior is the same as before. As I understand it now, this could be because the Spark executors still have the previous code. Does this work as intended? TL;DR: can we use
pyflyte run --remote --image ghcr.io/flyteorg/flytecookbook:k8s_spark-43585feeccabc8a48452dc6838426f3acf4c6a9d pyspark_pi.py my_spark --triggered_date now
to avoid building / pushing Docker images for Spark tasks?
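(For reference, the fast-registration flow under discussion is a two-step sequence; the project, domain and version below are placeholder values, not taken from the thread:

pyflyte --pkgs path.to.code package --image my_image --force --fast
flytectl register files --archive flyte-package.tgz --project my_project --domain development --version v2

The intent is that the Docker image stays fixed and only the code tarball changes between versions.)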
Théo LACOUR 02/10/2023, 9:56 AM
To give more details: I register the package using
flytectl register files --archive flyte-package.tgz etc.
with a correct project, domain and service account. Then I use the console to run the workflow. The behavior I am expecting: the new code packaged in my .tgz file should be used by the Spark executors, since I used the --fast flag. What I observed: the Spark driver code is updated, but the Spark executors code is the one from the image, not the one from the .tgz file. I wonder if this is expected behavior, and if it is, how I can register a workflow with new code without having to re-build a Docker image.
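(For context: in a Flyte Spark task such as the pyspark_pi example above, the task body executes on the Spark driver pod, while functions handed to the SparkContext execute on the executors. A minimal sketch, with illustrative names, types and spark_conf values, assuming the flytekitplugins-spark package is installed:

import random
from operator import add

import flytekit
from flytekit import task, workflow
from flytekitplugins.spark import Spark  # from the flytekitplugins-spark package


def sample(_) -> int:
    # This helper is shipped to the Spark executors. If the executor pods
    # start from a stale image, this is the code that stays stale.
    x = random.random() * 2 - 1
    y = random.random() * 2 - 1
    return 1 if x**2 + y**2 <= 1 else 0


@task(task_config=Spark(spark_conf={"spark.executor.instances": "2"}))
def estimate_pi(triggered_date: str) -> float:
    # The task body runs on the Spark driver pod, which is where the
    # fast-registered code tarball gets downloaded and unpacked.
    n = 1_000_000
    sess = flytekit.current_context().spark_session
    count = sess.sparkContext.parallelize(range(1, n + 1)).map(sample).reduce(add)
    return 4.0 * count / n


@workflow
def my_spark(triggered_date: str) -> float:
    return estimate_pi(triggered_date=triggered_date)

This driver/executor split is consistent with what is observed here: the driver can pick up fast-registered code, while the executors keep whatever is baked into the image.)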
Evan Sadler 02/10/2023, 3:09 PM
Evan Sadler 02/10/2023, 3:17 PM
pyflyte register --destination-dir .
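(Presumably this flag would be used as part of a full command along these lines; project, domain, image and code path are placeholders:

pyflyte register --project my_project --domain development --image my_image --destination-dir . path/to/code

--destination-dir controls where the fast-registered code is extracted inside the container, so it has to match the directory the image expects the code to live in.)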
Théo LACOUR 02/10/2023, 3:51 PM
Evan Sadler 02/10/2023, 3:52 PM
or whatever I had set it to. Good luck!
Théo LACOUR 02/13/2023, 1:17 PM
I used the destination-directory flag in my command flytectl register etc., which has the same effect as
pyflyte register --destination-dir etc.
It did not solve my problem (as my code did actually run in the driver), so I might write a GitHub issue later, unless this works 'as intended' or is a limitation of Spark (i.e. Spark executors should be expected to pull the image and use it 'as is' instead of using the updated code).
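(For reference, the flytectl spelling of this option is --destinationDirectory; a hypothetical full command with placeholder values:

flytectl register files --archive flyte-package.tgz --project my_project --domain development --version v2 --destinationDirectory /root)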
Tyler Su 02/15/2023, 11:11 PM
Théo LACOUR 02/16/2023, 9:23 AM
Yini Gao 04/12/2023, 10:54 AM