# flytekit
f
Hi team, in the flytekit-spark plugin we have to add the Spark config with @task(task_config=Spark(......)) and get the session via sess = flytekit.current_context().spark_session inside a Flyte task. I have 2 questions, in case anyone can help: 1. The spark-submit command expects the PySpark job as a .py file. Can we submit a .py file through the flytekit Spark plugin? 2. How does flytekit-spark work under the hood when no .py file is specified?
t
The flytekit-spark plugin uses the Spark operator under the hood: https://github.com/kubeflow/spark-operator. It "automatically runs `spark-submit` on behalf of users for each `SparkApplication` eligible for submission."
f
The question is which .py file gets submitted. Great question. It is the entrypoint script, which is packaged with flytekit.
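One way to picture why no user .py file is needed: a generic entrypoint can be handed the module and function name and look the task up at run time. The snippet below is only a toy sketch of that idea using the standard library. It is not flytekit's actual entrypoint code, and `entrypoint`, `resolve_task`, `my_workflows`, `count_rows`, and the argument layout are all made-up names for illustration.

```python
import importlib
import sys
import types


def resolve_task(module_name: str, task_name: str):
    """Import a module by name and return the named callable from it."""
    module = importlib.import_module(module_name)
    return getattr(module, task_name)


def entrypoint(argv):
    # Hypothetical argument layout: <task-module> <task-name> [int args...]
    module_name, task_name, *raw_args = argv
    task_fn = resolve_task(module_name, task_name)
    return task_fn(*(int(arg) for arg in raw_args))


# Stand-in for the user's own workflow module (hypothetical name).
user_module = types.ModuleType("my_workflows")


def count_rows(n: int) -> int:
    # Placeholder for real Spark work done inside the task body.
    return len(range(n))


user_module.count_rows = count_rows
sys.modules["my_workflows"] = user_module

# The same generic script can run any task, given only its module and name.
result = entrypoint(["my_workflows", "count_rows", "5"])
print(result)
```

The point of the sketch is that the file handed to spark-submit can stay the same for every task. Only the arguments that name the task change.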
f
This is very useful information. Thank you