full-ram-17934
10/04/2024, 5:56 PM

jolly-florist-66728
10/04/2024, 5:56 PM
Set spark.driver.extraClassPath and spark.executor.extraClassPath in the task's Spark configuration to point to the JARs containing your Scala code. This lets you call into your Scala Spark driver code from your PySpark Python code.
Here is an example of how you can configure a Flyte Spark task to run a Scala Spark job:
import flytekit
from flytekit import task
from flytekitplugins.spark import Spark


@task(
    task_config=Spark(
        spark_conf={
            # Make the Scala JAR visible on both the driver and executor classpaths.
            "spark.driver.extraClassPath": "/path/to/your/jarfile.jar",
            "spark.executor.extraClassPath": "/path/to/your/jarfile.jar",
        }
    ),
)
def spark_task() -> float:
    sess = flytekit.current_context().spark_session
    # Call into the Scala driver through the py4j gateway.
    # com.my.scala.pkg.ScalaDriver is a placeholder; use your own package and object.
    return sess.sparkContext._jvm.com.my.scala.pkg.ScalaDriver.go()
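For reference, here is a minimal sketch of what the Scala side might look like, assuming the placeholder names above (com.my.scala.pkg, ScalaDriver, go) stand in for your own package and object. This is an illustration, not code from the thread: the object reuses the SparkSession the Flyte task already started and returns a Double, which py4j hands back to Python as a float.

package com.my.scala.pkg

import org.apache.spark.sql.SparkSession

// Hypothetical Scala driver invoked from PySpark via the py4j gateway.
object ScalaDriver {
  def go(): Double = {
    // getOrCreate() picks up the session started by the Flyte Spark task
    // instead of creating a second one.
    val spark = SparkSession.builder().getOrCreate()
    // Placeholder computation; py4j returns the Double to Python as a float.
    spark.range(0, 1000).count().toDouble
  }
}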
Would you like more details on this setup?
Sources:
- https://discuss.flyte.org/t/8022179/we-are-trying-to-execute-sparkjobs-written-in-scala-one-stra/
- https://github.com/flyteorg/flyte/pull/718