# flyte-support
Hello team, I'm currently trying to integrate a Databricks Spark Scala job into an existing Flyte workflow. I found a doc that goes into some detail on using the Databricks plugin to run analysis over the MovieLens data using Python. Is there a way for the Databricks plugin to launch a Scala Spark job and pass it some command-line arguments?
I think I can make this work with an instance of `SparkJob` (https://github.com/flyteorg/flytekit/blob/cc3a7a9277852dc06bc1b2c041a962be973d8faf/plugins/flytekit-spark/flytekitplugins/spark/models.py#L19), where `spark_type` is `SCALA`, `application_file` is the path to the JAR file in object storage, and `main_class` is `Main`, plus a Databricks token and instance. What I cannot see is a way to inject command-line args so they get passed to the Databricks API. I could possibly insert them into the `databricks_conf` attribute of `SparkJob`, but I'm not entirely certain how this attribute is used to construct the API call. I also found this open PR and added some questions there: https://github.com/flyteorg/flytekit/pull/767
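For concreteness, here is a minimal sketch of what I have in mind. It assumes the `SparkJob` model in my flytekit version accepts the Databricks fields (`databricks_conf`, `databricks_token`, `databricks_instance`; the exact constructor signature varies across versions), and it guesses that `databricks_conf` is merged into the Databricks Jobs "Runs Submit" request body, in which case `spark_jar_task.parameters` would be where command-line args land. That last part is exactly what I haven't been able to confirm.

```python
# Hedged sketch only: the SparkJob signature varies across flytekit versions,
# and whether databricks_conf is passed verbatim into the Databricks Jobs
# "Runs Submit" request body is the open question above.
from flytekitplugins.spark.models import SparkJob, SparkType

job = SparkJob(
    spark_type=SparkType.SCALA,
    application_file="s3://my-bucket/jars/analysis-assembly.jar",  # hypothetical JAR path
    main_class="Main",
    spark_conf={},
    hadoop_conf={},
    executor_path="",
    # Assumption: if databricks_conf maps onto the Runs Submit payload, then
    # spark_jar_task.parameters is where CLI args would be injected
    # (parameters IS a real field of the Databricks Jobs API's spark_jar_task).
    databricks_conf={
        "run_name": "flyte-scala-movielens",  # hypothetical run name
        "new_cluster": {
            "spark_version": "11.3.x-scala2.12",
            "node_type_id": "i3.xlarge",
            "num_workers": 2,
        },
        "spark_jar_task": {
            "main_class_name": "Main",
            "parameters": ["--input", "s3://my-bucket/movielens/", "--mode", "batch"],
        },
    },
    databricks_token="<REDACTED>",
    databricks_instance="<my-workspace>.cloud.databricks.com",
)
```

If someone can confirm whether the plugin forwards `databricks_conf` to the Runs Submit endpoint like this, that would answer the question.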