If we have an already containerized Spark job that we would like to run as a raw container using ContainerTask(), can we use this task in a workflow and ensure it gets submitted to the Spark Operator to run in cluster mode? cc: @Vasu Upadhya
Ketan (kumare3)
08/25/2022, 11:55 PM
It’s a raw container - you have to handle that yourself
But I don’t completely understand - is this an independent Spark job, not using flytekit?
Katrina P
08/26/2022, 2:39 PM
Yes, it is an independent container running a Spark job; we want to use some existing jobs we have running outside of Flyte in our workflow without having to rewrite them for now (if possible)
Ketan (kumare3)
08/26/2022, 2:43 PM
Spark on k8s needs a specific container image.
Once you have that, you can absolutely use a raw container,
with the right service account.
Katrina P
08/26/2022, 3:47 PM
Yes, the container already runs Spark via the SparkOperator outside of Flyte
We just want to be able to run it as a task within a workflow
Ketan (kumare3)
08/26/2022, 3:47 PM
Ya that should be fine
Katrina P
08/26/2022, 3:48 PM
Would Flyte submit it to the SparkOperator, or would it end up running in single-node mode?
(assuming we use the Spark service account to run the workflow)
Ketan (kumare3)
08/26/2022, 3:51 PM
Ohh, Flyte will not submit to the Spark Operator.
I thought you wanted client mode.
Flyte will only submit if the task type is Spark and certain fields are set - check the flytekit plugin for more details.