rich-monitor-45380
08/02/2022, 3:51 AM
Does "horovod.spark.run to execute the distributed training function" mean that Flyte will launch num_proc Spark workers on num_proc Flyte workers? Or does it just launch them on one Spark worker?
freezing-airport-6809
Flyte workers today. Flyte launches ephemeral Spark clusters using Spark operator or Spark for K8s.
In the Horovod case, it simply uses the Horovod with Spark integration; just make sure the configuration is correct.
rich-monitor-45380
08/02/2022, 5:57 AM
"Flyte launches ephemeral Spark clusters using Spark operator or Spark for K8s."
I see. Thanks!
rich-monitor-45380
08/04/2022, 6:46 AM
"Flyte tries to prevent starting the spark cluster itself, if the task is cached"
Now I understand that the Spark job (launching the Spark cluster using spark-operator and submitting a job to it) is "a single task" in a Flyte workflow.
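
For readers of this thread, here is a rough sketch of how the two points above might fit together: one Flyte task whose ephemeral Spark cluster is sized for num_proc Horovod processes, with caching enabled so a cache hit can skip the cluster entirely. It assumes the flytekit, flytekitplugins-spark, and horovod packages are installed. The names `horovod_spark_conf`, `train_fn`, `horovod_train`, and the value `NUM_PROC = 2` are hypothetical and not from the thread, and the Spark settings shown are one plausible choice rather than a recommended configuration.

```python
import typing

NUM_PROC = 2  # illustrative value, not from the thread


def horovod_spark_conf(num_proc: int, executor_cores: int = 1) -> typing.Dict[str, str]:
    """Hypothetical helper: size the ephemeral Spark cluster for num_proc Horovod processes."""
    if num_proc < 1:
        raise ValueError("num_proc must be at least 1")
    return {
        "spark.executor.instances": str(num_proc),
        "spark.executor.cores": str(executor_cores),
        # With task.cpus equal to executor.cores, each executor runs one Spark
        # task at a time, so each Horovod process should land on its own executor.
        "spark.task.cpus": str(executor_cores),
    }


def train_fn(learning_rate: float) -> int:
    """Runs once per Horovod process, each inside a Spark task."""
    import horovod.torch as hvd  # assumes Horovod was built with PyTorch support

    hvd.init()
    # Model construction and training with learning_rate would go here.
    return hvd.rank()


try:
    from flytekit import task
    from flytekitplugins.spark import Spark
except ImportError:
    # Guard so the helper above can be tried without the Flyte packages installed.
    task = None

if task is not None:

    @task(
        task_config=Spark(spark_conf=horovod_spark_conf(NUM_PROC)),
        cache=True,  # on a cache hit the Spark cluster should not be started
        cache_version="1.0",
    )
    def horovod_train(learning_rate: float) -> typing.List[int]:
        # The task body acts as the Spark driver; horovod.spark.run picks up
        # the active Spark context and returns one result per rank.
        import horovod.spark

        return horovod.spark.run(train_fn, args=(learning_rate,), num_proc=NUM_PROC)
```

From Flyte's point of view, the whole of `horovod_train` is a single task; the num_proc training processes exist only as Spark tasks inside the cluster that this one task launches.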