[Thread messages from 10/16/2023 through 11/07/2023 between Georgi Ivanov, Ketan (kumare3), L godlike, and Kevin Su were not captured in this archive excerpt.]

Kevin Su — 11/07/2023, 10:28 PM
pyflyte run spark.py wf  # Run Spark in a local Python process.
pyflyte run --raw-output-data-prefix s3://spark/output spark.py wf  # We serialize the input, upload it to S3, and trigger the Databricks job.

The reason to do it that way is that it is easy to develop and test the agent locally. In addition, you are able to run a Databricks task without a Flyte cluster.
My question is: should we use a different flag, like --agent or --hybrid? Any feedback is appreciated.
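(Editor's note: for context, a minimal sketch of what the spark.py referenced in those commands might look like; the task body and names are assumptions, not taken from the thread.)

import flytekit
from flytekit import task, workflow
from flytekitplugins.spark import Spark

@task(
    task_config=Spark(
        spark_conf={"spark.driver.memory": "1g"},
    ),
)
def count_partitions(n: int) -> int:
    # flytekit exposes the Spark session on the task's execution context.
    sess = flytekit.current_context().spark_session
    return sess.sparkContext.parallelize(list(range(n))).getNumPartitions()

@workflow
def wf(n: int = 10) -> int:
    return count_partitions(n=n)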
Frank Shen — 11/13/2023, 7:37 PM
task_config=Spark(
    spark_conf={
        ....
        # The following is needed only when running the Spark task on a dev's local PC.
        # Also need to do this locally: export SPARK_LOCAL_IP="127.0.0.1"
        "spark.hadoop.fs.s3a.access.key": "aaa",
        "spark.hadoop.fs.s3a.secret.key": "bbb",
        "spark.hadoop.fs.s3a.session.token": "ccc",
    },
),
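(Editor's note: a hedged sketch, not from the thread — one way to avoid hardcoding the placeholder creds above is to build the conf from the standard AWS environment variables, so the same task definition works locally and on the cluster.)

import os
from flytekitplugins.spark import Spark

# Hypothetical helper, not from the thread: only inject the S3A settings that are
# actually present in the local environment, so nothing is hardcoded in source.
local_s3a_conf = {
    conf_key: value
    for conf_key, env_var in {
        "spark.hadoop.fs.s3a.access.key": "AWS_ACCESS_KEY_ID",
        "spark.hadoop.fs.s3a.secret.key": "AWS_SECRET_ACCESS_KEY",
        "spark.hadoop.fs.s3a.session.token": "AWS_SESSION_TOKEN",
    }.items()
    if (value := os.environ.get(env_var))
}

task_config = Spark(spark_conf={"spark.driver.memory": "1g", **local_s3a_conf})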
Kevin Su — 11/13/2023, 7:39 PM
> That would be very nice to have, especially when we can do this locally for Spark tasks that run on the k8s Spark operator, by passing the AWS S3 creds locally like this.
So you want to run a Spark workflow locally, but run the Spark job on k8s?
[Replies from Frank Shen (7:41 PM), Kevin Su (7:43 PM), and Frank Shen (7:44 PM) were not captured in this archive excerpt.]

Kevin Su — 11/13/2023, 7:46 PM
pyflyte run spark.py wf  # Run Spark in a local Python process.
pyflyte run --raw-output-data-prefix s3://spark/output spark.py wf  # We serialize the input, upload it to S3, and submit the Databricks job.

Is this command confusing to you, or should we add a new flag, like pyflyte run --hybrid, --agent, or something else?
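(Editor's note: for concreteness, the proposed invocations would look like this; neither flag existed at the time of the thread, this is only the syntax under discussion.)

pyflyte run --agent spark.py wf   # proposed: explicitly route the task through the agent
pyflyte run --hybrid spark.py wf  # proposed: same behavior as the --raw-output-data-prefix form above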
[Subsequent replies from Ketan (kumare3) and Georgi Ivanov (11/15/2023, 2:34 PM) were not captured in this archive excerpt.]