better-toddler-72553
04/07/2022, 9:39 AM
great-school-54368
04/07/2022, 10:11 AM
great-school-54368
04/07/2022, 10:11 AM
better-toddler-72553
04/07/2022, 12:29 PM
@task
def get_file(url) -> FlyteFile:
    return FlyteFile(url)

@task(task_config=Spark(...))
def spark_task(file: FlyteFile):
    spark = flytekit.current_session().spark_session
    df = spark.read.text(file)

@workflow
def pipeline(url="<gs://xxxx>"):
    file = get_file(url)
    spark_task(file)

Is this the right way to do it?
great-school-54368
04/07/2022, 12:34 PM
great-school-54368
04/07/2022, 12:35 PM
better-toddler-72553
04/07/2022, 12:37 PM
great-school-54368
04/07/2022, 12:37 PM
better-toddler-72553
04/07/2022, 12:38 PM
better-toddler-72553
04/07/2022, 1:07 PM
'FlyteFile' object has no attribute '_get_object_id'
great-school-54368
04/07/2022, 1:20 PM
file.download(), @tall-lock-23197 can you confirm the logic?
tall-lock-23197
file.download() is required.
better-toddler-72553
04/07/2022, 1:46 PM
file.download() should be fed into spark.read.text(..), right?
great-school-54368
04/07/2022, 1:54 PM
freezing-airport-6809
better-toddler-72553
04/07/2022, 2:01 PM
freezing-airport-6809
better-toddler-72553
04/07/2022, 2:08 PM
freezing-airport-6809
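For reference, the fix discussed in the thread (call file.download() and hand the resulting path to spark.read.text) could be sketched roughly as below. The error above is consistent with PySpark receiving a FlyteFile object where it expects a plain string path. The helper name `to_spark_path` is hypothetical and not part of flytekit; it is duck-typed so the snippet runs without flytekit or Spark installed, and the commented usage assumes the Spark task from the original message.

```python
import os
from typing import Any


def to_spark_path(file: Any) -> str:
    """Return a plain string path that spark.read.text() can accept.

    Hypothetical helper: `file` may be a str, or any object exposing a
    download() method that returns a local path (as FlyteFile does).
    """
    if isinstance(file, str):
        # Already a path or URI string; pass it through unchanged.
        return file
    download = getattr(file, "download", None)
    if callable(download):
        # For a FlyteFile, download() fetches the data and returns the local path.
        return str(download())
    # Fall back to the os.PathLike protocol for other path-like objects.
    return os.fspath(file)


# Assumed usage inside the Spark task from the thread (not executed here):
#
#   @task(task_config=Spark(...))
#   def spark_task(file: FlyteFile):
#       spark = flytekit.current_context().spark_session
#       df = spark.read.text(to_spark_path(file))
```

Calling file.download() directly inside the task, as suggested in the thread, is equivalent; the helper only makes the str conversion explicit. Note that a path downloaded on the driver may not be visible to executors on a multi-node cluster, so this sketch is most clearly applicable to local or single-node runs.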