cool-umbrella-29339
09/05/2022, 6:59 AM
victorious-park-53030
09/05/2022, 7:20 AM
cool-umbrella-29339
09/05/2022, 7:31 AM
victorious-park-53030
09/05/2022, 7:37 AM
glamorous-carpet-83516
09/05/2022, 8:16 AM
from flytekit import task, Resources

@task(
    requests=Resources(
        gpu=gpu, mem=mem, storage=storage, ephemeral_storage=ephemeral_storage
    ),
)
def pytorch_mnist_task():
    ...
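As a rough sketch (the values below are illustrative, not from this thread), the resource fields take plain strings:

from flytekit import task, Resources

# Illustrative values only; pick limits that match your cluster.
@task(requests=Resources(gpu="1", mem="4Gi", storage="10Gi", ephemeral_storage="10Gi"))
def pytorch_mnist_task():
    ...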
We have a torch.Tensor
transformer in flytekit, which means you can directly return a torch.Tensor
from your task. Flyte saves this data to S3 for you, and the downstream task can fetch it from the bucket.
import torch
from flytekit import task, workflow

@task
def t1() -> torch.Tensor:
    return torch.tensor([[1.0, -1.0, 2], [1.0, -1.0, 9], [0, 7.0, 3]])

@task
def t2(df: torch.Tensor):
    ...

@workflow
def wf():
    df = t1()
    t2(df=df)
cool-umbrella-29339
09/05/2022, 8:20 AM
glamorous-carpet-83516
09/05/2022, 8:20 AM
cool-umbrella-29339
09/05/2022, 8:21 AM
glamorous-carpet-83516
09/05/2022, 8:25 AM
tall-lock-23197
cool-umbrella-29339
09/05/2022, 8:35 AM
tall-lock-23197
> Great to hear about the parallelization, so there is a dispatcher that determines what receives what resources etc.?
Yes! Refer to https://docs.flyte.org/en/latest/deployment/cluster_config/performance.html#worst-case-workflows-poison-pills-max-parallelism for an in-depth view of the workers that run the Flyte code.
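As a minimal sketch (not from the thread), parallelism can be capped on a launch plan; this assumes the max_parallelism argument of LaunchPlan.get_or_create and a workflow named wf:

from flytekit import LaunchPlan

# Hypothetical example: limit how many nodes of `wf` may run concurrently.
# The name "capped_wf" and the value 4 are illustrative.
capped_lp = LaunchPlan.get_or_create(
    workflow=wf,
    name="capped_wf",
    max_parallelism=4,
)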
> I don’t want copies because all I need is a pipeline to connect units that already run on GPU and the copies are just overhead.
Copying is inevitable because the data is uploaded and downloaded whenever tasks communicate. Without a copy, you wouldn’t be able to access the tensor in the other task that takes it as an input.
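A minimal sketch of the implication (my wording; stage_a, stage_b, and the resource value are hypothetical): copies only happen at task boundaries, so GPU stages kept inside a single task never round-trip through the bucket.

import torch
from flytekit import task, Resources

def stage_a(x: torch.Tensor) -> torch.Tensor:
    # placeholder GPU stage
    return x * 2

def stage_b(x: torch.Tensor) -> torch.Tensor:
    # placeholder GPU stage
    return x + 1

@task(requests=Resources(gpu="1"))
def fused_gpu_pipeline(x: torch.Tensor) -> torch.Tensor:
    # Both stages run inside one task, so the intermediate tensor stays in
    # device memory and is never serialized to blob storage between stages.
    x = x.cuda()
    return stage_b(stage_a(x)).cpu()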
cool-umbrella-29339
09/05/2022, 9:48 AM
freezing-airport-6809
freezing-airport-6809