straight-laptop-71325

05/20/2022, 6:26 PM

I am looking for a way to pass Numpy Arrays (ndarray) and PyTorch/Tensorflow Tensors as Flyte Task input/output. I haven’t come across any example yet. I’m aware of the native support for Dataframes. It seems inefficient to convert ndarray/Tensors back and forth using Dataframes. How are folks handling this?

acceptable-policeman-57188

05/20/2022, 7:48 PM

cc @broad-monitor-993 @high-accountant-32689

broad-monitor-993

05/20/2022, 7:57 PM

unfortunately the flytekit TypeEngine doesn’t have native support for numpy arrays or pytorch/tensorflow tensors… would you mind opening up an issue for that @straight-laptop-71325? Currently there are 3 paths to doing this:
1. passing dataframes around (as you’ve suggested)
2. passing List[int] / List[float] and reconstituting your arrays/tensors at the beginning of the next task
3. using a np.ndarray / torch.Tensor annotation purely for human-readability. Under the hood this will pickle your array/tensor and unpickle it on the other side.
(3) is convenient, but you run the risk of deserialization issues if you happen to use different versions of python/numpy/pytorch/tensorflow across your tasks that are not cross-compatible. (2) is really for smaller data use cases since these are stored as FlyteIDL literals. (1) is nice because flyte understands this and stores dataframes as parquet files, which is a more efficient/reliable storage format than pickle.
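Path (2) can be sketched roughly as follows. This is a minimal illustration, not an official pattern: the helper and task names (array_to_lists, lists_to_array, make_features, total) are hypothetical, the shape is passed as a second list so the array can be rebuilt, and the try/except fallback is only an assumption so the snippet also runs as plain Python where flytekit is not installed.

```python
from typing import List, Tuple

import numpy as np

try:
    from flytekit import task
except ImportError:
    # Assumption for illustration: without flytekit, use a no-op decorator
    # so the functions below behave like ordinary Python functions.
    def task(fn):
        return fn


def array_to_lists(arr: np.ndarray) -> Tuple[List[float], List[int]]:
    """Flatten an array into plain Python lists (values plus shape)."""
    return arr.astype(float).ravel().tolist(), list(arr.shape)


def lists_to_array(values: List[float], shape: List[int]) -> np.ndarray:
    """Rebuild the array from the flat values and the recorded shape."""
    return np.asarray(values, dtype=np.float64).reshape(shape)


@task
def make_features() -> Tuple[List[float], List[int]]:
    # Hypothetical upstream task: produce an array, hand it on as lists.
    arr = np.arange(6, dtype=np.float64).reshape(2, 3)
    return array_to_lists(arr)


@task
def total(values: List[float], shape: List[int]) -> float:
    # Hypothetical downstream task: reconstitute the array first.
    arr = lists_to_array(values, shape)
    return float(arr.sum())
```

Note that this round trip coerces everything to float64, so the original dtype is lost unless it is passed along as well. As mentioned above, it only suits small arrays.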

👍 1

straight-laptop-71325

05/20/2022, 8:11 PM

Thanks @broad-monitor-993. This is helpful! I’ll open an issue for this.

freezing-airport-6809

05/21/2022, 4:51 PM

Also @straight-laptop-71325 our goal is to add support for automatic marshal/unmarshal of tf.tensor - this can be added using type engine plugins. Just not done yet. Contributions welcome. Docs: https://docs.flyte.org/projects/cookbook/en/latest/auto/core/extend_flyte/custom_types.html#sphx-glr-auto-core-extend-flyte-custom-types-py
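The core of any such plugin is a marshal/unmarshal pair. Here is a rough sketch of that pair for numpy arrays only, assuming the .npy format with pickling disabled as the on-disk encoding. The function names are hypothetical, and wiring them into a flytekit type engine plugin is deliberately left out; see the linked docs for that part.

```python
import io

import numpy as np


def marshal_ndarray(arr: np.ndarray) -> bytes:
    """Serialize an array to .npy bytes without falling back to pickle."""
    buf = io.BytesIO()
    np.save(buf, arr, allow_pickle=False)
    return buf.getvalue()


def unmarshal_ndarray(data: bytes) -> np.ndarray:
    """Rebuild the array (dtype and shape included) from .npy bytes."""
    return np.load(io.BytesIO(data), allow_pickle=False)
```

Unlike the List[float] route, the .npy header records dtype and shape, and with allow_pickle=False the data never goes through Python's pickle.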

👍 2

tall-lock-23197

05/25/2022, 12:53 PM

https://github.com/flyteorg/flyte/issues/2544

👍 2
