# ask-the-community

cryptic

10/18/2022, 3:46 AM
This is probably a silly question, but I'm looking to serialize a tf.Tensor. tf.io.serialize_tensor returns a string; should we store it in a tf.train.Feature, put that into a tf.train.Example, and then use the SerializeToString() method to serialize the proto? Is there a need to store it on disk? I'm not sure whether storing on disk is possible.
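The round trip asked about here can be sketched as follows. This is only a minimal illustration of the TensorFlow calls named in the question, not how Flyte handles tensors; the feature key "tensor" and the sample values are assumptions made for the example.

```python
import tensorflow as tf

tensor = tf.constant([[1.0, 2.0], [3.0, 4.0]])

# serialize_tensor returns a scalar tf.string tensor; .numpy() yields raw bytes.
serialized = tf.io.serialize_tensor(tensor)

# Wrap the bytes in a Feature, then in an Example (the key "tensor" is arbitrary).
feature = tf.train.Feature(bytes_list=tf.train.BytesList(value=[serialized.numpy()]))
example = tf.train.Example(features=tf.train.Features(feature={"tensor": feature}))

# SerializeToString() turns the proto into bytes, still entirely in memory.
payload = example.SerializeToString()

# Reverse the steps to recover the tensor.
parsed = tf.train.Example.FromString(payload)
raw = parsed.features.feature["tensor"].bytes_list.value[0]
restored = tf.io.parse_tensor(raw, out_type=tf.float32)
```

Nothing in this sketch touches the disk: both serialization steps produce in-memory bytes, and writing them to a file is a separate decision.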

Ketan (kumare3)

10/18/2022, 4:04 AM
@cryptic what do you mean by storing it to disk? Do you want to pass a tf.Tensor between tasks? If so, you can follow the same pattern as PyTorchTensorTransformer. Transformers encode the best methods of serializing and deserializing data into reusable patterns, so that data can be passed efficiently between tasks.

cryptic

10/18/2022, 4:22 AM
What exactly is the PyTorchTypeTransformer doing? In the PyTorchTypeTransformer there is a step where we save the tensor/module locally. @Ketan (kumare3)

Samhita Alla

10/18/2022, 4:40 AM
The transformer enables using torch.Tensor as a Flyte type. It serializes and deserializes the tensor as data passes between Flyte tasks.

cryptic

10/18/2022, 12:28 PM
@Samhita Alla I mean to ask: in a similar way to the PyTorch example above, when serializing a tf.Tensor, will I have to store the tensor on disk (using tf.io.write_file), or can we directly pass the binary string obtained by calling tf.io.serialize_tensor on the tensor?
Or perhaps the two functions are doing different things.
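As a rough sketch of how the two functions relate: tf.io.serialize_tensor produces the bytes, while tf.io.write_file only persists bytes to a path, so they are complementary steps rather than alternatives. The temporary directory and the file name tensor.bin are assumptions made for the example.

```python
import os
import tempfile

import tensorflow as tf

tensor = tf.constant([1, 2, 3], dtype=tf.int32)

# Step 1: encode the tensor as a scalar tf.string tensor holding its bytes.
serialized = tf.io.serialize_tensor(tensor)

# Step 2 (optional): persist those bytes. The file name is hypothetical.
path = os.path.join(tempfile.mkdtemp(), "tensor.bin")
tf.io.write_file(path, serialized)

# Reading back reverses both steps: read the bytes, then decode the tensor.
restored = tf.io.parse_tensor(tf.io.read_file(path), out_type=tf.int32)
```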

Ketan (kumare3)

10/18/2022, 1:17 PM
So if we are to send the tensor between tasks, then yes, you have to store it. I think an action item is to stream to S3.

Ketan (kumare3)

10/18/2022, 1:33 PM
Yes, but this has to be small, less than 1 MB.

Samhita Alla

10/18/2022, 1:35 PM
Oh, then storing the data in a file is a better choice.
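One way to read the exchange above is: small payloads may be passed directly, and anything larger goes through a file. The sketch below illustrates that decision in plain Python. It is not Flyte's mechanism; the helper names pack and unpack, the dictionary layout, and the reading of "1MB" as 1024 * 1024 bytes are all assumptions.

```python
import os
import tempfile

# Assumed limit, taken from the "less than 1MB" remark above.
INLINE_LIMIT_BYTES = 1024 * 1024


def pack(payload: bytes, directory: str) -> dict:
    """Return the payload inline if it is small, else write it to a file."""
    if len(payload) < INLINE_LIMIT_BYTES:
        return {"kind": "inline", "data": payload}
    fd, path = tempfile.mkstemp(dir=directory, suffix=".bin")
    with os.fdopen(fd, "wb") as f:
        f.write(payload)
    return {"kind": "file", "path": path}


def unpack(ref: dict) -> bytes:
    """Recover the payload from either an inline or a file reference."""
    if ref["kind"] == "inline":
        return ref["data"]
    with open(ref["path"], "rb") as f:
        return f.read()
```

A receiving task would call unpack on whatever reference it was handed, so the caller never needs to know which path was taken.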

Ryan Nazareth

10/18/2022, 2:09 PM
@cryptic I have created a PR, https://github.com/flyteorg/flytekit/pull/1240, for a TensorflowExampleTransformer to pass tf.train.Example between tasks (this would automatically serialise and deserialise from a TFRecord file), but your data would need to be in tf.train.Example message format. I believe there is another issue, with work being done, on supporting passing tf.Tensor between tasks.

cryptic

10/18/2022, 2:20 PM
@Ryan Nazareth I believe tf.train.Example is exclusively for TFRecords, as it requires a dictionary-like structure.
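The dictionary-like structure mentioned here can be illustrated with a small TFRecord round trip. This is a sketch using standard TensorFlow calls, not the transformer from the PR; the feature keys "label" and "scores", their values, and the file name data.tfrecord are made up for the example.

```python
import os
import tempfile

import tensorflow as tf

# An Example is a mapping from string keys to typed Feature values.
example = tf.train.Example(
    features=tf.train.Features(
        feature={
            "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[1])),
            "scores": tf.train.Feature(float_list=tf.train.FloatList(value=[0.5, 0.25])),
        }
    )
)

# Write one serialized Example to a TFRecord file (hypothetical file name).
path = os.path.join(tempfile.mkdtemp(), "data.tfrecord")
with tf.io.TFRecordWriter(path) as writer:
    writer.write(example.SerializeToString())

# Read the file back and parse each record into an Example proto.
records = [tf.train.Example.FromString(raw.numpy()) for raw in tf.data.TFRecordDataset(path)]
label = records[0].features.feature["label"].int64_list.value[0]
scores = list(records[0].features.feature["scores"].float_list.value)
```

A bare tensor has no keys of its own, so it would have to be wrapped as a bytes feature under some chosen key to fit this shape.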