#ask-the-community

andrewm4894

03/12/2024, 4:26 PM
When we pass a df to and from Flyte tasks, is there any native way to ask Flyte to do sampling as it's reading and writing the data? E.g. I sometimes want to randomly sample N rows going into or out of a task, and I'm wondering if Flyte could do this natively in some way.
Use case: say task A makes a big df, and tasks B1 and B2 both need it as input. B1 needs all of it, but B2 only needs a random sample of N rows from the df.

Eduardo Apolinario (eapolinario)

03/13/2024, 1:35 AM
@andrewm4894, we don't support this today. That said, how should the sampling be expressed? Let's take your example, how would sampling of the upstream df be defined?

andrewm4894

03/13/2024, 6:24 AM
I'm not 100% sure tbh, maybe I could pass the sampling param to the task decorator. Maybe it's a horrible idea idk. Just that I find myself having some tasks where first thing I do is quickly downsample the data right away.
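
Since the thread confirms Flyte has no native sampling, here is a rough sketch of what "a sampling param on a decorator" could look like in user code. `sample_inputs` is a hypothetical helper, not a Flyte or flytekit API, and `b1`/`b2` are plain functions standing in for the tasks from the example. It assumes the inputs arrive as pandas DataFrames.

```python
import functools

import pandas as pd


def sample_inputs(n: int, seed: int = 0):
    """Hypothetical helper (not part of Flyte): downsample DataFrame arguments."""

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            def maybe_sample(value):
                # Only touch DataFrames that have more than n rows.
                if isinstance(value, pd.DataFrame) and len(value) > n:
                    return value.sample(n=n, random_state=seed)
                return value

            args = tuple(maybe_sample(a) for a in args)
            kwargs = {k: maybe_sample(v) for k, v in kwargs.items()}
            return fn(*args, **kwargs)

        return wrapper

    return decorator


@sample_inputs(n=100, seed=42)
def b2(df: pd.DataFrame) -> int:
    # Stand-in for task B2: sees at most 100 randomly chosen rows.
    return len(df)


def b1(df: pd.DataFrame) -> int:
    # Stand-in for task B1: sees the full frame.
    return len(df)


big_df = pd.DataFrame({"x": range(10_000)})
```

Note that this only removes the "downsample first thing" line from the function body. The full df is still read before it is sampled, so it does not save any I/O the way native support could.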

Eduardo Apolinario (eapolinario)

03/13/2024, 8:10 PM
@andrewm4894, can you capture that in a GitHub issue? [flyte-core]

andrewm4894

03/13/2024, 8:35 PM