# flytekit
r
Just noticed some interesting behavior in GitHub Actions when executing a launch plan during CI which had objects that used `PickleTransformer` passed as args to the LaunchPlan 🧵
Is there a particular reason `S3Persistence` is implemented using subprocess calls to awscli vs using boto3 to directly interact with S3? It appears that the version of awscli in our GitHub Actions runner image expects Python 3.8, but Python 3.9 is the only available distribution in the environment. Seems like this could be sidestepped entirely by avoiding the subprocess calls?
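One way to confirm this kind of mismatch is to check which interpreter the `aws` entry point asks for. This is a minimal sketch, assuming a pip-installed awscli whose `aws` entry point is a Python script with a shebang line; `read_shebang` and `cli_interpreter` are hypothetical helper names, not part of flytekit or awscli.

```python
import shutil
import sys
from typing import Optional


def read_shebang(path: str) -> Optional[str]:
    """Return the interpreter named on a script's shebang line, or None."""
    with open(path, "rb") as f:
        first_line = f.readline().decode("utf-8", errors="replace").strip()
    if not first_line.startswith("#!"):
        # Not a script with a shebang (for example, a compiled binary).
        return None
    return first_line[2:].strip()


def cli_interpreter(cli_name: str = "aws") -> Optional[str]:
    """Find a CLI on PATH and report the interpreter its shebang requests."""
    cli_path = shutil.which(cli_name)
    if cli_path is None:
        return None
    return read_shebang(cli_path)


if __name__ == "__main__":
    # Compare the interpreter running this process with the one the CLI wants.
    print("current interpreter:", sys.executable)
    print("aws CLI interpreter:", cli_interpreter("aws"))
```

If the two paths point at different Python versions and the second one is missing from the runner image, any subprocess call to `aws` would fail before reaching S3. A `None` result means the CLI was not found on PATH or has no shebang line.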
s
cc: @Ketan (kumare3)
k
Yes, this is for legacy reasons. Earlier we used subprocess, but we now actively recommend fsspec - https://docs.flyte.org/projects/flytekit/en/latest/plugins/fsspec.html
You can install it using the data persistence plugins, and we are thinking of making it the default, though doing that without breaking folks will be hard - cc @Eduardo Apolinario (eapolinario) / @Yee
@Samhita Alla can you point to the docs on how to install `flytekitplugins-data-fsspec[aws]`?
r
Got it, that makes sense. It’d be nice if it were the default, but I understand not wanting to break workflows. Maybe something for the flytekit 2.x milestone?
k
Hmm, we are actually thinking we can sidestep that - Flyte 1.3?
Would you be open to helping?
Let’s bring this up in a town hall.
r
Sure, happy to chat more about this
k
We need help with benchmarking too.
We are actually thinking of writing a faster layer.
s
`flytekitplugins-data-fsspec[aws]` requires the `flytekitplugins-data-fsspec` and `s3fs>=2021.7.0` libraries.
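To check whether a CI environment already satisfies those two requirements, something like the following could work. It is a rough sketch using only the standard library; `installed_version`, `version_tuple`, and `meets_minimum` are hypothetical helpers, and the version comparison is a simplification that only looks at the leading numeric parts (the `packaging` library would be more robust).

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version: str) -> Tuple[int, ...]:
    """Turn '2021.7.0' into (2021, 7, 0); stop at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(version: str, minimum: str) -> bool:
    """Simplified check that `version` is at least `minimum`."""
    return version_tuple(version) >= version_tuple(minimum)


if __name__ == "__main__":
    plugin = installed_version("flytekitplugins-data-fsspec")
    s3fs = installed_version("s3fs")
    print("flytekitplugins-data-fsspec:", plugin or "not installed")
    if s3fs is None:
        print("s3fs: not installed")
    else:
        print("s3fs:", s3fs, "(ok)" if meets_minimum(s3fs, "2021.7.0") else "(too old)")
```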
r
Thanks very much @Samhita Alla, will give that a try