# ask-the-community
v
Sample Code with S3 Hello all, I have been trying to run a sample integration with Amazon S3 using FlyteFile but have not been successful so far. Goal: use a public remote URL to fetch a CSV file, read it as a dataframe, and then upload it to an AWS S3 bucket as a CSV file. Please find my code in the attached file. Here is the command I am running:
pyflyte run simple_s3.py read_and_modify_df --url "https://people.sc.fsu.edu/~jburkardt/data/csv/biostats.csv"
And the error I am seeing is this:
Failed with Unknown Exception <class 'TypeError'> Reason: Encountered error while executing workflow 'simple_s3.read_and_modify_df':
  Error encountered while executing 'read_and_modify_df':
  Failed to convert outputs of task 'simple_s3.upload_df' at position 0:
  Failed to put data from /tmp/flytei2h5ttap/user_spaced6y797dn/my_df.csv to https://MY-BUCKET.s3.us-east-2.amazonaws.com/my_df.csv (recursive=False).

Original exception: 403, message='Forbidden', url=URL('https://MY-BUCKET.s3.us-east-2.amazonaws.com/my_df.csv')
Encountered error while executing workflow 'simple_s3.read_and_modify_df':
  Error encountered while executing 'read_and_modify_df':
  Failed to convert outputs of task 'simple_s3.upload_df' at position 0:
  Failed to put data from /tmp/flytei2h5ttap/user_spaced6y797dn/my_df.csv to https://MY-BUCKET.s3.us-east-2.amazonaws.com/my_df.csv (recursive=False).

Original exception: 403, message='Forbidden', url=URL('https://MY-BUCKET.s3.us-east-2.amazonaws.com/my_df.csv')
j
Is there a reason why you are not using the S3 URI format (s3://bucket/..)? Looks like a permission error. Are you able to upload from your local terminal via the AWS CLI, e.g. with aws s3 cp?
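For illustration, the traceback above shows the upload targeting an https:// URL rather than an s3:// URI. A minimal sketch of the conversion, assuming the bucket URL is in virtual-hosted style as in the traceback; https_to_s3_uri is a hypothetical helper written for this example and is not part of flytekit:

```python
from urllib.parse import urlparse


def https_to_s3_uri(url: str) -> str:
    """Convert a virtual-hosted-style S3 HTTPS URL to an s3:// URI.

    Hypothetical helper for illustration only; not a flytekit API.
    """
    parsed = urlparse(url)
    # Use netloc (not hostname) so the bucket name keeps its original case.
    host = parsed.netloc
    marker = ".s3."
    # Expected shapes: <bucket>.s3.<region>.amazonaws.com or <bucket>.s3.amazonaws.com
    if not host.endswith(".amazonaws.com") or marker not in host:
        raise ValueError(f"not a virtual-hosted-style S3 URL: {url}")
    # Split on the last ".s3." so bucket names containing dots survive.
    bucket = host.rsplit(marker, 1)[0]
    key = parsed.path.lstrip("/")
    return f"s3://{bucket}/{key}"


print(https_to_s3_uri("https://MY-BUCKET.s3.us-east-2.amazonaws.com/my_df.csv"))
# prints: s3://MY-BUCKET/my_df.csv
```

The resulting URI is the form that could then be used as the remote destination for the file. Whether the upload succeeds still depends on the credentials available to the process doing the upload.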
v
No particular reason, but I think that will give an error as well. Yes, I can do it with the AWS CLI; the credentials work. I have been trying to understand how to pass the service account to the Flyte backend so that FlyteFile actually uses it for the remote upload.
If I use boto3 in Flyte tasks (running locally with credentials set via os.environ), it succeeds, but that was not the goal of my exercise.
j
Yeah, for running it remotely you need to annotate the service account you are using with an IAM role. Have you had a look here already? https://github.com/davidmirror-ops/flyte-the-hard-way
v
Not yet, it's on the to-do list.