I've checked that the account I configured in the ...
# flyte-deployment
s
I've checked that the account I configured in the "storage:" section of the configuration file has read/write permissions in the S3 bucket, so I'm not sure why this is happening. Could this be related to the default service account used to run the workflow?
a
hey @shy-morning-17240 I don't think so because it works for other tasks
s
@average-finland-92144 Do you know where in the Flyte-Core helm config file I would configure the storage authentication info that the Flyte workflow pod uses to write the outputs of the workflow? I saw that Flyte the Hard Way does this by setting "FLYTE_AWS_ACCESS_KEY_ID" and "FLYTE_AWS_SECRET_ACCESS_KEY", but this didn't work out for me (not sure if you set a different environment variable when using "custom/stow" type storage).
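One way to narrow this down is to check which credential variables actually reach the task pod. The following is a minimal sketch, not a Flyte API: the `FLYTE_AWS_*` names come from the message above, the `AWS_*` names are the standard AWS SDK variables, and `credential_env_report` is a hypothetical helper. Which of these a "custom/stow" storage setup reads is an assumption to verify, not something confirmed in this thread.

```python
import os

# The FLYTE_AWS_* pair is the one mentioned in the thread; the AWS_* pair is
# the standard AWS SDK convention. Which pair the storage layer reads in a
# given deployment is an assumption to verify.
CANDIDATE_VARS = (
    "FLYTE_AWS_ACCESS_KEY_ID",
    "FLYTE_AWS_SECRET_ACCESS_KEY",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
)


def credential_env_report(env=None):
    """Map each candidate variable to 'set' or 'missing'.

    Values are never returned, so the report is safe to paste into a thread.
    """
    env = os.environ if env is None else env
    return {
        name: ("set" if env.get(name) else "missing")
        for name in CANDIDATE_VARS
    }


if __name__ == "__main__":
    for name, status in credential_env_report().items():
        print(f"{name}: {status}")
```

Calling `credential_env_report()` at the start of a task and logging the result would show whether the helm values are propagated to the pod at all.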
a
I think you can build the section based on how we do it for Azure (ref) plus what's expected by the client (ref). I hope it helps, and let us know whatever you find.
s
@average-finland-92144 Thanks, things line up pretty well with those scripts. After some further debugging, it seems that simpler task outputs (Python primitives such as lists and dictionaries) are written to the S3 bucket successfully. I only get this weird access denied error when I try to return a FlyteFile from a task (in this case a PyTorch checkpoint stored in the task's local storage). I figured maybe the file was being created with overly restrictive rwx permissions, so I changed the file permissions to 777 before trying to return it again, but still got the same error. I then wondered if the issue was how torch.save() was creating the file, so I created a plain-text file and returned that instead. To my surprise, this worked: my task completed successfully and I was able to find the file in my S3 storage. Do you have any idea why this could be happening with torch.save() (maybe something to do with changes to how torch.save() works vs. what Flyte can support)? I've copied examples like this one that return a PyTorch checkpoint from a task, using torch.save() to create the checkpoint, but I always get the permission denied error. Also, let me know if I should move this question to another channel, as this is definitely no longer a deployment issue.
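To make the comparison between the file that uploads and the file that fails concrete, a small stdlib-only sketch like the one below could record the size, permission bits, and a rough text/binary hint for each file before it is returned. `describe_output_file` is a hypothetical helper, and the binary file written here is only a stand-in for a real torch.save() checkpoint, so this illustrates the comparison and does not reproduce the error.

```python
import os
import stat
import tempfile


def describe_output_file(path):
    """Collect basic facts about a file a task is about to return."""
    st = os.stat(path)
    with open(path, "rb") as f:
        head = f.read(512)
    return {
        "size_bytes": st.st_size,
        # Permission bits only, e.g. '0o777' after chmod 777.
        "mode": oct(stat.S_IMODE(st.st_mode)),
        # Heuristic: a NUL byte near the start suggests binary content.
        "contains_null": b"\x00" in head,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        text_path = os.path.join(workdir, "output.txt")
        binary_path = os.path.join(workdir, "checkpoint.bin")

        with open(text_path, "w") as f:
            f.write("plain-text output\n")
        # Stand-in for a checkpoint; a real run would call torch.save() here.
        with open(binary_path, "wb") as f:
            f.write(bytes(range(256)) * 4)
        os.chmod(binary_path, 0o777)

        for path in (text_path, binary_path):
            print(os.path.basename(path), describe_output_file(path))
```

Including this output for both files in the issue would show whether size or content type is what separates the working case from the failing one.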
a
@shy-morning-17240 Please file an Issue so we can explore this better