#ask-the-community

seunggs

09/16/2022, 5:33 PM
So I’m stuck on S3 access. I created a service account called `flyte-executor` with the `flyte-user-role` (which has full S3 access) attached as an annotation, and I’m running Flyte executions with this service account, but it’s giving me a PutObject access denied error. This service account is in the project+domain namespace. What am I doing wrong?

Yee

09/16/2022, 9:07 PM
stupid question, but the IAM role has the PutObject permission, right?
this is annoying to do, I apologize, but maybe we can also try to run the pod manually, running `aws sts get-caller-identity` instead of the pyflyte command

seunggs

09/16/2022, 9:24 PM
`flyte-user-role` has the `AmazonS3FullAccess` policy attached to it
How can I run the pod manually?
I see the pod for the task that failed but the status is failed so I can’t exec into it
When I check the pod logs, I see some other errors as well, for example:
fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden
@Yee Should I maybe add `sleep infinity` to the args in the yaml and `exec` into it to run `aws sts get-caller-identity`? Let me know when you’ve got a minute - thanks!

Samhita Alla

09/20/2022, 4:56 AM
cc: @Yee

Yee

09/20/2022, 6:11 PM
hey sorry is this still an issue @seunggs

seunggs

09/20/2022, 7:54 PM
Yes - can you provide me a bit more detail on what to do to start debugging why I have s3 permission issues?
Thanks! @Yee

Yee

09/20/2022, 8:02 PM
if you want to run the pod manually you have to create a yaml file, but yeah this will work
create the pod, make sure you have the correct service account, remove the status field, ownership and anything else that looks like it might be related to an existing flyteworkflow execution
update the args to sleep infinity and then yeah you can attach to it and investigate.
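A minimal Python sketch of the cleanup described above, assuming the failed pod's manifest has already been loaded into a dict (for example from the JSON form of the pod). The function name `make_debug_pod` and the exact list of metadata fields dropped are illustrative assumptions, not something Flyte provides:

```python
import copy


def make_debug_pod(pod: dict, new_name: str) -> dict:
    """Return a copy of a pod manifest that only sleeps, for interactive debugging."""
    debug = copy.deepcopy(pod)

    # The status block belongs to the old, failed pod.
    debug.pop("status", None)

    meta = debug.setdefault("metadata", {})
    # Drop server-managed fields and the link back to the workflow execution
    # (this field list is an assumption; remove anything else tied to the run).
    for key in ("ownerReferences", "resourceVersion", "uid",
                "creationTimestamp", "finalizers"):
        meta.pop(key, None)
    meta["name"] = new_name

    # Replace the task arguments so the container stays alive.
    # serviceAccountName is deliberately left untouched.
    for container in debug.get("spec", {}).get("containers", []):
        container["args"] = ["sleep", "infinity"]
    return debug
```

The result can be dumped with `json.dumps` and applied as a new pod, since kubectl accepts JSON manifests as well as YAML. If the image defines its own entrypoint, the container `command` may need overriding as well.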

seunggs

09/20/2022, 8:03 PM
OK so create a completely new pod
I’ll try it tonight and report back - thank you!

Yee

09/20/2022, 8:03 PM
i’m not convinced that’s necessary though. are you sure the service account has the iam role attached
maybe easier to first try to assume the role manually just on your laptop through the aws cli
run the same s3 command
just isolate out the aws side first, and then deal with the k8s side

seunggs

09/20/2022, 8:05 PM
Hmm ok - here’s my sa yaml that’s deployed
```yaml
apiVersion: v1
imagePullSecrets:
- name: gcr-json-key
kind: ServiceAccount
metadata:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::xxx:role/flyte-user-role
  labels:
    app.kubernetes.io/managed-by: pulumi
  name: flyte-executor
  namespace: shelly-robotics-bipedal-robot-development
  resourceVersion: "57747250"
  uid: 9db3e9da-cf32-4a78-8b06-81d83b66c611
secrets:
- name: flyte-executor-token-l6rkj
```
As you can see the annotation is there - unless that’s not sufficient to link an SA to an iam role

Yee

09/20/2022, 8:22 PM
can you `get pod <pod name> -o yaml` and grep for “iam”
you should see aws-iam-token a couple times
and if that’s the case then yeah, can you try assuming the role locally

seunggs

09/20/2022, 8:35 PM
So `get pod <pod name> -o yaml | grep 'iam'` returns this
```
value: arn:aws:iam::xxx:role/flyte-user-role
      name: aws-iam-token
  - name: aws-iam-token
```
Does this look right?
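For reference, the same check can be done on the pod manifest as a dict rather than with grep. This is a sketch under the assumption that the role was injected by the EKS pod identity webhook, which normally adds `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` environment variables plus a volume named `aws-iam-token`. The function name `irsa_injection_report` is hypothetical:

```python
def irsa_injection_report(pod: dict) -> dict:
    """Summarize whether a pod spec carries the injected IAM role settings."""
    spec = pod.get("spec", {})
    volumes = {v.get("name") for v in spec.get("volumes", [])}
    report = {"token_volume": "aws-iam-token" in volumes, "containers": {}}

    for container in spec.get("containers", []):
        env = {e.get("name"): e.get("value") for e in container.get("env", [])}
        mounts = {m.get("name") for m in container.get("volumeMounts", [])}
        report["containers"][container.get("name")] = {
            # Role the AWS SDK/CLI will try to assume inside the container.
            "role_arn": env.get("AWS_ROLE_ARN"),
            # Path of the projected service account token.
            "token_file": env.get("AWS_WEB_IDENTITY_TOKEN_FILE"),
            "token_mounted": "aws-iam-token" in mounts,
        }
    return report
```

A container with a `role_arn`, a `token_file`, and `token_mounted` set to `True` matches the grep output above: one `value:` line for the role and `aws-iam-token` appearing as both a mount and a volume.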

Yee

09/20/2022, 8:35 PM
Yeah
Can you try assuming locally?

seunggs

09/20/2022, 8:38 PM
When I try `aws sts assume-role …` locally, I get this error:
An error occurred (AccessDenied) when calling the AssumeRole operation

Yee

09/20/2022, 8:39 PM
mm
okay sure, try copying the pod yaml and editing so that the command is just a sleep
maybe your role is not set up for you to assume it

seunggs

09/20/2022, 8:47 PM
Hmm - I have admin access to all resources. I do have multiple clusters running, one of which is flyte, and here are the role’s trust relationships:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::xxx:oidc-provider/oidc.eks.us-west-1.amazonaws.com/id/2A6739B7813451087E3258C60BC37CF4"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.us-west-1.amazonaws.com/id/2A6739B7813451087E3258C60BC37CF4:aud": "sts.amazonaws.com"
        }
      }
    },
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
```
Where oidc is from flyte eks cluster
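A trust policy like this one may explain the local `AccessDenied`: it trusts the cluster's OIDC provider (for `sts:AssumeRoleWithWebIdentity`) and the EC2 service, but names no IAM user or account, and a role can only be assumed by principals its trust policy allows. The sketch below lists who a trust policy trusts. The function names are hypothetical and the check is a simplification (it ignores conditions and wildcard actions such as `sts:*`):

```python
def trusted_principals(trust_policy: dict) -> list:
    """Return (action, principal_type, principal) tuples for every Allow statement."""
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]

    found = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        principal = stmt.get("Principal", {})
        if isinstance(principal, str):  # e.g. "Principal": "*"
            principal = {"Any": principal}
        for kind, values in principal.items():
            if isinstance(values, str):
                values = [values]
            for value in values:
                for action in actions:
                    found.append((action, kind, value))
    return found


def iam_identity_can_assume(trust_policy: dict) -> bool:
    """Rough check: is plain sts:AssumeRole open to an IAM user, role, or account?"""
    return any(
        action == "sts:AssumeRole" and kind in ("AWS", "Any")
        for action, kind, _ in trusted_principals(trust_policy)
    )
```

For the policy above, `iam_identity_can_assume` returns `False`, so admin permissions on the caller's side would not be enough for `aws sts assume-role` from a laptop, while a pod using the service account token can still assume the role.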

Yee

09/20/2022, 8:51 PM
what do you mean?

seunggs

09/20/2022, 8:51 PM
Oh I was responding to your comment regarding the role not being setup to be assumed
I think a pod in the flyte cluster should be able to assume this role via service account

Yee

09/20/2022, 8:52 PM
yeah try that i guess, just make the pod yaml
let us know if you need an example

seunggs

09/20/2022, 8:54 PM
OK would it be fine to just get rid of the status from the yaml I get from the failed task pod? And then update the args to `sleep infinity`?
OK I just changed the pod name and created it and ran `aws sts get-caller-identity` and here’s the response:
```json
{
  "UserId": "xxx:botocore-session-1663707573",
  "Account": "xxx",
  "Arn": "arn:aws:sts::xxx:assumed-role/flyte-user-role/botocore-session-1663707573"
}
```
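The `Arn` in this response is what confirms the pod is running as the intended role: an assumed-role ARN has the form `arn:aws:sts::<account>:assumed-role/<role-name>/<session-name>`. A small sketch for pulling the role name out of such an ARN (the helper name `role_from_caller_arn` is hypothetical):

```python
def role_from_caller_arn(arn: str):
    """Return the role name from an STS assumed-role ARN, or None if it is not one."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[2] != "sts":
        return None
    resource = parts[5].split("/")
    # Expected shape: assumed-role/<role-name>/<session-name>
    if len(resource) != 3 or resource[0] != "assumed-role":
        return None
    return resource[1]
```

Applied to the ARN above, this yields `flyte-user-role`, which matches the role annotated on the service account.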

Yee

09/20/2022, 9:00 PM
can you try to run the failing s3 command also?

seunggs

09/20/2022, 9:01 PM
Hmm where can I grab this command?
You mean the pyflyte command?

Yee

09/20/2022, 9:01 PM
no it was in the error
putobject access denied error
just
```
cat > abc
hello
^C
```
and then try to `aws s3 cp abc s3:/….`

seunggs

09/20/2022, 9:02 PM
ok one sec
No error - confirmed that the file was uploaded to the right bucket path in s3
`aws s3 cp abc s3://sidetrek-flyte-cluster-flyte-bucket/metadata/propeller/shelly-robotics-bipedal-robot-development-an6gvhl5dn8vr44nn9ds/n0/data/0/abc.txt`

Yee

09/20/2022, 9:05 PM
but when the task runs with the same role, it can’t?

seunggs

09/20/2022, 9:05 PM
Yes that seems to be the case
One thing though - maybe I’m doing it wrong
So when I launch the execution in the flyte dashboard, I only fill out Kubernetes Service Account in the `Security Context` section - leaving the IAM Role field empty
Could that be what’s creating the problem??
Do I need to fill out both? I assumed service account is enough since it references the iam role in it

Yee

09/20/2022, 9:23 PM
no, one should be enough

seunggs

09/20/2022, 9:24 PM
Oh ok then it’s strange

Yee

09/20/2022, 9:24 PM
can you change your task code to call sts get-caller-identity

seunggs

09/20/2022, 9:24 PM
I seem to be able to run s3 commands manually in the pod fine but it errors out when I run it from dashboard

seunggs

09/20/2022, 9:25 PM
You mean replace the actual code of the failing task with just this?
```python
import boto3
boto3.client('sts').get_caller_identity().get('Account')
```

Yee

09/20/2022, 9:25 PM
yeah
in the body of the task

seunggs

09/20/2022, 9:26 PM
OK
Hi @Yee I had a chance to run this and I get the same errors: a whole bunch of HeadObject 403 forbidden errors and PutObject access denied errors like this:
```json
{
  "asctime": "2022-09-22 00:01:44,624",
  "name": "flytekit",
  "levelname": "ERROR",
  "message": "Exception when trying to execute ['aws', '--endpoint-url', 'http://minio.flyte:9000', 's3', 'cp', '--recursive', '--acl', 'bucket-owner-full-control', '/tmp/flyte-oz6o659c/sandbox/local_flytekit/engine_dir', 's3://sidetrek-flyte-cluster-flyte-bucket/metadata/propeller/shelly-robotics-bipedal-robot-development-aq44hcpb7rdhwxpw22k9/n0/data/3'], reason: Called process exited with error code: 1.  Stderr dump:\n\nb'upload failed: ../tmp/flyte-oz6o659c/sandbox/local_flytekit/engine_dir/error.pb to s3://sidetrek-flyte-cluster-flyte-bucket/metadata/propeller/shelly-robotics-bipedal-robot-development-aq44hcpb7rdhwxpw22k9/n0/data/3/error.pb An error occurred (AccessDenied) when calling the PutObject operation: Access Denied.\\n'"
}
```
So it looks like it doesn’t even get to running the task code - bunch of fatal errors due to s3 issue

Samhita Alla

09/23/2022, 4:46 AM
Are you still seeing the issue, @seunggs?

seunggs

09/23/2022, 8:00 PM
Hey @Samhita Alla thanks for checking up on this - I was just on a call with @Yee who helped me resolve it. The problem was my mistake in the Helm chart values setup: it was still configured to use minio. Thank you!
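The flytekit error logged earlier in the thread is consistent with this: the failing command passes `--endpoint-url` pointing at `minio.flyte:9000`, whereas the manual `aws s3 cp` test that succeeded used no endpoint override. A rough sketch for spotting this in a logged command list follows; the function names are hypothetical, and the hostname check is a heuristic assumption rather than a complete test for AWS endpoints:

```python
from urllib.parse import urlparse


def endpoint_override(cmd: list):
    """Return the value passed to --endpoint-url in an aws CLI command, or None."""
    for i, arg in enumerate(cmd):
        if arg == "--endpoint-url" and i + 1 < len(cmd):
            return cmd[i + 1]
        if arg.startswith("--endpoint-url="):
            return arg.split("=", 1)[1]
    return None


def targets_aws_s3(cmd: list) -> bool:
    """Heuristic: True when the command uses the default or an amazonaws.com endpoint."""
    url = endpoint_override(cmd)
    if url is None:
        return True  # no override, so the CLI uses its default AWS endpoint
    host = urlparse(url).hostname or ""
    return host.endswith(".amazonaws.com")
```

Running `targets_aws_s3` on the argument list from the log returns `False`, which is a quick hint that the storage settings in the deployment still point at minio rather than at S3.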