# ask-the-community
h
Hey, what is the preferred way of providing project-specific IAM roles to a workflow? I am trying to run a workflow using an IAM role with permissions suitable for only that project. In my launch plan I enter a role, as in the picture below, but it doesn't seem to be picked up when I describe the Pod. Does the default role, e.g. `eks-flyte-user-role`, assume the IAM role you provide in the launch plan, or is it assumed by some service account that gets created?
Environment:
      ...
      AWS_ROLE_ARN:                       arn:aws:iam::<account_id>:role/eks-flyte-user-role
So basically the Flyte user role ends up being assumed by default. Should I create a new role with the correct permissions that the Flyte user role will assume, or do you prefer creating a service account and letting the SA assume the project role?
p
So we currently don't have project-level roles; they are defined at the project-domain level and come from the values file here https://github.com/flyteorg/flyte/blob/76e865ae46c7fd391cfd3903a5aa860e64575a81/charts/flyte-core/values-eks.yaml#L385 If you have given that role full S3 permissions, the pods should be able to use it when Flyte launches the user pods. This also assumes you have set up the trust relationship for those roles as well https://docs.flyte.org/en/latest/deployment/aws/manual.html#oidc-provider-for-the-eks-cluster What is the error you get?
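For readers unfamiliar with the trust relationship mentioned here, the following is a minimal sketch of what such a trust policy document generally looks like for a role assumed through an EKS OIDC provider. The helper name `irsa_trust_policy`, the account ID, the OIDC provider ID, the namespace, and the service account name are all hypothetical placeholders, not values from this thread.

```python
import json


def irsa_trust_policy(account_id, oidc_provider, namespace, service_account):
    """Build a trust policy that lets one Kubernetes service account
    assume an IAM role through the cluster's OIDC provider."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Restrict the role to a single service account.
                        f"{oidc_provider}:sub": f"system:serviceaccount:{namespace}:{service_account}",
                        f"{oidc_provider}:aud": "sts.amazonaws.com",
                    }
                },
            }
        ],
    }


# Hypothetical values for illustration only.
policy = irsa_trust_policy(
    account_id="111122223333",
    oidc_provider="oidc.eks.us-east-2.amazonaws.com/id/EXAMPLE",
    namespace="hackday-development",
    service_account="hackday-sa",
)
print(json.dumps(policy, indent=2))
```

The `sub` condition is what ties the role to one service account in one namespace, so a per-(project, domain) role would typically get its own policy of this shape.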
h
I am not getting any errors, I am just wondering how I should handle a different role per (project, domain). Say a developer wants to create a new project with Flyte; then the developer should use a role with specific permissions for the project. Is the way forward here to create an IAM role with a trust relationship for OIDC authentication with AWS and then override it in the launch plan?
Since there is an option to provide both an IAM role and a service account, I was wondering what you consider to be the best practice here.
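from flytekit.core import launch_plan
from flytekit.models.common import AuthRole

# `wf` is assumed to be a workflow defined elsewhere.
auth_role = AuthRole(assumable_iam_role="my:iam:role")
launch_plan.LaunchPlan.get_or_create(
    workflow=wf,
    name="your_lp_name_3",
    auth_role=auth_role,
)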
In the above example, is the IAM role assumed by the `flyte-user-role` (i.e., I need an additional trust relationship), or is the `flyte-user-role` overridden with `my:iam:role`?
p
You can add it to the launch plan, or you can add it at execution creation time when you launch from the console by clicking `Advanced Options`, or you can provide it at the project-domain level using flytectl if you are using the latest admin and flytectl (docs are in review for this: https://github.com/flyteorg/flytectl/pull/316).
`my:iam:role` will override the `flyte-user-role`.
So the order is:
• directly on the execution request
• then on the launch plan
• flytectl-created matchable attribute (PR docs above)
• application defaults from the values YAML file
Regarding what the best practice is, I have seen users use both. But I will defer to @Haytham Abuelfutuh to answer what's considered best practice.
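The override order described above can be sketched as a simple first-match lookup. This is only an illustration of the precedence as stated in this thread, not flyteadmin's actual code; the function name `resolve_identity` and its parameter names are hypothetical.

```python
def resolve_identity(execution_request=None, launch_plan=None,
                     matchable_attribute=None, application_default=None):
    """Return (source, value) for the highest-precedence level that is set."""
    layers = [
        ("execution request", execution_request),
        ("launch plan", launch_plan),
        ("matchable attribute", matchable_attribute),
        ("application default", application_default),
    ]
    for source, value in layers:
        if value:  # skip levels that are unset or empty
            return source, value
    return None, None


# A role set on the launch plan wins over the values-file default.
print(resolve_identity(launch_plan="my:iam:role",
                       application_default="flyte-user-role"))
```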
h
It seems in the docs that you specify the service account rather than IAM role, but maybe you can do a similar operation for the IAM role?
security_context:
  run_as:
    k8s_service_account: default
p
Yes, that example only shows the k8s service account, but you can do the same for an IAM role too.
Also, use the security context instead of AuthRole, as AuthRole is deprecated.
h
So I tried to create a role and specify it directly in the launch plan through the UI. However, when I describe the pod it still uses the default role
p
Are you using the security context to specify it in the launch plan? Can you share your launch plan spec using `flytectl get launchplan -p <project> -d <domain> <name> --latest -o yaml`? Also, which admin version are you using?
h
I am using the 0.19.4 Helm release
~  flytectl get launchplan -p hackday -d development --latest -o yaml
- closure:
    createdAt: "2022-04-25T09:23:55.415089Z"
    expectedInputs: {}
    expectedOutputs: {}
    updatedAt: "2022-04-25T09:23:55.415089Z"
  id:
    domain: development
    name: flyte.workflows.workflow.my_wf
    project: hackday
    resourceType: LAUNCH_PLAN
    version: v0.0.5
  spec:
    annotations: {}
    defaultInputs: {}
    entityMetadata: {}
    fixedInputs: {}
    labels: {}
    rawOutputDataConfig: {}
    workflowId:
      domain: development
      name: flyte.workflows.workflow.my_wf
      project: hackday
      resourceType: WORKFLOW
      version: v0.0.5
- closure:
    createdAt: "2022-04-08T13:17:18.634988Z"
    expectedInputs: {}
    expectedOutputs: {}
    updatedAt: "2022-04-08T13:17:18.634988Z"
  id:
    domain: development
    name: flyte.workflows.workflow.my_wf
    project: hackday
    resourceType: LAUNCH_PLAN
    version: v0.0.4
  spec:
    annotations: {}
    authRole: {}
    defaultInputs: {}
    entityMetadata: {}
    fixedInputs: {}
    labels: {}
    rawOutputDataConfig: {}
    workflowId:
      domain: development
      name: flyte.workflows.workflow.my_wf
      project: hackday
      resourceType: WORKFLOW
      version: v0.0.4
- closure:
    createdAt: "2022-04-08T09:47:03.133154Z"
    expectedInputs: {}
    expectedOutputs: {}
    updatedAt: "2022-04-08T09:47:03.133154Z"
  id:
    domain: development
    name: flyte.workflows.workflow.my_wf
    project: hackday
    resourceType: LAUNCH_PLAN
    version: v0.0.3
  spec:
    annotations: {}
    authRole: {}
    defaultInputs: {}
    entityMetadata: {}
    fixedInputs: {}
    labels: {}
    rawOutputDataConfig: {}
    workflowId:
      domain: development
      name: flyte.workflows.workflow.my_wf
      project: hackday
      resourceType: WORKFLOW
      version: v0.0.3
- closure:
    createdAt: "2022-04-08T09:40:20.073883Z"
    expectedInputs: {}
    expectedOutputs: {}
    updatedAt: "2022-04-08T09:40:20.073883Z"
  id:
    domain: development
    name: flyte.workflows.workflow.my_wf
    project: hackday
    resourceType: LAUNCH_PLAN
    version: v0.0.2
  spec:
    annotations: {}
    authRole: {}
    defaultInputs: {}
    entityMetadata: {}
    fixedInputs: {}
    labels: {}
    rawOutputDataConfig: {}
    workflowId:
      domain: development
      name: flyte.workflows.workflow.my_wf
      project: hackday
      resourceType: WORKFLOW
      version: v0.0.2
- closure:
    createdAt: "2022-04-08T09:19:29.014621Z"
    expectedInputs: {}
    expectedOutputs:
      variables:
        o0:
          description: o0
          type:
            simple: STRING
    updatedAt: "2022-04-08T09:19:29.014621Z"
  id:
    domain: development
    name: flyte.workflows.workflow.my_wf
    project: hackday
    resourceType: LAUNCH_PLAN
    version: v0.0.1
  spec:
    annotations: {}
    authRole: {}
    defaultInputs: {}
    entityMetadata: {}
    fixedInputs: {}
    labels: {}
    rawOutputDataConfig: {}
    workflowId:
      domain: development
      name: flyte.workflows.workflow.my_wf
      project: hackday
      resourceType: WORKFLOW
      version: v0.0.1
I seem to be getting all the different versions of the launch plan. I have not entered the authRole in the LP at any given point, only tried to override it in the UI
p
Can you try the latest beta release and see if you face the same issue? https://github.com/flyteorg/flyte/releases/tag/v1.0.0-b1
h
Same thing when I use the `cr.flyte.org/flyteorg/flyteadmin:v0.6.149` admin image @Prafulla Mahindrakar
I.e., I pass the role here
p
Thanks Hampus. Let me try to check more on this
So the issue seems to be that the IAM role ARN passed from the UI (flyteconsole) is correctly passed to the pod in its annotations field, but the AWS_ROLE_ARN env variable that exists for the pod keeps using the flyte-user-role. This seems incorrect. Also, for a non-existent role passed in the UI/CLI, the pod is still able to do S3 operations, which we assume happens due to the flyte-user-role env variable.
h
Yes, so basically if we provide an IAM role for the launch plan, either by passing it through flyteconsole or by using `flytectl`, it gets picked up in the Pod annotations, but the Pod seems to use the `flyte-user-role` set in the environment variables. E.g., I can pass `iamRoleArn: asdf` and I am still able to interact with AWS resources as per my `flyte-user-role` IAM policy.
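The mismatch described here can be checked mechanically. Below is a minimal sketch that compares the role requested in the pod annotation with the `AWS_ROLE_ARN` actually present in each container's environment. It assumes the pod has been dumped as JSON (for example with `kubectl get pod <name> -n <namespace> -o json`); the function name `role_mismatches` and the trimmed sample pod are hypothetical.

```python
def role_mismatches(pod, annotation_key="eks.amazonaws.com/role-arn"):
    """List containers whose AWS_ROLE_ARN differs from the annotated role."""
    annotations = pod.get("metadata", {}).get("annotations") or {}
    requested = annotations.get(annotation_key)
    mismatches = []
    for container in pod.get("spec", {}).get("containers", []):
        # Entries that use valueFrom have no plain "value"; .get returns None.
        env = {e["name"]: e.get("value") for e in container.get("env") or []}
        effective = env.get("AWS_ROLE_ARN")
        if requested and effective and requested != effective:
            mismatches.append((container.get("name"), requested, effective))
    return mismatches


# Trimmed, hypothetical pod resembling the situation in this thread.
pod = {
    "metadata": {"annotations": {"eks.amazonaws.com/role-arn": "asdf"}},
    "spec": {
        "containers": [
            {
                "name": "task",
                "env": [
                    {
                        "name": "AWS_ROLE_ARN",
                        "value": "arn:aws:iam::111122223333:role/eks-flyte-user-role",
                    }
                ],
            }
        ]
    },
}
print(role_mismatches(pod))
```

A non-empty result means the annotation asked for one role while the injected environment points at another, which is the symptom reported above.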
k
hey @Hampus Rosvall do you mind fetching the execution using flytectl so we can inspect the security context there?
flytectl get execution -p flytesnacks -d development <name> -o yaml
h
@katrina sure, using 0.19.4 release?
k
yup that works
p
I tried it on `ghcr.io/flyteorg/flyteadmin-release:v1.0.0-b1` with an unknown IAM role:
k get pods -n flytesnacks-development  avj9vsrl8qxx8bmbz9mv-n0-0 -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
    eks.amazonaws.com/role-arn: abcdefgh
    kubernetes.io/psp: eks.privileged
The execution rightly failed for me:
Exception: Called process exited with error code: 1.  Stderr dump:

          b'upload failed: ../tmp/flyte/local_flytekit/ae64be4b354052b1124d2fff213d9654/engine_dir/error.pb to s3://flyte-demo/metadata/propeller/flytesnacks-development-avj9vsrl8qxx8bmbz9mv/n0/data/0/error.pb An error occurred (AccessDenied) when calling the PutObject operation: Access Denied\n'
        reason: Error
        startedAt: "2022-04-25T17:10:00Z"
h
@katrina
flytectl get execution -p hackday -d development f0dbbc7271146457a8d5 -o yaml
closure:
  createdAt: "2022-04-25T14:12:08.553786193Z"
  duration: 46.179998699s
  outputs:
    uri: s3://<bucket>/metadata/propeller/hackday-development-f0dbbc7271146457a8d5/end-node/data/0/outputs.pb
  phase: SUCCEEDED
  startedAt: "2022-04-25T14:12:13.666273249Z"
  stateChangeDetails:
    occurredAt: "2022-04-25T14:12:08.553786193Z"
  updatedAt: "2022-04-25T14:12:59.846271699Z"
  workflowId:
    domain: development
    name: flyte.workflows.workflow.my_wf
    project: hackday
    resourceType: WORKFLOW
    version: v0.0.5
id:
  domain: development
  name: f0dbbc7271146457a8d5
  project: hackday
spec:
  authRole:
    assumableIamRole: arn:aws:iam::asdasd
  launchPlan:
    domain: development
    name: flyte.workflows.workflow.my_wf
    project: hackday
    resourceType: LAUNCH_PLAN
    version: v0.0.5
  metadata:
    systemMetadata: {}
  securityContext:
    runAs:
      iamRole: arn:aws:iam::asdasd
So it actually looks correct, but I am downloading some data from S3 in the task and it runs successfully, which makes me think it is using another role. Or am I missing something?
@Prafulla Mahindrakar what does your ENV look like?
p
It doesn't have the ENV that you are seeing. I was checking how AWS_ROLE_ARN and the other AWS env variables show up, and it seems those are injected.
memory:  500Mi
    Environment:
      FLYTE_INTERNAL_CONFIGURATION_PATH:  /root/sandbox.config
      FLYTE_INTERNAL_IMAGE:               ghcr.io/flyteorg/flytecookbook:core-773447b298bfa8ecfc2b25983ce1ed2d33753d01
      FLYTE_INTERNAL_EXECUTION_WORKFLOW:  flytesnacks:development:core.basic.lp.go_greet
      FLYTE_INTERNAL_EXECUTION_ID:        avj9vsrl8qxx8bmbz9mv
      FLYTE_INTERNAL_EXECUTION_PROJECT:   flytesnacks
      FLYTE_INTERNAL_EXECUTION_DOMAIN:    development
      FLYTE_ATTEMPT_NUMBER:               0
      FLYTE_INTERNAL_TASK_PROJECT:        flytesnacks
      FLYTE_INTERNAL_TASK_DOMAIN:         development
      FLYTE_INTERNAL_TASK_NAME:           core.basic.lp.greet
      FLYTE_INTERNAL_TASK_VERSION:        773447b298bfa8ecfc2b25983ce1ed2d33753d01
      FLYTE_INTERNAL_PROJECT:             flytesnacks
      FLYTE_INTERNAL_DOMAIN:              development
      FLYTE_INTERNAL_NAME:                core.basic.lp.greet
      FLYTE_INTERNAL_VERSION:             773447b298bfa8ecfc2b25983ce1ed2d33753d01
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-nvnx4 (ro)
Admin has a config option to set the annotation key. Can you check whether you have that set?
flyteadmin:
      roleNameKey: "eks.amazonaws.com/role-arn"
      profilerPort: 10254
      eventVersion: 2
      metricsScope: "flyte:"
      metadataStoragePrefix:
h
@Prafulla Mahindrakar the `AWS_` env vars are set by the Service Account on EKS (https://docs.aws.amazon.com/eks/latest/userguide/specify-service-account-role.html)
So my guess is that if we override the service account in the security context we will assume the project specific role rather than the default one
I created a Service Account and provided it in the security context in the launch plan, and it gets picked up correctly by the Pod.
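A minimal sketch of the kind of service account that works here: a Kubernetes ServiceAccount carrying the `eks.amazonaws.com/role-arn` annotation, generated as JSON so it can be piped to `kubectl apply -f -`. The helper name `irsa_service_account`, the account name, the namespace, and the role ARN are hypothetical placeholders.

```python
import json


def irsa_service_account(name, namespace, role_arn):
    """Build a ServiceAccount manifest annotated with the IAM role to assume."""
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": name,
            "namespace": namespace,
            # EKS reads this annotation to inject AWS_ROLE_ARN and the
            # web identity token into pods that use the service account.
            "annotations": {"eks.amazonaws.com/role-arn": role_arn},
        },
    }


# Hypothetical project-specific account for the hackday/development pair.
manifest = irsa_service_account(
    name="hackday-sa",
    namespace="hackday-development",
    role_arn="arn:aws:iam::111122223333:role/hackday-project-role",
)
print(json.dumps(manifest, indent=2))
```

The account name would then go into the launch plan's security context as `k8s_service_account`, in the same place the docs example above uses `default`.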
k
great, so the behavior was due to the project-level setting being picked up?
m
hey! I also experienced the same situation today: specifying a role from the console > Advanced Settings adds an annotation to the pod, but the AWS_ env vars are not set on the pod. It works when a ServiceAccount (IRSA enabled) is specified instead of an IAM role.
h
@katrina I think the problem is that the provided IAM Role in the launch plan never gets assumed by the role in the service account. Not sure how that should be handled, but creating a (project, domain) specific service account is sufficient currently
k
sorry, trying to understand: in the example above, `iamRole: arn:aws:iam::asdasd` was not correct or expected?
h
No, exactly. I was trying to enter an arbitrary role that should fail, but the workflow still ran successfully as the role was never assumed.
k
oh sorry, i thought that was redacted 😂 you mean asdasd literally, i see!
sounds like something on our end to debug, thank you for walking through this with us!
h
Sure! If possible, can you point me to the code where you parse additional IAM Roles provided in the security context?
k
it's a little convoluted b/c of all the deprecated fields in idl and the overridable layers at which you can set the context, but here's https://github.com/flyteorg/flyteadmin/blob/master/pkg/manager/impl/execution_manager.go#L493:28 & https://github.com/flyteorg/flyteadmin/blob/master/pkg/manager/impl/execution_manager.go#L912,L913 if you want to poke around
p
In my experimentation, I wasn't able to use the IAM role that gets passed into the pods through annotations; it always depended on which service account the pod was using. The service account should be annotated with the right IAM role so that the aws cli command used by the container performs the S3 operations successfully. @katrina do we know if this ever worked, or does it require some additional setup in the cluster, or maybe I am missing something? I tried creating a sample test pod with annotations and a service account manually and performed S3 operations from within the container. My observations are consistent: the pod doesn't use the annotation and uses the service account instead.
kubectl exec -it test-praf -n flytesnacks-development -- /bin/bash
root@test-praf:~# 
root@test-praf:~# 
root@test-praf:~# 
root@test-praf:~# touch a
root@test-praf:~# aws s3 cp a s3://flyte-demo/metadata/propeller/flytesnacks-development-akqfpd7n4b8lh78m9c5r/n0/data/0/a
upload failed: ./a to s3://flyte-demo/metadata/propeller/flytesnacks-development-akqfpd7n4b8lh78m9c5r/n0/data/0/a An error occurred (AccessDenied) when calling the PutObject operation: Access Denied
The pod has the following
k get pod -n flytesnacks-development test-praf -o yaml            
apiVersion: v1
kind: Pod
metadata:
  annotations:
    iam.amazonaws.com/role: arn:aws:iam::590375264460:role/eksctl-flyte-demo-2-addon-iamserviceaccount-Role1-11QUDNRU7X84P
   ....
  name: test-praf
  namespace: flytesnacks-development
....
  serviceAccount: default
  serviceAccountName: default
Now if I use it with a service account annotated with the IAM role that has permissions (the one ending in 1NRJQGB2NSHL9), then AWS_ROLE_ARN is exported with this annotated role, and any additional role-related annotations on the pod are simply ignored.
k get pod -n flytesnacks-development test-praf -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    iam.amazonaws.com/role: arn:aws:iam::590375264460:role/eksctl-flyte-demo-2-addon-iamserviceaccount-Role1-11QUDNRU7X84P

....
    - name: AWS_ROLE_ARN
      value: arn:aws:iam::590375264460:role/eksctl-flyte-demo-2-addon-iamserviceaccount-Role1-1NRJQGB2NSHL9
    - name: AWS_WEB_IDENTITY_TOKEN_FILE
      value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
....
  serviceAccount: demo
  serviceAccountName: demo
And it allows me to copy to S3 using this annotated role.
kubectl exec -it test-praf -n flytesnacks-development -- /bin/bash
root@test-praf:~# touch a
root@test-praf:~# aws s3 cp a s3://flyte-demo/metadata/propeller/flytesnacks-development-akqfpd7n4b8lh78m9c5r/n0/data/0/a
upload: ./a to s3://flyte-demo/metadata/propeller/flytesnacks-development-akqfpd7n4b8lh78m9c5r/n0/data/0/a
root@test-praf:~# cat ~/.aws/cli/cache/574579698c8d6815d0805a57d875c58fa630e77a.json 
{"Credentials": {"AccessKeyId": "...", ..., "Expiration": "2022-04-26T13:52:59Z"}, "SubjectFromWebIdentityToken": "system:serviceaccount:flytesnacks-development:demo", "AssumedRoleUser": {"AssumedRoleId": "AROAYS5I3UDGCIBQWKOCJ:botocore-session-1650977579", "Arn": "arn:aws:sts::590375264460:assumed-role/eksctl-flyte-demo-2-addon-iamserviceaccount-Role1-1NRJQGB2NSHL9/botocore-session-1650977579"}, "Provider": "arn:aws:iam::590375264460:oidc-provider/oidc.eks.us-east-2.amazonaws.com/id/46B254ACC1AC1B23CCCA2973F62AB323", "Audience": "sts.amazonaws.com", "ResponseMetadata": {"RequestId": "b9301edc-ce4f-42ff-a049-4cb09936c5c4", "HTTPStatusCode": 200, "HTTPHeaders": {"x-amzn-requestid": "b9301edc-ce4f-42ff-a049-4cb09936c5c4", "content-type": "text/xml", "content-length": "1975", "date": "Tue, 26 Apr 2022 12:52:59 GMT"}, "RetryAttempts": 0}}
After discussion, I have created this issue: https://github.com/flyteorg/flyte/issues/2417 So basically, in your case you should only use a service account that is annotated with the right role.
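To confirm which role a container really assumed, the cached credentials file shown above can be read programmatically. This is a small sketch that pulls the role name and the service account subject out of that JSON; the function name `assumed_identity` is hypothetical, and the sample is a trimmed copy of the fields from the output above.

```python
import json


def assumed_identity(cache_json):
    """Extract the assumed role name, session, and subject from cached credentials."""
    data = json.loads(cache_json)
    arn = data["AssumedRoleUser"]["Arn"]
    # Shape: arn:aws:sts::<account>:assumed-role/<role-name>/<session-name>
    resource = arn.split(":", 5)[5]
    _, role_name, session = resource.split("/", 2)
    return {
        "role_name": role_name,
        "session": session,
        "subject": data.get("SubjectFromWebIdentityToken"),
    }


# Trimmed sample containing only the fields this sketch reads.
sample = json.dumps({
    "SubjectFromWebIdentityToken": "system:serviceaccount:flytesnacks-development:demo",
    "AssumedRoleUser": {
        "Arn": "arn:aws:sts::590375264460:assumed-role/"
               "eksctl-flyte-demo-2-addon-iamserviceaccount-Role1-1NRJQGB2NSHL9/"
               "botocore-session-1650977579"
    },
})
identity = assumed_identity(sample)
print(identity["role_name"], identity["subject"])
```

Here the role name matches the one annotated on the `demo` service account rather than the one in the pod annotation, which is the behaviour reported in the linked issue.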