# flyte-deployment
g
Hi all, we've deployed flyte-binary on K8s on Digital Ocean (if anyone has specific tips about that, we'd be happy to hear them). The deployment seems to have been successful. The primary issue is that the task itself isn't being executed, because the flytekit code is unable to locate credentials to download the code from S3:
│ /usr/local/lib/python3.10/site-packages/s3fs/core.py:336 in get_s3           │
│                                                                              │
│ ❱  336 │   │   │   return await self._s3creator.get_bucket_client(bucket)    │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/s3fs/utils.py:39 in                  │
│ get_bucket_client                                                            │
│                                                                              │
│ ❱  39 │   │   │   response = await general_client.head_bucket(Bucket=bucket_ │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/aiobotocore/client.py:366 in         │
│ _make_api_call                                                               │
│                                                                              │
│ ❱ 366 │   │   │   http, parsed_response = await self._make_request(          │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/aiobotocore/client.py:391 in         │
│ _make_request                                                                │
│                                                                              │
│ ❱ 391 │   │   │   return await self._endpoint.make_request(                  │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/aiobotocore/endpoint.py:96 in        │
│ _send_request                                                                │
│                                                                              │
│ ❱  96 │   │   request = await self.create_request(request_dict, operation_mo │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/aiobotocore/endpoint.py:84 in        │
│ create_request                                                               │
│                                                                              │
│ ❱  84 │   │   │   await self._event_emitter.emit(                            │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/aiobotocore/hooks.py:66 in _emit     │
│                                                                              │
│ ❱ 66 │   │   │   response = await resolve_awaitable(handler(**kwargs))       │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/aiobotocore/_helpers.py:15 in        │
│ resolve_awaitable                                                            │
│                                                                              │
│ ❱ 15 │   │   return await obj                                                │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/aiobotocore/signers.py:24 in handler │
│                                                                              │
│ ❱  24 │   │   return await self.sign(operation_name, request)                │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/aiobotocore/signers.py:82 in sign    │
│                                                                              │
│ ❱  82 │   │   │   auth.add_auth(request)                                     │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/botocore/auth.py:418 in add_auth     │
│                                                                              │
│ ❱ 418 │   │   │   raise NoCredentialsError()                                 │
╰──────────────────────────────────────────────────────────────────────────────╯
NoCredentialsError: Unable to locate credentials
The cluster itself seems to be configured correctly, because we do see metadata and files created by the admin in the said container. I am wondering how the S3 credentials are expected to be made available to the container running the task?
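A quick way to narrow this down is to check, from inside the task container, which credential-related environment variables are actually present. The sketch below is only a diagnostic under stated assumptions: `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are the standard variables botocore reads, the `FLYTE_`-prefixed names are the ones suggested later in this thread, and `credential_env_report` is a hypothetical helper name. It reports presence only and never prints the values.

```python
import os
from typing import Dict, Mapping

# Standard botocore variables, plus the FLYTE_-prefixed names
# suggested later in this thread (assumed, not verified here).
CREDENTIAL_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "FLYTE_AWS_ACCESS_KEY_ID",
    "FLYTE_AWS_SECRET_ACCESS_KEY",
    "FLYTE_AWS_ENDPOINT",
)


def credential_env_report(env: Mapping[str, str] = os.environ) -> Dict[str, bool]:
    """Report which credential variables are set, without revealing values."""
    return {name: bool(env.get(name)) for name in CREDENTIAL_VARS}


if __name__ == "__main__":
    for name, present in credential_env_report().items():
        print(f"{name}: {'set' if present else 'missing'}")
```

If every entry comes back missing, nothing is being injected into the task pod's environment, which would be consistent with the `NoCredentialsError` above.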
c
we use a `PodTemplate`. in flyte-binary `values.yaml` we have something like:
plugins:
  k8s:
    metadata:
      namespace: flyte-development
    default-pod-template-name: task-pod-template
and then `task-pod-template.yaml` has:
apiVersion: v1
kind: PodTemplate
metadata:
  name: task-pod-template
  namespace: flyte-development
template:
  spec:
    imagePullSecrets:
      - name: acr-pw-2
    containers:
      - name: default
        image: "ghcr.io/flyteorg/flytekit:py3.11-sqlalchemy-1.10.3b4"
    serviceAccountName: workload-identity-development-sa
then it's a matter of configuring that service account with the aws IAM
g
We are using DigitalOcean, so not sure they have the same concept of workload identity. I'm not a DevOps person so maybe (probably) I'm wrong. But the bottom line is, that there's no mechanism in Flyte to pass the storage credentials.
c
you can use the above pattern to inject task pod env vars as well. i know the azure stow client uses a pattern:
env:
  - name: AZURE_STORAGE_ACCOUNT_NAME
that works if you want to manage the account name/key in k8 secrets
there's probably something similar for s3
g
Yes, so just a custom solution basically...
c
custom how? i think the stow s3 client will look for credentials matching a certain pattern
g
Right, but anyway, I configure the storage account values for the Flyte cluster, and then need to replicate them specifically for the task pod to have access. The Helm chart takes care of setting it up for the flyte-binary pods but not for the task pods.
I had hoped it would set it up for the task pods as well
c
i think i'm still missing something, sry. you can inject them in the task pod as envvars with either a `PodTemplate` or directly in the `plugins` section in `values.yaml`:
plugins:
  k8s:
    metadata:
      namespace: flyte-development
    default-pod-template-name: task-pod-template
    default-env-vars:
      - FLYTE_AZURE_STORAGE_ACCOUNT_NAME: account
then stow uses those envvar credentials to authenticate the task pod against whichever storage provider you are using, as long as it has an implementation (s3, gcp, azure)
g
Thanks for your patience @Chris Grass! I don't think you're missing anything. I had hoped to configure the storage access credentials in only one place (the way I configured them in values.yaml, which was used for the flyte binary). Now I understand I have to configure them separately for the task executors
Just to be clear, I'm still a little confused: why would I need to configure S3 access AGAIN when it's already configured in my values.yaml under `configuration.storage.userDataContainer`?
c
cool, glad it's working. i can't speak to the design decision, but in our scenario (azure) a different credential is used for the task pods because it doesn't need the elevated privs required by the flyte-binary pods.
g
Unfortunately, I didn't say it's working 🙂 Maybe I'm missing something, but it seems like almost everyone who deploys Flyte will need this. For it to work, I will need to make sure "boto" (which seems like what flytekit is using to download the files from the bucket) is properly configured. Maybe create a specialized PodTemplate. Seems like a lot of work for something basic. Again, I'm just emphasizing my understanding, to make sure I don't do a lot of work for nothing. After an exhaustive search, I found the documentation for the K8s plugin configuration values here. It wasn't straightforward...
c
yeah, i agree it would be nice to improve the k8 plugin documentation. this proposal will probably help reduce the task auth complexity. but in the meantime, does adding the following config to flyte-binary `values.yaml` work?
plugins:
  k8s:
    metadata:
      namespace: flyte-az-development
    default-env-vars:
      - FLYTE_AWS_ACCESS_KEY_ID: key
      - FLYTE_AWS_SECRET_ACCESS_KEY: secret
      - FLYTE_AWS_ENDPOINT: endpoint
i haven't used the feature for aws, but it works for azure
g
I haven't tried it yet. Why did you add the FLYTE prefix? Anyway, i will keep investigating. Unless someone else has done it.
c
that's how flytekit interprets envvars. `LegacyConfigEntry`:
def get_env_name(self) -> str:
    return f"FLYTE_{self.section.upper()}_{self.option.upper()}"
it might work without as well
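To make the naming rule concrete, here is a minimal, self-contained sketch. `ConfigEntry` is a hypothetical stand-in for flytekit's `LegacyConfigEntry`, not the real class; only `get_env_name` mirrors the snippet above. The section/option pairs are assumptions chosen to match the variables in the `values.yaml` suggestion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfigEntry:
    """Hypothetical stand-in for flytekit's LegacyConfigEntry."""

    section: str
    option: str

    def get_env_name(self) -> str:
        # Same rule as the snippet above: FLYTE_<SECTION>_<OPTION>, upper-cased.
        return f"FLYTE_{self.section.upper()}_{self.option.upper()}"


# Assumed section/option pairs, chosen to match the env vars discussed here.
entries = [
    ConfigEntry("aws", "access_key_id"),
    ConfigEntry("aws", "secret_access_key"),
    ConfigEntry("aws", "endpoint"),
]
env_names = [entry.get_env_name() for entry in entries]
print(env_names)
```

Under those assumptions, the derived names line up with the `FLYTE_AWS_*` keys listed under `default-env-vars`.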
g
Do you know where the code that spins up the pod sits?
Because I probably need those env vars available as environment for the boto client
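For illustration, this is roughly how such variables could be turned into arguments for the S3 client. It is a sketch, not flytekit's actual code: `s3fs_kwargs_from_env` is a hypothetical helper, and the only library facts assumed are that `s3fs.S3FileSystem` accepts `key`, `secret`, and `client_kwargs`. The custom endpoint matters for S3-compatible stores that are not AWS itself.

```python
import os
from typing import Any, Dict, Mapping


def s3fs_kwargs_from_env(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Build keyword arguments for s3fs.S3FileSystem from FLYTE_AWS_* variables."""
    kwargs: Dict[str, Any] = {}
    key = env.get("FLYTE_AWS_ACCESS_KEY_ID")
    secret = env.get("FLYTE_AWS_SECRET_ACCESS_KEY")
    endpoint = env.get("FLYTE_AWS_ENDPOINT")
    if key:
        kwargs["key"] = key
    if secret:
        kwargs["secret"] = secret
    if endpoint:
        # Point the underlying client at a non-AWS, S3-compatible endpoint.
        kwargs["client_kwargs"] = {"endpoint_url": endpoint}
    return kwargs


# Usage (requires s3fs to be installed):
#   import s3fs
#   fs = s3fs.S3FileSystem(**s3fs_kwargs_from_env())
```

If none of the variables are set, the helper returns an empty dict and the client falls back to botocore's default credential lookup, which is the path that fails in the traceback above.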
c
i'm not familiar with the boto client, but we have deployed using s3 and azure buckets with config similar to above. i don't think secondary config is required
and sorry, i don't know exactly how the task pods get spun up. i'm a relatively new user to k8 and this project
g
Got it, I'll give it a shot, thanks!!