# ask-the-community
h
Hey all, I am trying to access AWS services from tasks running in the demo cluster. When I set
AWS_ACCESS_KEY_ID
as an environment variable in a task, I get the following error (here I tried it with the example workflow from the documentation, https://docs.flyte.org/en/latest/getting_started/index.html, just adding
os.environ["AWS_ACCESS_KEY_ID"]="secret_value"
to a task). It seems that the environment variable is picked up by the client when artifacts are stored on minio? Does anybody know how to resolve it? What am I missing here?
[1/1] currentAttempt done. Last Error: USER::Pod failed. No message received from kubernetes.
[fc0d281b2d8144efcb5d-n0-0] terminated with exit code (1). Reason [Error]. Message: 
cess exited with error code: 1.  Stderr dump:

b'upload failed: ../tmp/flyte-0ws144nx/sandbox/local_flytekit/engine_dir/error.pb to <s3://my-s3-bucket/metadata/propeller/flytesnacks-development-fc0d281b2d8144efcb5d/n0/data/0/error.pb> An error occurred (AccessDenied) when calling the PutObject operation: Access Denied.\n'
Traceback (most recent call last):
  File "/usr/local/bin/pyflyte-fast-execute", line 8, in <module>
    sys.exit(fast_execute_task_cmd())
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/flytekit/bin/entrypoint.py", line 507, in fast_execute_task_cmd
    subprocess.run(cmd, check=True)
  File "/usr/local/lib/python3.9/subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['pyflyte-execute', '--inputs', '<s3://my-s3-bucket/metadata/propeller/flytesnacks-development-fc0d281b2d8144efcb5d/n0/data/inputs.pb>', '--output-prefix', '<s3://my-s3-bucket/metadata/propeller/flytesnacks-development-fc0d281b2d8144efcb5d/n0/data/0>', '--raw-output-data-prefix', '<s3://my-s3-bucket/test/hx/fc0d281b2d8144efcb5d-n0-0>', '--checkpoint-path', '<s3://my-s3-bucket/test/hx/fc0d281b2d8144efcb5d-n0-0/_flytecheckpoints>', '--prev-checkpoint', '""', '--dynamic-addl-distro', '<s3://my-s3-bucket/b6/flytesnacks/development/A7HCCVU2345H3DD7M6S5QIAJ2U======/scriptmode.tar.gz>', '--dynamic-dest-dir', '/root', '--resolver', 'flytekit.core.python_auto_container.default_task_resolver', '--', 'task-module', 'example_workflow', 'task-name', 'generate_normal_df']' returned non-zero exit status 1.
k
You will have to export at the module level
We prefer using a service account and IAM role
Env vars are dangerous as they can leak in your code
To explain: the task is only loaded and executed after the input data is downloaded
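(For anyone hitting the same AccessDenied error: the likely cause is that credentials exported inside the task linger in the process environment, so flytekit's own output upload to minio then authenticates with the AWS key. A minimal sketch of one workaround, assuming you must use env vars at all: scope them with a context manager so they are restored before flytekit uploads outputs. `scoped_env` is an illustrative helper, not part of flytekit.)

```python
import os
from contextlib import contextmanager


@contextmanager
def scoped_env(**overrides):
    """Temporarily set environment variables, restoring the previous
    values on exit so they don't leak into flytekit's own S3/minio client."""
    saved = {key: os.environ.get(key) for key in overrides}
    os.environ.update(overrides)
    try:
        yield
    finally:
        for key, value in saved.items():
            if value is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = value


# Inside a task body you would wrap only the AWS call, e.g.:
# with scoped_env(AWS_ACCESS_KEY_ID="...", AWS_SECRET_ACCESS_KEY="..."):
#     call_aws_service()
# After the block, os.environ is back to the sandbox defaults, so the
# subsequent output upload to minio is unaffected.
```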
h
Thanks for your answer. I thought that for some local tests using the demo cluster, environment variables would do the job. Is it possible to use minio as storage for the demo cluster and access AWS services with separate credentials? It seems that if I export at the module level, the minio requests use the AWS credentials from the environment variable, so it breaks before the task execution even starts.
k
ohh you are right, you can also use S3 just like minio 🙂. I realized this now
and then all your comms will happen with S3
with these keys
and drop
FLYTE_AWS_ENDPOINT: "http://minio.flyte:9000"
you might want to use
flytectl sandbox start
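(Context for readers: in the sandbox/demo cluster, the minio endpoint and credentials are injected into every task pod as default env vars via the propeller k8s plugin config. A hedged sketch of what that stanza looks like, using the usual sandbox defaults; the exact values in your deployment may differ. Dropping `FLYTE_AWS_ENDPOINT` and swapping in real AWS keys would point task I/O at real S3 instead.)

```yaml
# Assumed shape of the flytepropeller k8s plugin config in the sandbox.
# Values shown are the common sandbox defaults, not universal constants.
plugins:
  k8s:
    default-env-vars:
      - FLYTE_AWS_ENDPOINT: "http://minio.flyte:9000"
      - FLYTE_AWS_ACCESS_KEY_ID: minio
      - FLYTE_AWS_SECRET_ACCESS_KEY: miniostorage
```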
@Hanno Küpers let me know if this helps
h
So with these settings Flyte backend would use S3 instead of minio, is that right? I actually wanted to use Flyte with minio for the backend but being able to connect to AWS S3 in a task using the credentials set as environment variable. It just seems to interfere with each other. It is only for a simple demo using an existing codebase, so I do not want to create new dedicated buckets for the demo cluster. Still trying to understand how it all works and if we can use Flyte for our purposes.
k
Now I get it, you can in fact use it. You can simply add additional env vars; as you can see, the AWS creds are not used. I don't think we have done this, but it should work
The FlyteFile etc. will end up in minio as that's the default setup, and you can manually push to AWS S3
Check out the persistence layer in flytekit; you could tweak things to support both in the future
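(A sketch of the "manually push to AWS S3" approach that avoids env vars entirely: pass the credentials explicitly to the client, so flytekit's env-var-driven minio client is never touched. `aws_client_kwargs` is an illustrative helper, not part of flytekit or boto3; it assumes boto3 is available in the task image.)

```python
def aws_client_kwargs(access_key: str, secret_key: str,
                      region: str = "us-east-1") -> dict:
    """Build keyword arguments for a boto3 client with explicit
    credentials, keeping them out of os.environ entirely."""
    return {
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
        "region_name": region,
    }


# Inside a task (assuming boto3 is installed in the task image):
# import boto3
# s3 = boto3.client("s3", **aws_client_kwargs("AKIA...", "secret"))
# s3.upload_file("result.csv", "my-company-bucket", "demo/result.csv")
```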