acoustic-carpenter-78188
09/11/2023, 3:39 PM
secret_requests=[
    Secret(
        group=SECRET_GROUP,
        key=SECRET_FLYTE_APP_SECRET,
        mount_requirement=Secret.MountType.FILE),
    Secret(
        group=SECRET_GROUP,
        key=SECRET_IM,
        mount_requirement=Secret.MountType.FILE)
])
When added to a ContainerTask, the task registers and executes fine with pyflyte register, but the secrets are available neither as a FILE (/var/...) nor as an ENV_VAR (/etc/secrets/...).
Elastic Plugin
Given a task configuration like:
@flytekit.task(
    ...
    task_config=Elastic(
        nnodes=2,
        nproc_per_node=8,
        rdzv_configs={"timeout": 1200, "join_timeout": 900}
    ),
    ...
)
An error like the following is raised:
Root Cause (first observed failure):
[0]:
time : 2023-08-14_15:54:34
host : f9944ebe99b184ee293a-n1-0-worker-0
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 254)
error_file: /tmp/torchelastic_xc62kj37/f9944ebe99b184ee293a_78vmhc5o/attempt_0/0/error.json
traceback : Traceback (most recent call last):
File "/opt/conda/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
return f(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/flytekitplugins/kfpytorch/task.py", line 233, in spawn_helper
return_val = fn(**kwargs)
File "/root/fine_tuning/llm_fine_tuning.py", line 422, in train
os.environ["WANDB_API_KEY"] = flytekit.current_context().secrets.get(
File "/opt/conda/lib/python3.10/site-packages/flytekit/core/context_manager.py", line 365, in get
raise ValueError(
ValueError: Unable to find secret for key wandb_api_key-5t1ZwJ in group arn:aws:secretsmanager:us-east-2:590375264460:secret: in Env Var:_FSEC_ARN:AWS:SECRETSMANAGER:US-EAST-2:590375264460:SECRET:_WANDB_API_KEY-5T1ZWJ and FilePath: /etc/secrets/arn:aws:secretsmanager:us-east-2:590375264460:secret:/wandb_api_key-5t1zwj
============================================================
User error.
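The two lookup locations named in the ValueError above can be reproduced with a small sketch. The naming rules used here (an `_FSEC_` prefix with the group and key upper-cased for the environment variable, and a lower-cased `/etc/secrets/<group>/<key>` path for the file) are inferred only from that error message, not taken from flytekit's source, and the function names are hypothetical.

```python
# Assumption: both rules below are read off the error message in this report,
# not copied from flytekit. Treat them as a debugging aid only.
ENV_PREFIX = "_FSEC_"
FILE_ROOT = "/etc/secrets"


def expected_env_var(group: str, key: str) -> str:
    """Environment variable name the lookup appears to check."""
    return f"{ENV_PREFIX}{group.upper()}_{key.upper()}"


def expected_file_path(group: str, key: str, root: str = FILE_ROOT) -> str:
    """File path the lookup appears to check."""
    return f"{root}/{group.lower()}/{key.lower()}"


# Group and key exactly as they appear in the ValueError above.
group = "arn:aws:secretsmanager:us-east-2:590375264460:secret:"
key = "wandb_api_key-5t1ZwJ"

print(expected_env_var(group, key))
print(expected_file_path(group, key))
```

Comparing these two strings with what is actually present in the container shows whether the secret is missing entirely or is mounted under a different name.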
Expected behavior
The expected behavior is that if a secret is mounted as a FILE, the file will be available in /var/secrets/...
If the secret is mounted as an ENV_VAR, it should already be added to the environment and directly accessible (essentially, echo $MY_SECRET_VAR should return the value of MY_SECRET_VAR).
Additional context to reproduce
Create a simple ContainerTask and have it sleep 2h as its entrypoint
Add secrets to it (see the example above; make sure credentials are configured correctly)
Register the ContainerTask to a Flyte cluster / sandbox
SSH into the container and check /etc/secrets and /var/secrets
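The final check in the steps above could be scripted along these lines once inside the container. This is a minimal sketch using only the standard library: the two directories come from this report, the `_FSEC_` prefix is assumed from the error message, and both helper names are hypothetical.

```python
import os
from pathlib import Path


def list_mounted_secrets(roots=("/etc/secrets", "/var/secrets")):
    """Map each existing root directory to the sorted files found under it."""
    found = {}
    for root in roots:
        base = Path(root)
        if base.is_dir():
            found[root] = sorted(
                p.relative_to(base).as_posix()
                for p in base.rglob("*")
                if p.is_file()
            )
    return found


def list_secret_env_vars(environ=None, prefix="_FSEC_"):
    """Names of environment variables that start with the assumed prefix."""
    environ = os.environ if environ is None else environ
    return sorted(name for name in environ if name.startswith(prefix))


if __name__ == "__main__":
    # Only names are printed, never secret values.
    print("files:", list_mounted_secrets())
    print("env vars:", list_secret_env_vars())
```

An empty result for both would match the behavior described in this issue: the task runs, but no secret is mounted in either form.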
Screenshots
No response
Are you sure this issue hasn't been raised already?
☑︎ Yes
Have you read the Code of Conduct?
☑︎ Yes
flyteorg/flyte