# ask-the-community
m
Is there a mechanism by which we can pass a global key-value map (e.g. a config/context map) when executing a launch plan, such that it is then accessible to all the children tasks? One such impl from our side might look like:
• Pass the global key-value map explicitly as input to the workflow
• The first task of the workflow writes it to a global external persistence, e.g. S3
• Subsequent tasks can then read the map from that persistence
Question: instead of our external persistence, is there something in Flyte we can rely on here?
p
If none of the built-in config options suit your fancy, I suppose you could save your config as a FlyteFile and have some convenience tasks that read/write to it. It will be more global than most of the aforementioned config options but more native than dealing with S3 directly, if that's what you're currently doing.
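For illustration, a minimal sketch of that FlyteFile idea might look like the following (the task names and the JSON encoding are assumptions, not something from this thread):
Copy code
import json
import tempfile
from typing import Dict

from flytekit import task
from flytekit.types.file import FlyteFile


@task
def write_config(config: Dict[str, str]) -> FlyteFile:
    # Serialize the map once; Flyte uploads the file to its blob store.
    out = tempfile.NamedTemporaryFile(mode="w", suffix=".json", delete=False)
    json.dump(config, out)
    out.close()
    return FlyteFile(out.name)


@task
def read_config(config_file: FlyteFile) -> str:
    # Any downstream task that receives the FlyteFile can load the same map.
    with open(config_file.download()) as f:
        config = json.load(f)
    return config.get("some_key", "")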
m
ah this config is more like business logic related to the code in the workflow, not infrastructural (compile/registration) config
i see. FlyteFile: the problem with this is that for task code to access this config from the FlyteFile, the code has to explicitly propagate the object to all the subsequent tasks, right?
p
Oh I see, you want to load like a k8s ConfigMap or something to control the underlying infra?
m
ah nope.. explicitly not infra configuration.. actually we can address our need if we are diligent about passing it explicitly, as in our Naive solution 1:
Copy code
@task
def foo(arg1, arg2, config):
    ...

@task
def bar(arg1, arg2, config):
    ...

@workflow
def wf():
    config = get_conf()
    foo(arg1=..., arg2=..., config=config)
    bar(arg1=..., arg2=..., config=config)
p
Yes I believe you'd have to explicitly pass those values unwrapped from the FlyteFile to every task that needs access
m
what I'm wondering is.. Naive solution 2:
Copy code
@custom_workflow
def foo(your, input, arg, config_to_serialize):
    your_normal_code(your, input, arg)

def custom_workflow(fn):
    # serialize the config object to S3,
    # then dispatch the remaining code
    ...

@task
def your_normal_code(your, input, arg):
    # note: the config object isn't explicitly passed as an arg
    ec = access_config("executor_count")
p
Yep that would work. There's also a fair bit of config you can set within the decorator, e.g. @task(container_image="my_image:latest")
Sorry my comment was meant for the above, lemme catch up haha
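As a rough illustration of that decorator-level config (the image name and resource values below are made up):
Copy code
from flytekit import Resources, task


@task(
    container_image="my_image:latest",
    requests=Resources(cpu="1", mem="500Mi"),
    retries=2,
)
def heavy_step(x: int) -> int:
    # Per-task infra settings live right on the decorator.
    return x * 2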
m
yah I'd like my config to not be coupled to the task, but dynamic at the workflow level.. hence the config_to_serialize in my Naive solution 2
We are thinking of this as "application-level config" dealing with business logic of the workflow
p
Gotcha - I actually just implemented this but haven't gotten as far as making it hot-swappable (kinda important for config files!).. one sec
So I have a config.yaml in the root of my workflow package, and in the root __init__.py I have:
Copy code
import yaml
from pathlib import Path

ROOT_PATH = Path(__file__).resolve().parent
CONFIG_PATH = ROOT_PATH.joinpath('config.yaml')

with open(CONFIG_PATH, 'r') as f:
    config = yaml.load(f, Loader=yaml.FullLoader)
Then in all my modules I can access it with:
Copy code
from typing import List

from flytekit import dynamic
from flytekit.types.directory import FlyteDirectory
from flytekit.types.file import FlyteFile

from run import config


@dynamic(container_image=config['current_image'])
def process_samples(indir: FlyteDirectory, regs: List[str]) -> FlyteFile:
    ...
So as it stands, there's just a default config that gets baked into the image. When Flyte registers workflows/tasks it will inject that whole package without needing to change the image itself. Would changing config at registration time be granular enough for you?
If you really needed to I suppose you could write a little convenience function that you can drop into tasks that need the functionality to pull a config from somewhere and overwrite the existing one before the rest of task execution
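A rough sketch of such a convenience function, assuming the config lives in S3 as YAML (the bucket, key, and function name are placeholders, not from this thread):
Copy code
import boto3
import yaml


def load_runtime_config(bucket: str = "my-config-bucket", key: str = "workflow-config.yaml") -> dict:
    # Fetch the latest config at task runtime so it can overwrite the
    # defaults that were baked into the image at build time.
    body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"].read()
    return yaml.safe_load(body)


# e.g. at the top of a task that needs the hot-swapped values:
# config.update(load_runtime_config())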
m
Thanks so much Pryce for sharing your use case!
If you really needed to I suppose you could write a little convenience function that you can drop into tasks that need the functionality to pull a config from somewhere
Yah we do want config decoupled from the image.. (These configs might specify how we want to configure resource settings for some methods in children tasks, and we want to execute the same workflow with different configs)
Thanks for engaging with my question!
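For what it's worth, one hedged sketch of driving per-task resource settings from a workflow-level config uses with_overrides inside a dynamic workflow (the config keys and task names below are invented for illustration):
Copy code
from typing import Dict

from flytekit import Resources, dynamic, task


@task
def train(n: int) -> int:
    return n * 2


@dynamic
def run_with_config(n: int, config: Dict[str, str]) -> int:
    # Resource requests come from the config supplied at execution time,
    # so the same workflow can run with different settings per launch.
    return train(n=n).with_overrides(
        requests=Resources(cpu=config["train_cpu"], mem=config["train_mem"])
    )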
p
My pleasure! I'm still figuring this stuff out as I go so I'm sure there are better ways of accomplishing this stuff. Hey if you wouldn't mind doing me a favor and shooting me a message when you settle on a solution, I'm sure I'll need to do something similar down the line 🙂