Hi quick question, I'd like to populate a variable ... (Flyte #flyte-support)

shy-evening-51366

07/04/2023, 11:50 AM

Hi, quick question: I'd like to populate a variable that is used as an environment variable in a task from a configuration input; something like:

FOO = "some_default_value"

@task(environment={"FOO": FOO})
def some_task():
    return "bar"

def populate_vars_from_config(config):
    print(config.keys())
    FOO = config['foo']

@workflow
def some_workflow(config: dict):
    populate_vars_from_config(config)
    some_task()

And run it with

pyflyte run ./some_path some_workflow --config ./config.json

However, that gives me an exception because inside the workflow `config` is a Promise:

Failed with Unknown Exception <class 'AttributeError'> Reason: Error encountered while executing 'some_workflow':
  'Promise' object has no attribute 'keys'

Any suggestions on how to approach this in a better way? 🙂 I guess I could just set them myself within the task, but it would be nice to have them visible in the decorator.

freezing-boots-56761

07/04/2023, 2:07 PM

Task environment variables here are set at registration time, not runtime. Could you just pass the configuration as an input, as opposed to an environment variable, to the second task instead? An alternative method to achieve what you are doing here might be to use a dynamic task, and nest some_task inside that dynamic.
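To illustrate the registration-time point in plain Python: the sketch below uses a hypothetical `toy_task` decorator (a stand-in written for this example, not flytekit's implementation) to show that a decorator argument is evaluated once, when the function is defined, so assigning to `FOO` later has no effect on what the decorator already captured.

```python
FOO = "some_default_value"


def toy_task(environment):
    """Hypothetical stand-in for a task decorator; NOT flytekit's implementation."""
    # Snapshot of the environment, taken once when the decorator expression runs.
    captured = dict(environment)

    def wrap(fn):
        fn.environment = captured
        return fn

    return wrap


@toy_task(environment={"FOO": FOO})  # FOO is read here, at definition time
def some_task():
    return "bar"


def populate_vars_from_config(config):
    # This binds a new local name; the module-level FOO is untouched.
    FOO = config["foo"]


populate_vars_from_config({"foo": "from_config"})

print(some_task.environment)  # {'FOO': 'some_default_value'}
print(FOO)                    # some_default_value
```

Even with `global FOO` in `populate_vars_from_config`, the dictionary captured by the decorator would keep the old value, which is why a value that is only known at runtime has to reach the task another way.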

shy-evening-51366

07/04/2023, 2:30 PM

I'm doing that now, but unfortunately (I think?) I can only pass full objects, not do something like:

@workflow
def some_workflow(config: dict):
    validated_config = validate_config(config=config)
    # Not possible: validated_config is a Promise here, so it cannot be indexed
    some_task(
        some_specific_param=validated_config["some_specific_param"],
    )

Which makes it harder to do testing, since you’re creating dependencies on an entire config object in the task function, rather than only on what is specifically used (the `some_specific_param` parameter).

shy-evening-51366

07/04/2023, 2:40 PM

Hmm looking into the dynamic task now, maybe that is something I can use 👍

👍 1

freezing-boots-56761

07/04/2023, 3:02 PM

it won't work with a dict, but I believe there is a feature coming that will let this work cleanly with dataclasses. @thankful-minister-83577: cc

rich-artist-45699

07/04/2023, 8:57 PM

@dataclass_json
@dataclass
class Config:
    ...

@workflow
def some_workflow(config: Config):
    ...

This should work fine, no?

shy-evening-51366

07/05/2023, 9:47 AM

No, that still gives you an `AttributeError` if you try to access a field of the Config class inside the `some_workflow` function, because it’s a Promise. Using a `dynamic` works for me.

thankful-minister-83577

07/05/2023, 4:21 PM

yeah that doesn’t work right now. we’re thinking of adding that feature though

👍 1

thankful-minister-83577

07/05/2023, 4:44 PM

when you use the output of a task as the input to another task, you’re creating an output reference. output references currently can only refer to an entire output; they can’t index into a list output or into a map. if we add this, we’d also have to support indexing into potentially deeply nested things all the way down. list is harder because it’s harder to control out-of-bounds errors, and those errors would also move from the user side into propeller, so list indexing might not happen. keys are easier though. and being able to bind output references to keys also mostly removes the need for the not-great namedtuple ux
