# flytekit
#2720 [Core feature] Add Outputs() as the idiom for multiple outputs, to avoid user confusion on NamedTuple-vs-dataclass

Issue created by jdanbrown

Motivation: Why do you think this is important?

(Copying out a feature request from Slack; hopefully I captured enough context here.)

I now understand the (important) distinction between dataclasses and NamedTuple in flytekit:

• NamedTuple : output :: kwargs : input. This makes a lot of sense, because you want multiple outputs, with names.
• NamedTuple cannot be used as a datatype, only as a return type to mean "multiple outputs".
• dataclass is a normal datatype that you can use wherever you like. It doesn't mean "multiple outputs".

What's confusing is that the way a Python user declares a datatype using @dataclass or NamedTuple is basically the same:
from dataclasses import dataclass
from typing import NamedTuple

@dataclass
class Config:
    epochs: int
    cv_splits: int
    ...

class Config(NamedTuple):
    epochs: int
    cv_splits: int
    ...
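The declarations look alike, but the resulting objects do not behave alike. A minimal stdlib-only sketch of the runtime difference follows; flytekit is not involved, and the names `ConfigData` and `ConfigTuple` are made up here so both variants can coexist in one file:

```python
from dataclasses import dataclass, is_dataclass
from typing import NamedTuple


@dataclass
class ConfigData:
    epochs: int
    cv_splits: int


class ConfigTuple(NamedTuple):
    epochs: int
    cv_splits: int


data = ConfigData(epochs=10, cv_splits=5)
tup = ConfigTuple(epochs=10, cv_splits=5)

# A NamedTuple instance is a real tuple: it unpacks positionally and
# exposes its field names through _fields.
epochs, cv_splits = tup
is_tuple = isinstance(tup, tuple)        # True
field_names = ConfigTuple._fields        # ("epochs", "cv_splits")

# A dataclass instance is an ordinary object, not a tuple.
data_is_tuple = isinstance(data, tuple)  # False
data_is_dataclass = is_dataclass(data)   # True
```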
So it's a very easy trap to think they're basically interchangeable, and then get confused and/or frustrated when flyte behaves in very different ways depending on which one you used.

My team is anticipating adding the rest of our team as flyte users (~8 people), as well as handfuls more teams (~5–10 teams), and we think this is an important friction to get ahead of, with a simple recommendation and happy path.

Goal: What should the final outcome look like, ideally?

Library pseudocode:

• Define this once
• Document/explain to users to use Outputs instead of NamedTuple for task outputs
from typing import NamedTuple

# Outputs is like NamedTuple except:
#   - It fills in the type name for you -- it's a nuisance parameter, and flyte ignores it
#   - You use it inline instead of inheriting from it, for both type and value usages
#   - TODO Add metaclass stuff to make this code actually work as a type (and a value)
def Outputs(**kwargs):
    # Pass (name, type) pairs: the keyword form NamedTuple("Outputs", **kwargs)
    # is deprecated since Python 3.13
    return NamedTuple("Outputs", list(kwargs.items()))
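As a rough illustration of the type-and-value usage, here is a minimal stdlib-only sketch. It does not resolve the TODO above: instead of calling `Outputs()` twice, it binds the generated type to a name (`EvalOutputs`, a hypothetical name) and instantiates that. The field names and values are also made up for the example.

```python
from typing import NamedTuple


def Outputs(**fields):
    """Build an anonymous NamedTuple type from name=type keyword arguments."""
    # NamedTuple accepts a list of (name, type) pairs in its functional form.
    return NamedTuple("Outputs", list(fields.items()))


# Type usage: declare the named outputs once.
EvalOutputs = Outputs(success=bool, score=float)


def evaluate_model() -> EvalOutputs:
    # Value usage: instantiate the type that Outputs() returned.
    return EvalOutputs(success=True, score=0.93)


result = evaluate_model()
```

Making a single inline `Outputs(...)` call serve as both the annotation and the return value would need extra machinery to tell the two usages apart, which is presumably what the metaclass TODO refers to.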
Example user code:
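from dataclasses import dataclass
from typing import List

import numpy as np
import pandas as pd
import tensorflow as tf
from flytekit import task

from wherever import Outputs  # Placeholder path for the proposed helper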

@dataclass
class Config:
    ...

@dataclass
class TrainStats:
    ...

@task
def evaluate_model(
    config: Config,         # A user-defined dataclass
    model: tf.keras.Model,  # Some type from a library
    metrics: List[str],     # A normal python datatype
) -> Outputs(               # Use Outputs() inline as a type instead of declaring a NamedTuple
    success=bool,           # A normal python datatype
    stats=TrainStats,       # A user-defined dataclass
    thresholds=np.ndarray,  # Some type from a library
):
    ...
    return Outputs(         # Also use Outputs() as a value, matching the type above
        success=...,
        stats=...,
        thresholds=...,
    )

# Simple tasks can ofc still return single outputs too
#   - With no name, i.e. flyte's default o0 naming
@task
def sample_train_data(X: pd.DataFrame) -> pd.DataFrame:
    ...
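To show how an inline `Outputs(...)` annotation could be consumed, here is a minimal stdlib-only sketch. The `output_names` helper is hypothetical and is not flytekit's implementation; the `"o0"` default name and the simplified task signatures are assumptions made for the example.

```python
import inspect
from typing import List, NamedTuple


def Outputs(**fields):
    """Build an anonymous NamedTuple type from name=type keyword arguments."""
    return NamedTuple("Outputs", list(fields.items()))


def output_names(fn) -> List[str]:
    """Hypothetical helper: derive output names from a return annotation."""
    ret = inspect.signature(fn).return_annotation
    if ret is inspect.Signature.empty or ret is None:
        return []
    # NamedTuple types are tuple subclasses that carry a _fields attribute.
    if isinstance(ret, type) and issubclass(ret, tuple) and hasattr(ret, "_fields"):
        return list(ret._fields)
    # Anything else is treated as one unnamed output (assumed default name).
    return ["o0"]


def evaluate_model(metrics: List[str]) -> Outputs(success=bool, score=float):
    ...


def sample_train_data(rows: List[int]) -> List[int]:
    ...


multi = output_names(evaluate_model)
single = output_names(sample_train_data)
```

Here `multi` holds the field names declared inline, while `single` falls back to the one default name.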
Describe alternatives you've considered

.

Propose: Link/Inline OR Additional context

No response

Are you sure this issue hasn't been raised already?

☑︎ Yes

Have you read the Code of Conduct?

☑︎ Yes

flyteorg/flyte