# flytekit
#2720 [Core feature] Add Outputs() as the idiom for multiple outputs, to avoid user confusion on NamedTuple-vs-dataclass

Issue created by jdanbrown

### Motivation: Why do you think this is important?

(Copying out a feature request from Slack; hopefully I captured enough context here.)

I now understand the (important) distinction between dataclasses and `NamedTuple` in flytekit:

• `NamedTuple` : output :: kwargs : input. This makes a lot of sense, because you want multiple outputs, with names
• `NamedTuple` cannot be used as a datatype, only as a return type to mean "multiple outputs"
• `dataclass` is a normal datatype that you can use wherever you like. It doesn't mean "multiple outputs"

What's confusing is that the way a Python user declares a datatype using `@dataclass` or `NamedTuple` is basically the same:

```python
@dataclass
class Config:
    epochs: int
    cv_splits: int
    ...

class Config(NamedTuple):
    epochs: int
    cv_splits: int
    ...
```

So it's a very easy trap to think they're basically interchangeable, and then get confused and/or frustrated when flyte behaves in very different ways depending on which one you used.

My team is anticipating adding the rest of our team as flyte users (~8 people), as well as handfuls more teams (~5–10 teams), and we think this is an important friction to get ahead of, with a simple recommendation and happy path.

### Goal: What should the final outcome look like, ideally?

Library pseudocode

• Define this once
• Document/explain to users to use `Outputs` instead of `NamedTuple` for task outputs

```python
# Outputs is like NamedTuple except:
# - It fills in the type name for you -- it's a nuisance parameter, and flyte ignores it
# - You use it inline instead of inheriting from it, for both type and value usages
# - TODO Add metaclass stuff to make this code actually work as a type (and a value)
Outputs = lambda **kwargs: NamedTuple("Outputs", **kwargs)
```

Example user code:

```python
from wherever import Outputs

@dataclass
class Config: ...

@dataclass
class TrainStats: ...

@task
def evaluate_model(
    config: Config,  # A user-defined dataclass
    model: tf.keras.Model,  # Some type from a library
    metrics: List[str],  # A normal python datatype
) -> Outputs(  # Use Outputs() inline as a type instead of declaring a NamedTuple
    success: bool,  # A normal python datatype
    stats: TrainStats,  # A user-defined dataclass
    thresholds: np.ndarray  # Some type from a library
):
    ...
    return Outputs(  # Also use Outputs() as a value, matching the type above
        success=...,
        stats=...,
        thresholds=...,
    )

# Simple tasks can ofc still return single outputs too
# - With no name, i.e. flyte's default o1 naming
@task
def sample_train_data(X: pd.DataFrame) -> pd.DataFrame:
    ...
```

### Describe alternatives you've considered

.
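For comparison, here is a minimal sketch of how close plain Python gets today to the `Outputs` idea from the Goal section. It is an illustration only: the `Outputs` helper and the `EvalOutputs` name are hypothetical, not flytekit API, and the snippet deliberately omits `@task` and does not check how flytekit treats the resulting type. The helper takes `name=type` keyword arguments, builds a `NamedTuple` type through the functional form, and fills in the nuisance type name. The remaining gap is the one the TODO above mentions: the type has to be bound to a name once so that the same object can be used both as the return annotation and to construct the value.

```python
from dataclasses import dataclass
from typing import Any, List, NamedTuple


def Outputs(**fields: Any):
    """Hypothetical helper: build a NamedTuple type from name=type keywords.

    The type name "Outputs" is filled in automatically so callers never
    have to invent one.
    """
    return NamedTuple("Outputs", list(fields.items()))


@dataclass
class TrainStats:
    # Placeholder field, invented for this example.
    accuracy: float


# Bind the generated type once so the annotation and the return value
# refer to the same class.
EvalOutputs = Outputs(success=bool, stats=TrainStats, thresholds=List[float])


def evaluate_model(metrics: List[str]) -> EvalOutputs:
    # Construct the value with the same field names declared above.
    return EvalOutputs(
        success=True,
        stats=TrainStats(accuracy=0.9),
        thresholds=[0.5],
    )


result = evaluate_model(["auc"])
success, stats, thresholds = result  # still an ordinary tuple underneath
```

Because `EvalOutputs` is an ordinary `NamedTuple` type, callers can read fields by name (`result.stats`) or unpack them positionally. What this sketch cannot do is the fully inline `-> Outputs(...)` plus `return Outputs(...)` pairing requested above, since each call to the helper creates a new, distinct type.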
### Propose: Link/Inline OR Additional context

No response

### Are you sure this issue hasn't been raised already?

• Yes

### Have you read the Code of Conduct?

• Yes

flyteorg/flyte