Terence Kent
01/03/2024, 10:05 PM
@reference_task (or otherwise). We've ended up with many standard tasks, particularly around ETL-style work (e.g. batch capturing data into jsonl or parquet). These are currently placed into a dedicated flyte project for re-usable components. So far, so good. However, many of our tasks use NamedTuple or @dataclass types for inputs or outputs and this leaves us with a question of how to get those data types defined in the projects that reference these common tasks.
For folks who have already handled this situation, do you...
A - Re-define the data types in each project and update them manually if they ever change?
B - Publish a pip package of just the data types and import it in every project that uses them?
C - Avoid custom data types and opt for primitives in shared components?
D - Something else?
Clemente Cuevas
01/03/2024, 10:10 PM
Terence Kent
01/03/2024, 10:11 PM
Clemente Cuevas
01/03/2024, 10:43 PM
Justin Boutwell
01/03/2024, 11:46 PM
Terence Kent
01/03/2024, 11:47 PM
Terence Kent
01/03/2024, 11:54 PM
@reference_task)? Or, do you do all the code-sharing via those internal libraries without publishing tasks/workflows to a flyte project?
Justin Boutwell
01/04/2024, 1:45 PM