question about nested/repeated fields for structur...
# flytekit
f
question about nested/repeated fields for structured datasets: they don't appear to be supported. is this a backend limitation, or one that exists only in flytekit?
f
f
might just be on the flytekit side. do you have a code example of defining this? tried using named tuples, kwtypes, etc. for nested fields, to no avail
or should it be another structured dataset?
hang on let me share some code 🙂
none of the following work with flytekit
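from typing import Annotated, NamedTuple

import pandas as pd
from flytekit import StructuredDataset, kwtypes, task

# Levels = Annotated[StructuredDataset, kwtypes(level1=str, level2=str)] # TypeError: unhashable type: 'collections.OrderedDict'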
# Levels = NamedTuple( # AssertionError: type <class 'onemodel.models.levels'> is currently not supported by StructuredDataset
#     "levels",
#     level1=str,
#     level2=str
# )
Levels = kwtypes(  # TypeError: unhashable type: 'collections.OrderedDict'
    level1=str,
    level2=str
)
Schema = Annotated[StructuredDataset, kwtypes(age=int, levels=Levels)]

@task
def mytask() -> Schema:
    return pd.DataFrame({
        "age": [1],
        "levels": [
            {"level1": "1", "level2": "2"}
        ]
    })
t
currently, we provide support for simple types, but it should be possible to support nested types. cc @glamorous-carpet-83516
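Until nested types are supported, one possible workaround is to flatten the nested column so that every column has a simple type. The sketch below is plain Python and does not use flytekit; `flatten_record`, `flat_schema`, and the `_` separator are hypothetical names chosen for illustration, not part of any library.

```python
from typing import Any, Dict, List


def flatten_record(record: Dict[str, Any], sep: str = "_", prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts into one level, joining keys with `sep` (hypothetical helper)."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into the nested struct and prefix its keys with the parent name.
            flat.update(flatten_record(value, sep=sep, prefix=name))
        else:
            flat[name] = value
    return flat


def flat_schema(record: Dict[str, Any]) -> Dict[str, type]:
    """Derive a column-name -> simple-type mapping from one example record."""
    return {name: type(value) for name, value in flatten_record(record).items()}


# Same shape as the example in this thread: one nested "levels" struct per row.
rows: List[Dict[str, Any]] = [{"age": 1, "levels": {"level1": "1", "level2": "2"}}]

flat_rows = [flatten_record(r) for r in rows]
schema = flat_schema(rows[0])

print(flat_rows)
print(schema)
```

The flattened rows could then be loaded with `pd.DataFrame(flat_rows)`, and the derived mapping passed as `kwtypes(**schema)`, assuming flat columns named `levels_level1` and `levels_level2` are acceptable downstream. pandas' `pd.json_normalize` performs a similar flattening for lists of nested dicts.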
g
it’s a known issue. it’s doable, just no one has implemented it yet. more and more dataframe libraries and databases support nested types now, so I think it makes sense to add that.
f
seems straightforward. what's the best way to get this rolling? open an issue?
also, is this just a fix in the root transformer, or would we need to update each encoder/decoder?
@glamorous-carpet-83516 thoughts on the above?
g
@famous-businessperson-24711 I will escalate this issue. https://github.com/flyteorg/flyte/issues/4241 you already created one before.
which structured dataset decoder are you using? pandasToParquetDecoder? let us know and we can work on that one first.
f
pandas bq 🙂
g
got it, we will work on that first.
f
Thanks!!
g
cc @thankful-journalist-40373 ^^^
t
@glamorous-carpet-83516 Got it!