# flytekit
d
question about nested/repeated fields for structured datasets. They don't appear to be supported. is this a backend limitation or just one that exists in flytekit?
k
d
might just be on the flytekit side. do you have a code example of defining this? tried using named tuples, kwtypes, etc. for nesting, to no avail
or should it be another structured dataset?
hang on let me share some code 🙂
none of the following work with flytekit
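from typing import Annotated, NamedTuple

import pandas as pd
from flytekit import StructuredDataset, kwtypes, task

# Levels = Annotated[StructuredDataset, kwtypes(level1=str, level2=str)] # TypeError: unhashable type: 'collections.OrderedDict'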
# Levels = NamedTuple( # AssertionError: type <class 'onemodel.models.levels'> is currently not supported by StructuredDataset
#     "levels",
#     level1=str,
#     level2=str
# )
Levels = kwtypes(  # TypeError: unhashable type: 'collections.OrderedDict'
    level1=str,
    level2=str
)
Schema = Annotated[StructuredDataset, kwtypes(age=int, levels=Levels)]

@task
def mytask() -> Schema:
    return pd.DataFrame({
        "age": [1],
        "levels": [
            {"level1": "1", "level2": "2"}
        ]
    })
s
currently, we provide support for simple types, but it should be possible to support nested types. cc @Kevin Su
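Since only simple column types are supported at this point in the thread, one possible stopgap is to flatten the nested field into top-level columns before building the dataframe. The sketch below is only an illustration: `flatten_record` and the `_` separator are hypothetical names chosen here, not part of flytekit, and the approach has not been verified against any particular structured dataset encoder.

```python
from typing import Any, Dict


def flatten_record(record: Dict[str, Any], parent: str = "", sep: str = "_") -> Dict[str, Any]:
    """Flatten nested dicts into one level, joining keys with `sep` (hypothetical helper)."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            # Recurse into the nested struct and lift its fields to the top level.
            flat.update(flatten_record(value, name, sep))
        else:
            flat[name] = value
    return flat


# Same row shape as the example above: one simple column and one nested column.
rows = [{"age": 1, "levels": {"level1": "1", "level2": "2"}}]
flat_rows = [flatten_record(row) for row in rows]
print(flat_rows)
```

Under that assumption, `flat_rows` could be passed to `pd.DataFrame(...)` and the task annotated with simple types only, e.g. `kwtypes(age=int, levels_level1=str, levels_level2=str)`. The nested structure is lost in the schema, so this is a workaround rather than real nested-type support.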
k
it’s a known issue. it’s doable, just no one has implemented it yet. more and more dataframe libraries and databases support nested types now, so I think it makes sense to add that.
d
seems straightforward. what's the best way to get this rolling? open an issue?
also is this just a fix in the root transformer, or would we need to update each encoder/decoder?
@Kevin Su thoughts on the above?
k
@Dylan Wilder I will escalate this issue. https://github.com/flyteorg/flyte/issues/4241 you already created one before.
which structured dataset decoder are you using? pandasToParquetDecoder? Let us know, we can work on that first.
d
pandas bq 🙂
k
got it, we will work on that first.
d
Thanks!!
k
cc @Austin Liu (Austinnn) ^^^
a
@Kevin Su Got it!