Thread
#flytekit
    Varun Kulkarni

    4 weeks ago
    👋 is it possible to create an annotated FlyteSchema object that also includes metadata on whether a particular column is nullable vs required, along with its type?
    Yee

    4 weeks ago
    can you take a look at the structured dataset object instead? structured datasets support a broader set of types.
    Varun Kulkarni

    4 weeks ago
    thanks for the response - can you share an example of how i can annotate a structured dataset to also include info on whether null values are allowed? i don't see anything on this topic in the documentation
    i guess the example uses kwtypes, which upon introspecting the code looks like it is relying on Python's native type system, so a different phrasing of this question is: do Structured Datasets / FlyteSchemas allow for Optional / Union type annotations in their column definitions?
    Dylan Wilder

    4 weeks ago
    alt: py 3.10 supports js style unions
    field: str | None
    Yee

    Yee

    4 weeks ago
    you should be able to use optional types in structured dataset columns yes.
    though not for flyteschema - the column types there are much more limited.
    Kevin Su

    4 weeks ago
    @Varun Kulkarni is this what you want?
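    import typing

    import pandas as pd
    from flytekit import kwtypes, task
    from typing_extensions import Annotated

    # Age is declared Optional; Name and Height are not.
    cols = kwtypes(Name=str, Age=typing.Optional[int], Height=int)

    @task
    def get_df(a: int) -> Annotated[pd.DataFrame, cols]:
        return pd.DataFrame({"Name": ["Tom", "Joseph"], "Age": [a, None], "Height": [160, 178]})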
    For now, a column is nullable even if you don’t annotate it with typing.Optional.
    Varun Kulkarni

    4 weeks ago
    yeah that's what i'm looking for! for now we're not relying on flyte to perform the actual enforcement of types - our application can handle that after serialization / deserialization - but it would be great if in the future we can have some sort of guarantee that e.g. a structured dataset does not contain any nulls after deserialization if the column isn't optional
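The application-side check described above could look roughly like the sketch below. It is not part of flytekit: the helper names (is_optional, is_null, columns_with_unexpected_nulls) are hypothetical, and the column mapping is a plain dict that mirrors the arguments passed to kwtypes. It uses only the standard library and treats None and float NaN as null; other missing-value markers are not handled.

```python
import math
import types
import typing


def is_optional(annotation) -> bool:
    """True if the annotation admits None, e.g. Optional[int] or int | None."""
    origin = typing.get_origin(annotation)
    # types.UnionType exists on Python 3.10+; fall back to typing.Union before that.
    union_types = (typing.Union, getattr(types, "UnionType", typing.Union))
    return origin in union_types and type(None) in typing.get_args(annotation)


def is_null(value) -> bool:
    """Treat None and float NaN as null."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def columns_with_unexpected_nulls(column_types, data):
    """Return names of non-optional columns that contain at least one null.

    column_types: mapping of column name -> type annotation (hypothetical input).
    data:         mapping of column name -> list of values.
    """
    return [
        name
        for name, annotation in column_types.items()
        if not is_optional(annotation)
        and any(is_null(v) for v in data.get(name, []))
    ]


column_types = {"Name": str, "Age": typing.Optional[int], "Height": int}
data = {"Name": ["Tom", "Joseph"], "Age": [25, None], "Height": [160, None]}
print(columns_with_unexpected_nulls(column_types, data))  # ['Height']
```

With a pandas DataFrame, the per-column test could instead be written as df[name].isna().any(), which also covers pandas-specific missing values.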
    Kevin Su

    4 weeks ago
    Have you tried the Pandera plugin? It provides a flexible and expressive interface for defining schemas for tabular data. https://docs.flyte.org/projects/cookbook/en/latest/auto/integrations/flytekit_plugins/pandera_examples/index.html