Thread
#ask-the-community

    Mike Carley

    1 week ago
    How do you guys typically get your data sets? Is it mostly just queries from existing RDBMS?

    Ketan (kumare3)

    1 week ago
    Hi Mike, thank you for joining. Folks get datasets from data warehouses, blob stores, and RDBMSs.

    Mike Carley

    1 week ago
    That is what I figured. So do you think most folks here typically build the datasets themselves, or does a DBA or data engineer typically set them up?
    I would like to know whether schema drift or general data drift tends to be a problem. I could see a single column name change throwing off the whole task.

    Ketan (kumare3)

    1 week ago
    That is true, but in Flyte's case it will fail
    if you model it well

    Mike Carley

    1 week ago
    Is there a way to model it in such a way that a change to a column name or data type won’t cause a breakage?

    Ketan (kumare3)

    1 week ago
    It will cause a break if you use typed columns, but that is optional
    All types in a structured dataset or data frame are optional
    Cc @Yee
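
To make the typed-versus-untyped trade-off concrete, here is a minimal plain-Python sketch. It does not use flytekit; `check_columns`, `SchemaMismatch`, and the sample column names are hypothetical and only illustrate the idea that declared column types turn a silent rename into an immediate failure, while leaving the declaration out means nothing is checked.

```python
from typing import Any, Optional

Row = dict[str, Any]


class SchemaMismatch(TypeError):
    """Raised when data does not match the declared column types."""


def check_columns(rows: list[Row], declared: Optional[dict[str, type]] = None) -> list[Row]:
    """Return rows unchanged, or raise if they violate the declared columns.

    declared=None models the untyped case: nothing is checked.
    """
    if declared is None:
        return rows
    for i, row in enumerate(rows):
        for name, expected in declared.items():
            if name not in row:
                raise SchemaMismatch(f"row {i}: missing column {name!r}")
            if not isinstance(row[name], expected):
                raise SchemaMismatch(
                    f"row {i}: column {name!r} is {type(row[name]).__name__}, "
                    f"expected {expected.__name__}"
                )
    return rows


# Hypothetical data: the upstream supplier renamed customer_id to cust_id.
upstream = [{"customer_id": 1, "amount": 9.5}]
renamed = [{"cust_id": 1, "amount": 9.5}]

check_columns(upstream, {"customer_id": int, "amount": float})  # passes
check_columns(renamed)  # untyped: passes, the rename goes unnoticed here

try:
    check_columns(renamed, {"customer_id": int, "amount": float})
except SchemaMismatch as err:
    print(err)  # row 0: missing column 'customer_id'
```

The untyped call succeeds, so any breakage would surface later, wherever the missing column is first used.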

    Mike Carley

    1 week ago
    Right, but I might be wrong so please correct me. If you don’t type it, how do you perform things like math operations? It would just default to a string, right?

    Yee

    1 week ago
    hop on a call? @Mike Carley

    Mike Carley

    1 week ago
    Maybe. I don’t think it’s really worth your time. I am just trying to figure out how MLOps platforms like this one cope with data suppliers that change their stuff up all the time, and without telling anyone

    Yee

    1 week ago
    i don’t mind if you don’t mind @Mike Carley
    sorry got pulled away to investigate another bug
    would you have some time later this week for 15-20 mins?

    Katrina P

    1 week ago
    FWIW there's an existing Flyte integration with Great Expectations: https://docs.flyte.org/projects/cookbook/en/stable/auto/integrations/flytekit_plugins/greatexpectations/index.html Having been in both the data engineering and data science realms, it's been hard to set up a working contract between source and sink. Great Expectations is a great project that is working on tests for exactly that. I think the power of Flyte is that tasks are statically typed, so you won't accidentally let the workflow go through. I think the proper behavior is for the workflow to fail when the data changes.
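
As a rough illustration of a source-to-sink contract that fails on data drift rather than schema drift, here is a plain-Python sketch. It does not use the Great Expectations API or the Flyte plugin; the `EXPECTATIONS` table, the function names, and the "orders" columns are all hypothetical. It only shows the pattern of validating values first and raising so the downstream step never runs.

```python
from typing import Any, Callable

Row = dict[str, Any]
Expectation = Callable[[Row], bool]

# Hypothetical expectations for an "orders" dataset; names and bounds are illustrative only.
EXPECTATIONS: dict[str, Expectation] = {
    "amount is not null": lambda r: r.get("amount") is not None,
    "amount is non-negative": lambda r: r.get("amount") is not None and r["amount"] >= 0,
    "currency is known": lambda r: r.get("currency") in {"USD", "EUR"},
}


def failed_expectations(rows: list[Row]) -> dict[str, int]:
    """Count, per expectation, how many rows violate it."""
    failures: dict[str, int] = {}
    for row in rows:
        for name, check in EXPECTATIONS.items():
            if not check(row):
                failures[name] = failures.get(name, 0) + 1
    return failures


def validate_or_fail(rows: list[Row]) -> list[Row]:
    """Pass rows through unchanged, or raise so the downstream step never runs."""
    failures = failed_expectations(rows)
    if failures:
        raise ValueError(f"data contract violated: {failures}")
    return rows


good = [{"amount": 10.0, "currency": "USD"}]
drifted = [{"amount": -3.0, "currency": "usd"}]  # same columns, different values

validate_or_fail(good)  # passes
print(failed_expectations(drifted))
# {'amount is non-negative': 1, 'currency is known': 1}
```

Here the column names and types are unchanged, so a purely type-based check would pass; only the value-level expectations catch the change.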

    Mike Carley

    1 week ago
    It’s all good @Yee. I have some time tomorrow. Does that work for you?

    Yee

    1 week ago
    yeah sure, what tz are you in?
    meetings here and there but pretty open for most of the day

    Mike Carley

    4 days ago
    @Niels Bantilan