How do you guys typically get your data sets? Is i...
# ask-the-community
m
How do you guys typically get your data sets? Is it mostly just queries from existing RDBMS?
k
Hi Mike, thank you for joining. Folks get datasets from data warehouses, blob stores, and RDBMSs
m
That is what I figured. So do you think most folks here typically make it themselves or does like a DBA/Data Engineer typically set them up?
I would like to see if the schema or general data drift tends to be a problem. I could see a single column name change throwing off the whole task
k
That is true, but in Flyte's case it will fail
if you model it well
m
Is there a way to model it in such a way that a change to a column name or data type won’t cause a breakage?
k
It will cause a break if you use typed columns, but that is optional
All column types in structured datasets and data frames are optional
Cc @Yee
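To make the "typed columns are optional" point concrete, here is a minimal sketch in plain Python. It is not Flyte's API; `check_columns`, `SchemaError`, and the sample rows are hypothetical names made up for illustration. The assumption is simply that declared columns get checked and undeclared ones are ignored.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


class SchemaError(ValueError):
    """Hypothetical error type: a declared column is missing or mistyped."""


def check_columns(rows: List[Row], declared: Dict[str, type]) -> None:
    """Check only the columns the caller chose to declare.

    An empty `declared` mapping means "untyped": every row is accepted.
    """
    for i, row in enumerate(rows):
        for name, expected in declared.items():
            if name not in row:
                raise SchemaError(f"row {i}: missing declared column {name}")
            if not isinstance(row[name], expected):
                raise SchemaError(
                    f"row {i}: column {name} is {type(row[name]).__name__}, "
                    f"expected {expected.__name__}"
                )


# Upstream renamed "age" to "age_years" without telling anyone.
drifted = [{"name": "Ada", "age_years": 36}]

# Untyped: nothing is declared, so the rename goes unnoticed here.
check_columns(drifted, declared={})

# Typed: the rename breaks the check, which is the intended failure.
try:
    check_columns(drifted, declared={"name": str, "age": int})
    typed_result = "passed"
except SchemaError as exc:
    typed_result = f"failed: {exc}"
```

The trade-off is the one described above: declaring columns buys you an early failure on a rename or type change, and leaving them undeclared lets the data through unchecked.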
m
Right, but I might be wrong so please correct me. If you don’t type it, how do you perform things like math operations? It would just default to a string, right?
y
hop on a call? @Mike Carley
m
Maybe. I don’t think it’s really worth your time. I am just trying to figure out how MLOps platforms like this one cope with data suppliers that change things up all the time without telling anyone
y
i don’t mind if you don’t mind @Mike Carley
sorry got pulled away to investigate another bug
would you have some time later this week for 15-20 mins?
k
FWIW there's an existing Flyte integration with Great Expectations: https://docs.flyte.org/projects/cookbook/en/stable/auto/integrations/flytekit_plugins/greatexpectations/index.html Having been in both the data engineering and data science realms, I've found it hard to set up a working contract between source and sink. Great Expectations is a great project working on tests for that. I think the power of Flyte is that tasks are statically typed, so you won't accidentally let the workflow go through. I think the proper behavior is for the workflow to fail when the data changes
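A rough sketch of what a source/sink contract with fail-on-change could look like, in plain Python. This is neither the Great Expectations API nor the Flyte plugin; `infer_schema`, `schema_drift`, `require_contract`, and the column names are hypothetical, and the schema is inferred from the first row only to keep it short.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def infer_schema(rows: List[Row]) -> Dict[str, str]:
    """Map each column name to the type name seen in the first row."""
    first = rows[0] if rows else {}
    return {name: type(value).__name__ for name, value in first.items()}


def schema_drift(expected: Dict[str, str], observed: Dict[str, str]) -> List[str]:
    """List every difference between the agreed contract and what arrived."""
    problems = []
    for name in sorted(expected.keys() - observed.keys()):
        problems.append(f"missing column: {name}")
    for name in sorted(observed.keys() - expected.keys()):
        problems.append(f"unexpected column: {name}")
    for name in sorted(expected.keys() & observed.keys()):
        if expected[name] != observed[name]:
            problems.append(
                f"type changed: {name} {expected[name]} -> {observed[name]}"
            )
    return problems


def require_contract(expected: Dict[str, str], rows: List[Row]) -> List[Row]:
    """Pass rows through unchanged, or raise so the pipeline step fails."""
    problems = schema_drift(expected, infer_schema(rows))
    if problems:
        raise ValueError("data contract violated: " + "; ".join(problems))
    return rows


# Hypothetical contract agreed with the data supplier.
contract = {"user_id": "int", "amount": "float"}

# The supplier started sending "amount" as text.
batch = [{"user_id": 1, "amount": "12.50"}]
report = schema_drift(contract, infer_schema(batch))
# report == ["type changed: amount float -> str"]
```

Calling `require_contract(contract, batch)` at the top of a step would raise on this batch, so the run stops at the boundary instead of producing results from silently changed data.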
m
It's all good @Yee. I have some time tomorrow. Does that work for you?
y
yeah sure, what tz are you in?
meetings here and there but pretty open for most of the day
m
@Niels Bantilan