# announcements
Many thanks to @Guillaume Perchais and the Spotify team for publishing this wonderful blog post about why they made the switch to Flyte: "Why We Switched Our Data Orchestration Service". Much appreciated, and always a pleasure to work with you! Flyte Team
We can't thank you enough for helping us on this journey! 🙏
Very nice post, thank you 🙇 I was wondering about this part: “Uses Task as a first-class citizen, making it easy for engineers to share and reuse tasks/workflows.” What exactly does that mean in your case? Can I write a Python workflow and reference an already registered task, for example?
Exactly, @Stephen. Before Flyte we shipped infrastructure-related functionality through libraries. At our scale (20k+ workflows and 1000+ repos), libraries create a whole set of issues, especially around maintenance and backwards compatibility. It's difficult for developers to keep up with all the libraries and their latest versions, and it's also difficult for us as platform providers, because (among other things) we need to replicate functionality between languages (Python/Java) and our systems need to stay backwards compatible across multiple versions. With Flyte, we build common infrastructure functionality into tasks and/or launch plans and ship them to users through references. Instead of shipping code in libraries, we ship a very thin reference to the project/task we published. This is sort of revolutionary for us. Our users' repos shrank, time spent on maintenance went down, and it's easier to keep everyone on the latest and greatest functionality. We don't need to worry about conflicts with pip/mvn/sbt: as long as the task works and the interface remains the same, our users can use it. Examples include interacting with various metadata services, reading or pushing data to buckets/tables, configuring test runs, etc.
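A minimal sketch of the "thin reference" idea described above, not Spotify's actual code. The user's repo holds only an identifier (project, domain, name, version), while the implementation is published by the platform team. Every name here (`TaskReference`, `REGISTRY`, `register`, `push_to_bucket`) is hypothetical, and an in-memory dict stands in for the Flyte backend. In flytekit itself, this role is played by the `reference_task` decorator.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Hypothetical stand-in for the backend's registered tasks, keyed by
# (project, domain, name, version). Not part of any real Flyte API.
REGISTRY: Dict[Tuple[str, str, str, str], Callable[..., str]] = {}


def register(project: str, domain: str, name: str, version: str):
    """Platform side: publish an implementation under a versioned identifier."""
    def decorator(fn: Callable[..., str]) -> Callable[..., str]:
        REGISTRY[(project, domain, name, version)] = fn
        return fn
    return decorator


@dataclass(frozen=True)
class TaskReference:
    """User side: the only thing the user's repo ships. No implementation."""
    project: str
    domain: str
    name: str
    version: str

    def __call__(self, **kwargs) -> str:
        # Resolved against the registry at call time, so the user's repo
        # never imports the platform's code or its dependencies.
        impl = REGISTRY[(self.project, self.domain, self.name, self.version)]
        return impl(**kwargs)


# Platform team publishes the task (hypothetical name and behaviour).
@register("platform", "production", "push_to_bucket", "v1")
def _push_to_bucket(bucket: str, path: str) -> str:
    return f"{bucket}/{path}"


# User repo: a thin reference, nothing else.
push_to_bucket = TaskReference("platform", "production", "push_to_bucket", "v1")

result = push_to_bucket(bucket="my-bucket", path="out/data.parquet")
```

As long as the keyword arguments and return type stay the same, the platform team can change `_push_to_bucket` without the user's repo changing at all, which is the property the message above relies on.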
Thanks a lot @Babis Kiosidis for your very detailed answer! 🙇 It’s very cool to think about it that way. I have to admit that we were thinking of building libraries that our users could then use to perform similar tasks, but that doesn’t really scale once you have thousands of repos and workflows. Have you already given a presentation about this at the OSS sync-up? I’d be very interested in learning more, as it’s something we’ll have to deal with soon-ish, I guess.
I can't remember if we discussed this specific design in the OSS sync before. We could probably arrange something if we haven't already.
so @Babis Kiosidis / @Guillaume Perchais I think we are trying to see if one of you is going to talk at an oss sync in April. cc @Stephen
Also @Stephen, sad that you didn't know about this feature: Reference Tasks
We learned about it this morning actually when we tried to figure out how we could do that in the future.
Would be amazing to get some real-world insights into how you use reference tasks @Babis Kiosidis, lessons learnt, what to do/what not to do, etc
Is there a diagram depicting the relationship between Docker, the common infra/platform lib, user code, and the thin reference?
It's still not clear how you handle infra/platform upgrades.
ohh I understand the question now - but let me explain how the data flows - this doc should explain
for the reference code, you are right, we do not have a diagram
we should write one
thank you @Joseph Wang for the suggestion
I see a lot of interest in this. Maybe we can do a deep dive in one of the community calls as well.
We don't have a diagram to share, sorry. What we do is ship a very thin library that contains the references and uses fetch to pull the latest available version during the branch build.
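A rough sketch of what "pull the latest version during the branch build" could look like, under stated assumptions. The thread doesn't say how "latest" is determined, so this assumes dotted numeric version strings; Flyte versions are arbitrary strings, and a real build would query the backend rather than a dict. `PUBLISHED`, `fetch_latest`, and `pin_references` are hypothetical names.

```python
from typing import Dict, List, Tuple

# Hypothetical listing of published versions per task name. A real build
# would ask the backend for this instead of reading a dict.
PUBLISHED: Dict[str, List[str]] = {
    "push_to_bucket": ["1.0.0", "1.2.0", "1.10.0"],
}


def version_key(version: str) -> Tuple[int, ...]:
    """Compare versions numerically so that 1.10.0 sorts after 1.2.0."""
    return tuple(int(part) for part in version.split("."))


def fetch_latest(name: str) -> str:
    """Return the newest published version of a task (assumed numeric scheme)."""
    versions = PUBLISHED[name]
    if not versions:
        raise LookupError(f"no published versions for {name!r}")
    return max(versions, key=version_key)


def pin_references(names: List[str]) -> Dict[str, str]:
    """Run once per branch build: resolve each reference to a concrete version."""
    return {name: fetch_latest(name) for name in names}


pinned = pin_references(["push_to_bucket"])
```

Resolving at build time rather than at run time means each build is pinned to one concrete version, while the next build picks up whatever the platform team has published since.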
@Babis Kiosidis and gang, let's work on a diagram
Sounds good, let me get back to you on that 👍
ohh man, thank you 🙂