# flyte-support
l
Hello, I'm in computational biology. I previously used Luigi with its AWS Batch plugin (and modified that plugin extensively to expand its capabilities) for my work. I'm at a new company now, starting from scratch, and Luigi is a mostly dead project, so I'm looking to switch to Flyte and trying to figure out what makes sense architecture-wise. The jobs will be similar to what I ran with Luigi/AWS Batch previously: computational biology/bioinformatics workflows with widely varying time and resource requirements. As an example, a workflow might have one step that requires 20 large compute nodes running 1 job per node for 24 hours. Another step might run 1000 jobs, each using 1 CPU/4 GB of memory and running for 10 minutes. But when a pipeline isn't running, I need 0 compute. Does Flyte make sense for this use case, and if so, does it make sense to farm the jobs out to Flyte on Kubernetes or to Flyte AWS Batch jobs? I'm fairly sure it could be done either way; I'm just not sure whether there are any advantages/disadvantages to doing it a given way.
h
@late-author-46106, welcome to the community! Based on what you described, Flyte is a good fit. As you might have already noticed, Flyte also has support for running tasks on AWS Batch. Thanks to the declarative style of the SDK, kicking off tasks on AWS Batch looks very simple from a client's perspective. You'll need to configure Flyte to know about your AWS account, but that shouldn't be too hard. You mentioned that you don't need compute when pipelines are not running, but keep in mind that Flyte's own components (esp. the control plane and the data plane) will always consume some resources, though not a lot. Here's a component architecture diagram.
👍 1
l
Thanks for the advice. I'm working on getting a proof of concept up and running.
❤️ 1