# ask-the-community
t
Hi everyone! I was wondering if there is a good way to easily run multiple workflows in parallel locally, something like
```python
from multiprocessing import Pool

if __name__ == "__main__":
    with Pool(4) as p:
        models = p.map(training_workflow, [{"C": 0.1}, {"C": 0.2}, {"C": 0.3}, {"C": 0.4}])
```
for the workflow in the getting started tutorial (https://docs.flyte.org/projects/cookbook/en/latest/index.html). This does not work because apparently the workflows don't play nice with pickling:
```
_pickle.PicklingError: Can't pickle <function training_workflow at 0x7fccd751eb60>: it's not the same object as __main__.training_workflow
```
I would like to do this because I'm writing a CLI tool that does some data manipulation on a bunch of files using flexible pipelines, and flyte seems like a very nice way of defining these, with the type checking between tasks etc. Eventually these pipelines should also run remotely, but for now my focus is still on running things locally. I would also like to avoid requiring the users of the tool to spin up a demo flyte cluster. Maybe this is a bad idea and flyte is the wrong tool for this job, but if that's the case feel free to tell me too.
j
You can just submit the workflows; if you don't set `wait` on the execution it won't hang. So you can submit all your work in a loop and then monitor the execution objects at the end
t
Thanks! Sorry if I'm misunderstanding, but that would only work with `FlyteRemote.execute`, right? I would like to be able to do this locally, without running on a cluster
j
ohh right 🤦 didn't read carefully enough. Hmm, I'm not sure if that is possible, I haven't found a way yet. Do you think you could wrap the workflow in another python function and parallelize that? They are all standalone workflows, right?
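For reference, the wrapping idea could look something like the sketch below. `train_stub` and `run_one` are made-up names standing in for the real `training_workflow` (flytekit isn't imported here); the key constraint is that `Pool` pickles its target by module-level name, so the wrapper must live at module scope:

```python
from multiprocessing import Pool

def train_stub(hyperparams):
    # hypothetical stand-in for training_workflow; returns a fake "model"
    return {"C": hyperparams["C"], "score": 1.0 - hyperparams["C"]}

def run_one(hyperparams):
    # module-level wrapper: in the real tool this would call the
    # flyte workflow, e.g. training_workflow(**hyperparams)
    return train_stub(hyperparams)

if __name__ == "__main__":
    with Pool(4) as p:
        models = p.map(run_one, [{"C": c} for c in (0.1, 0.2, 0.3, 0.4)])
```

Whether this works with an actual `@workflow`-decorated function depends on the pickling issue discussed below, but the pattern itself is standard multiprocessing.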
t
No worries, thanks for taking the time to help! For now I would like to have one workflow and run it on multiple inputs in parallel locally. I could wrap it in a function and try it like that, but I think non-top-level functions aren't picklable either; that's an issue I've run into before. I'll give it a try though!
```
ValueError: TaskFunction cannot be a nested/inner or local function. It should
be accessible at a module level for Flyte to execute it. Test modules with names
beginning with `test_` are allowed to have nested tasks. If you're decorating
your task function with custom decorators, use functools.wraps or
functools.update_wrapper on the function wrapper. Alternatively if you want to
create your own tasks with custom behavior use the TaskResolverMixin
```
Yeah, can't have them inside other functions sadly
j
oh, I also remember you can reference the underlying function, not sure it would work for workflows
I guess then it's not really a workflow but just a python method; you might be able to access it like `._workflow_function`
t
That does give me access to the underlying function! Just as a small test I tried pickling it directly, but that gives me the same error as the first attempt
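The "same error" makes sense if you consider how `pickle` serializes plain functions: by module-level name. A toy decorator below reproduces the effect without flytekit (assumption: the real `@workflow` decorator similarly rebinds the name to a wrapper object that keeps the function as `_workflow_function`):

```python
import pickle

def fake_workflow(fn):
    # toy stand-in for flytekit's @workflow decorator: it rebinds the
    # module-level name to a wrapper object holding the original function
    class Wrapper:
        def __init__(self, f):
            self._workflow_function = f
        def __call__(self, *args, **kwargs):
            return self._workflow_function(*args, **kwargs)
    return Wrapper(fn)

@fake_workflow
def training_workflow(C):
    return {"C": C}

def try_pickle(obj):
    try:
        pickle.dumps(obj)
        return "ok"
    except Exception as e:
        return type(e).__name__

# pickle looks up training_workflow by name in its module, finds the
# Wrapper instead of the raw function, and refuses: "it's not the same
# object as __main__.training_workflow"
result = try_pickle(training_workflow._workflow_function)
```

So the failure isn't specific to flyte; any decorator that replaces the function with a different object breaks pickling-by-reference.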
j
hmm, then it might not be a flyte issue. I don't use Pool and map, so I'm not sure what the issue is there; maybe joblib might have a better interface
t
Yeah, I'll try some different multiprocessing libraries. I think the default one just has trouble pickling things sometimes, and there are alternatives which do better
k
You can shell-execute `pyflyte run` instead of multiprocessing
And you are good to go
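A sketch of that approach: launch one `pyflyte run` subprocess per input and wait for them all. The file name, workflow name, and `--C` flag are placeholders for whatever your project actually uses:

```python
import subprocess

def run_parallel(commands):
    # launch every command as its own OS process, then wait for all;
    # each `pyflyte run` would execute the workflow locally in that process
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [p.wait() for p in procs]

# hypothetical invocations: one local `pyflyte run` per hyperparameter
cmds = [
    ["pyflyte", "run", "example.py", "training_workflow", "--C", str(c)]
    for c in (0.1, 0.2, 0.3, 0.4)
]
# run_parallel(cmds)  # left commented: requires flytekit on the PATH
```

This sidesteps pickling entirely, since each workflow runs in a fresh interpreter, at the cost of collecting results via files or stdout rather than Python objects.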
j
would it return an execution object to wait on, or will it just submit them?
k
This is local
It will hang till completion