Anyone know a good way to do non grid eg some kind of bayesi Flyte #announcements


elegant-petabyte-32634

06/29/2022, 11:06 AM

Anyone know a good way to do non-grid (eg. some kind of bayesian) hyperparameter optimization with Flyte, with multiple trials in parallel? (ie. is there some library that makes this easy or do I have to implement most of the optimization stuff myself? eg. a library which spits out parameters to try and I give it back the results would be pretty easy to use with Flyte, rather than the library calling an objective function like hyperopt does)

broad-monitor-993

06/29/2022, 1:30 PM

hi @elegant-petabyte-32634 there’s currently no canonical way of doing this, although I believe it’s technically possible with Flyte + <some bayes opt library>. One would have to use dynamic workflows to collect results and feed them back to the bayesopt sampler for subsequent trials. Do you have a bayesopt library in mind?

elegant-petabyte-32634

06/29/2022, 1:31 PM

i dont have a specific one in mind, i used hyperopt before though

is there a way to limit concurrency with dynamic? // i guess that automatically happens since the sampling is sequential

broad-monitor-993

06/29/2022, 1:59 PM

Using BayesianOptimization (basically using the suggest-evaluate-register loop in the advanced guide) something like this might work.


from typing import Dict, List

from bayes_opt import BayesianOptimization, UtilityFunction
from flytekit import dynamic, task

@task
def black_box_function(point: Dict) -> float:
    ...  # inner training loop here; return the score to maximize

@task
def suggest_points(
    optimizer: BayesianOptimization,
    utility: UtilityFunction,
    concurrency: int,
) -> List[Dict]:
    return [optimizer.suggest(utility) for _ in range(concurrency)]

@task
def register_targets(
    optimizer: BayesianOptimization,
    points: List[Dict],
    targets: List[float],
) -> BayesianOptimization:
    for point, target in zip(points, targets):
        optimizer.register(params=point, target=target)
    return optimizer

@task
def best_point(optimizer: BayesianOptimization) -> Dict:
    # read optimizer.max inside a task, where the optimizer is a real object
    return optimizer.max

@dynamic
def concurrent_trials(points: List[Dict]) -> List[float]:
    # one black_box_function task per suggested point
    return [black_box_function(point=point) for point in points]

@dynamic
def bayesopt(n_iter: int = 5, concurrency: int = 3) -> Dict:
    optimizer = BayesianOptimization(...)
    utility = UtilityFunction(kind="ucb", kappa=2.5, xi=0.0)
    for _ in range(n_iter):
        points = suggest_points(optimizer=optimizer, utility=utility, concurrency=concurrency)
        targets = concurrent_trials(points=points)
        # task inputs are passed as keyword arguments
        optimizer = register_targets(optimizer=optimizer, points=points, targets=targets)
    # return point that maximized the target
    return best_point(optimizer=optimizer)
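For reference, the suggest-evaluate-register loop can be sketched in plain Python without Flyte or bayes_opt. This is only an illustration under stated assumptions: RandomSampler is a hypothetical stand-in that samples uniformly instead of fitting a model, the objective is a toy function, and a thread pool plays the role of the parallel trial tasks.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple


class RandomSampler:
    """Hypothetical stand-in for an optimizer exposing suggest/register/max."""

    def __init__(self, bounds: Dict[str, Tuple[float, float]], seed: int = 0):
        self.bounds = bounds
        self.rng = random.Random(seed)
        self.history: List[Tuple[Dict[str, float], float]] = []

    def suggest(self) -> Dict[str, float]:
        # A Bayesian optimizer would use self.history here; this samples uniformly.
        return {k: self.rng.uniform(lo, hi) for k, (lo, hi) in self.bounds.items()}

    def register(self, params: Dict[str, float], target: float) -> None:
        self.history.append((params, target))

    @property
    def max(self) -> Dict:
        params, target = max(self.history, key=lambda item: item[1])
        return {"target": target, "params": params}


def black_box_function(point: Dict[str, float]) -> float:
    # Toy objective whose maximum (0.0) is at x = 2.
    return -((point["x"] - 2.0) ** 2)


def bayesopt(n_iter: int = 5, concurrency: int = 3) -> Dict:
    sampler = RandomSampler(bounds={"x": (-5.0, 5.0)})
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        for _ in range(n_iter):
            points = [sampler.suggest() for _ in range(concurrency)]
            # list(pool.map(...)) blocks until the whole batch has finished.
            targets = list(pool.map(black_box_function, points))
            for point, target in zip(points, targets):
                sampler.register(params=point, target=target)
    return sampler.max


best = bayesopt()
print(best)
```

Each iteration waits for its whole batch before the sampler sees any result, which is the same batch-synchronous shape as the dynamic workflow above.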

broad-monitor-993

06/29/2022, 2:00 PM

caveat: this extensively uses the PythonPickle type for types that Flyte doesn’t know how to natively handle, like BayesianOptimization and UtilityFunction types

elegant-petabyte-32634

06/29/2022, 2:00 PM

oh great, that's really helpful Niels thank you 🙂

broad-monitor-993

06/29/2022, 2:01 PM

also, I’m not entirely sure whether the optimizer = register_targets(...) line in the bayesopt dynamic will work as intended… I do believe this will unroll the dynamic graph correctly, but will have to confirm in practice

elegant-petabyte-32634

06/29/2022, 2:02 PM

ya I'll try it out!

broad-monitor-993

06/29/2022, 2:04 PM

great! please let me know if this works, would love to work in a canonical example in our tutorials.

👍 1

broad-monitor-993

06/29/2022, 2:07 PM

also happy to help debug if you can share a minimal reproducible example

broad-monitor-993

06/29/2022, 3:04 PM

also as an FYI we’re working on a Flyte-Ray integration, so when that happens RayTune will open up to Flyte users. however, I do think it’s still worth it to explore using Flyte exclusively for hyperparam optimization use cases

broad-monitor-993

06/29/2022, 3:39 PM

Hey @elegant-petabyte-32634 I got excited to try it out myself, so here’s a working example 🙂


pip install flytekit bayesian-optimization scipy==1.7.0

need to install specific version of scipy, as 1.8.0 causes issues

broad-monitor-993

06/29/2022, 3:39 PM

bayesopt_py.py

🔥 2

broad-monitor-993

06/29/2022, 3:54 PM

this works locally ^^ testing on a demo cluster now

elegant-petabyte-32634

06/30/2022, 10:19 AM

cool, got it working too!

🦜 1

broad-monitor-993

06/30/2022, 2:07 PM

great @elegant-petabyte-32634! let me know if you have any other questions on this front… would love to know how this works out for your use case

elegant-petabyte-32634

06/30/2022, 2:09 PM

yea will do! one small issue right now: if the trainings take different amounts of time, we're always waiting for all of them to complete (vs there always being N workers up that just fetch more work when theyre done)

broad-monitor-993

06/30/2022, 7:43 PM

Right, that’s definitely a limitation of this approach.

"there always being N workers up that just fetch more work when theyre done"

The pure Flyte execution model won’t allow for this, hence the integrations with Spark (and Ray, [coming soon]). With those, you can just wrap everything in a single @task and use the underlying Spark/Ray cluster to distribute the computation, while having access to all of the state in the hyperopt routine.

"if the trainings take different amounts of time"

What’s the min, max, and mean runtime of each trial in your case? i.e. are they on the order of minutes, hours, days, (or 😖 weeks)? The benefit of the pure Flyte approach is that all trials are subject to Flyte’s data lineage tracking, cache-able (with @task(cache=True, …)), and recoverable under the Flyte system.
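As a rough analogy for what caching buys in this setting, here is a plain-Python sketch in which a trial that has already run with the same inputs is served from a cache instead of being recomputed. It uses functools.lru_cache purely as an in-process stand-in; it is not how Flyte stores cached outputs, and the train function and its parameters are hypothetical.

```python
import functools

# Counts how many times the "expensive" function body actually runs.
calls = {"count": 0}


@functools.lru_cache(maxsize=None)
def train(learning_rate: float, epochs: int) -> float:
    # Hypothetical stand-in for an expensive training run.
    calls["count"] += 1
    return -((learning_rate - 0.1) ** 2) + 0.001 * epochs


# First session: three trials are executed.
first = [train(lr, 5) for lr in (0.01, 0.1, 0.3)]

# Later session: the same three trials plus one new one.
# Only the new trial executes; the others are cache hits.
second = [train(lr, 5) for lr in (0.01, 0.1, 0.3, 0.5)]

print(calls["count"])
```

The same idea is what makes it cheap to re-run an optimization with more iterations: trials whose inputs have not changed do not need to be trained again.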

broad-monitor-993

06/30/2022, 7:52 PM

@freezing-airport-6809 perhaps @eager would help in this case, where there’s a central eager workflow that asynchronously spins up workers at any given time per trial, and when trials complete the hyperparam sampler updates and samples parameters and spins up another trial task.
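The pattern described here, a central loop that keeps a fixed number of trials in flight and starts a new one as soon as any finishes, can be sketched with asyncio from the standard library. This is not the @eager API (which did not exist at the time of this thread); run_trial, the uniform sampler, and the toy objective are all hypothetical stand-ins.

```python
import asyncio
import random
from typing import Dict, List, Tuple


async def run_trial(params: Dict[str, float]) -> float:
    # Stand-in for one trial; the sleep varies to mimic uneven runtimes.
    await asyncio.sleep(0.01 * abs(params["x"]))
    return -((params["x"] - 2.0) ** 2)


async def optimize(
    n_trials: int = 10, n_workers: int = 3
) -> List[Tuple[Dict[str, float], float]]:
    rng = random.Random(0)
    history: List[Tuple[Dict[str, float], float]] = []
    in_flight: Dict[asyncio.Task, Dict[str, float]] = {}
    started = 0

    def launch() -> None:
        nonlocal started
        # A real sampler would condition on `history`; this samples uniformly.
        params = {"x": rng.uniform(-5.0, 5.0)}
        in_flight[asyncio.create_task(run_trial(params))] = params
        started += 1

    # Fill the initial worker slots.
    while started < min(n_workers, n_trials):
        launch()

    while in_flight:
        done, _ = await asyncio.wait(
            set(in_flight), return_when=asyncio.FIRST_COMPLETED
        )
        for task in done:
            history.append((in_flight.pop(task), task.result()))
            if started < n_trials:
                launch()  # refill the freed slot right away
    return history


history = asyncio.run(optimize())
print(len(history))
```

Unlike the batch loop, a slow trial here only occupies its own slot; the other slots keep picking up new work.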

freezing-airport-6809

06/30/2022, 10:07 PM

yes, @eager, when we build it, should allow for this. @delightful-greece-6207 is actually working on a prototype in his company

freezing-airport-6809

06/30/2022, 10:07 PM

cc @elegant-petabyte-32634 / @delightful-greece-6207 maybe you folks can get together to try it out?

delightful-greece-6207

07/01/2022, 9:19 AM

Hi, I'd be happy to have a chat about this. We adopted an approach in which a central "master"-task starts, monitors and terminates trial-workflows as necessary. We use optuna as our hyperopt library but I imagine other choices would work just as well.

elegant-petabyte-32634

07/01/2022, 9:38 AM

@broad-monitor-993 the model i was training right now takes about an hour (plus minus a couple minutes), i fixed the number of epochs (keeping only the best tho) so they took all around the same time, but making number of epochs another hyperparameter would be nice, and some of our other models take days to train

the caching is indeed very cool, i liked that i could train for 5 hyperparameter optimization iterations, then come back later and see how it did to maybe do more iterations while keeping the progress of the first 5

eager sounds interesting, is that already on a branch somewhere?

broad-monitor-993

07/01/2022, 1:25 PM

Awesome @delightful-greece-6207! Will you be free some time next week (Tue or after)? Would love to learn the approach you describe. @elegant-petabyte-32634 would you be interested in joining? We don’t have an implementation of eager yet, but it sounds like @delightful-greece-6207’s solution is an early shot at something like it.

elegant-petabyte-32634

07/01/2022, 1:26 PM

yea sure! although I probably won't be working much on this, have a lot of other stuff on my plate that has higher prio unfortunately

broad-monitor-993

07/01/2022, 1:31 PM

@delightful-greece-6207 @elegant-petabyte-32634 what time zones are y’all at? Does Tue 7/5 11AM EST work for you?

elegant-petabyte-32634

07/01/2022, 1:33 PM

works for me unless something at my job comes up

delightful-greece-6207

07/01/2022, 1:33 PM

that should work for me as well

elegant-petabyte-32634

07/01/2022, 1:33 PM

im in the UK, but probably closer to being on US-east timezone 😄

broad-monitor-993

07/01/2022, 1:41 PM

great! just sent invite

freezing-airport-6809

07/01/2022, 1:58 PM

Nice

broad-monitor-993

07/05/2022, 3:02 PM

hey @delightful-greece-6207 friendly ping: https://meet.google.com/tne-dhjf-nfs

broad-monitor-993

07/05/2022, 4:00 PM

oh, btw @delightful-greece-6207 I forgot to ask: was there a particular reason y’all decided not to try out RayTune for hyperopt?
