Robin Kahlow
06/29/2022, 11:06 AM

Niels Bantilan
06/29/2022, 1:30 PM
<some bayes opt library>.
One would have to use dynamic workflows to collect results and feed them back to the bayesopt sampler for subsequent trials.
Do you have a bayesopt library in mind?

Robin Kahlow
06/29/2022, 1:31 PM

Niels Bantilan
06/29/2022, 1:59 PM
```python
from typing import Dict, List

from bayes_opt import BayesianOptimization, UtilityFunction
from flytekit import dynamic, task


@task
def black_box_function(points: Dict) -> float:
    ...  # inner training loop here


@task
def suggest_points(
    optimizer: BayesianOptimization,
    utility: UtilityFunction,
    concurrency: int,
) -> List[Dict]:
    return [optimizer.suggest(utility) for _ in range(concurrency)]


@task
def register_targets(
    optimizer: BayesianOptimization,
    points: List[Dict],
    targets: List[float],
) -> BayesianOptimization:
    for point, target in zip(points, targets):
        optimizer.register(params=point, target=target)
    return optimizer


@dynamic
def concurrent_trials(points: List[Dict]) -> List[float]:
    # each call creates an independent task node, so trials run concurrently
    return [black_box_function(points=point) for point in points]


@dynamic
def bayesopt(n_iter: int = 5, concurrency: int = 3) -> Dict:
    optimizer = BayesianOptimization(...)
    utility = UtilityFunction(kind="ucb", kappa=2.5, xi=0.0)
    for _ in range(n_iter):
        points = suggest_points(optimizer=optimizer, utility=utility, concurrency=concurrency)
        targets = concurrent_trials(points=points)
        optimizer = register_targets(optimizer=optimizer, points=points, targets=targets)
    # return the point that maximized the target
    return optimizer.max
```
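To see the control flow that the dynamic workflow unrolls, here is a library-free sketch of the same suggest → evaluate → register loop. `RandomSampler` is a hypothetical stand-in for bayes_opt's `BayesianOptimization` (uniform-random suggestions instead of a Gaussian-process posterior), and the toy objective replaces the real inner training loop:

```python
# Library-free sketch of the suggest -> evaluate -> register loop.
# RandomSampler is a hypothetical stand-in for bayes_opt's sampler;
# the control flow matches the Flyte dynamic workflow above.
import random
from typing import Dict, List


class RandomSampler:
    """Suggests uniform-random points and records observed targets."""

    def __init__(self, bounds: Dict[str, tuple]):
        self.bounds = bounds
        self.history: List[tuple] = []

    def suggest(self) -> Dict[str, float]:
        return {k: random.uniform(lo, hi) for k, (lo, hi) in self.bounds.items()}

    def register(self, params: Dict[str, float], target: float) -> None:
        self.history.append((params, target))

    @property
    def max(self) -> Dict:
        params, target = max(self.history, key=lambda pt: pt[1])
        return {"params": params, "target": target}


def black_box_function(x: float) -> float:
    # toy objective with a known maximum at x = 2
    return -((x - 2.0) ** 2)


def bayesopt(n_iter: int = 5, concurrency: int = 3) -> Dict:
    sampler = RandomSampler(bounds={"x": (-10.0, 10.0)})
    for _ in range(n_iter):
        points = [sampler.suggest() for _ in range(concurrency)]   # suggest_points
        targets = [black_box_function(**p) for p in points]        # concurrent_trials
        for point, target in zip(points, targets):                 # register_targets
            sampler.register(params=point, target=target)
    return sampler.max


best = bayesopt()
```

In the Flyte version, each of those three steps becomes a task (or dynamic fan-out) so the trials in one batch execute as concurrent task nodes instead of a local loop.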
Passing the BayesianOptimization and UtilityFunction objects between tasks relies on the PythonPickle type, which Flyte falls back to for types it doesn't know how to natively handle.

Robin Kahlow
06/29/2022, 2:00 PM

Niels Bantilan
06/29/2022, 2:01 PM
One thing to confirm is whether the optimizer = register_targets(optimizer, points=points, targets=targets) line in the bayesopt dynamic will work as intended… I do believe this will unroll the dynamic graph correctly, but will have to confirm in practice.

Robin Kahlow
06/29/2022, 2:02 PM

Niels Bantilan
06/29/2022, 2:04 PM
pip install flytekit bayesian-optimization scipy==1.7.0
You need to pin scipy to a specific version, as 1.8.0 causes issues.

Robin Kahlow
06/30/2022, 10:19 AM

Niels Bantilan
06/30/2022, 2:07 PM

Robin Kahlow
06/30/2022, 2:09 PM

Niels Bantilan
06/30/2022, 7:43 PM
> there always being N workers up that just fetch more work when they're done
The pure Flyte execution model won't allow for this, hence the integrations with Spark (and Ray, coming soon). With those, you can just wrap everything in a single @task and use the underlying Spark/Ray cluster to distribute the computation, while having access to all of the state in the hyperopt routine.
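For intuition, the "N workers that fetch more work when they're done" model can be sketched locally with a thread pool standing in for the Spark/Ray cluster; `run_trial` and the trial ids are illustrative placeholders, not a real training run:

```python
# Sketch of N long-lived workers pulling trials from a shared pool.
# A ThreadPoolExecutor is a local stand-in for a Spark/Ray cluster.
from concurrent.futures import ThreadPoolExecutor, as_completed

N_WORKERS = 3


def run_trial(trial_id: int) -> float:
    # placeholder for one training run with sampled hyperparameters
    return -((trial_id - 4) ** 2)


trials = list(range(10))
results = {}
with ThreadPoolExecutor(max_workers=N_WORKERS) as pool:
    futures = {pool.submit(run_trial, t): t for t in trials}
    # as_completed yields in finish order, so a freed worker immediately
    # picks up the next pending trial even when runtimes differ
    for fut in as_completed(futures):
        results[futures[fut]] = fut.result()

best = max(results, key=results.get)
```

This is the scheduling behavior you get for free inside a single Spark/Ray-backed task, with the sampler state living in the driver process.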
> if the trainings take different amounts of time
What's the min, max, and mean runtime of each trial in your case? i.e. are they on the order of minutes, hours, days, (or 😖 weeks)? The benefit of the pure Flyte approach is that all trials are subject to Flyte's data lineage tracking, cache-able (with @task(cache=True, …)), and recoverable under the Flyte system. @eager would help in this case, where there's a central eager workflow that asynchronously spins up N workers at any given time per trial, and when x trials complete, the hyperparam sampler updates, samples x new parameters, and spins up another trial task.

Ketan (kumare3)
@eager, when we build it, should allow for this. @Sebastian Schulze is actually working on a prototype in his company.

Sebastian Schulze
07/01/2022, 9:19 AM

Robin Kahlow
07/01/2022, 9:38 AM

Niels Bantilan
07/01/2022, 1:25 PM
We don't have eager yet, but it sounds like @Sebastian Schulze's solution is an early shot at something like it.

Robin Kahlow
07/01/2022, 1:26 PM

Niels Bantilan
07/01/2022, 1:31 PM

Robin Kahlow
07/01/2022, 1:33 PM

Sebastian Schulze
07/01/2022, 1:33 PM

Robin Kahlow
07/01/2022, 1:33 PM

Niels Bantilan
07/01/2022, 1:41 PM

Ketan (kumare3)

Niels Bantilan
07/05/2022, 3:02 PM
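The eager-style pattern discussed in this thread (keep N trials in flight and, whenever one finishes, resample and launch a replacement) can be sketched with plain asyncio. All names here are illustrative placeholders, not Flyte or @eager APIs:

```python
# Eager-style loop: N trials in flight; each completion triggers a
# resample and a replacement launch, rather than waiting for a full batch.
import asyncio
import random

N_IN_FLIGHT = 3   # workers up at any given time
TOTAL_TRIALS = 9


def suggest() -> float:
    # stand-in for the hyperparameter sampler
    return random.uniform(-10.0, 10.0)


async def run_trial(x: float) -> float:
    # placeholder for an asynchronously launched training run
    await asyncio.sleep(0)
    return -((x - 2.0) ** 2)


async def eager_bayesopt() -> dict:
    history = []
    pending = {asyncio.create_task(run_trial(suggest())) for _ in range(N_IN_FLIGHT)}
    launched = N_IN_FLIGHT
    while pending:
        # wake up as soon as any single trial completes
        done, pending = await asyncio.wait(pending, return_when=asyncio.FIRST_COMPLETED)
        for fut in done:
            history.append(fut.result())
            if launched < TOTAL_TRIALS:
                # a slot freed up: resample and spin up another trial
                pending.add(asyncio.create_task(run_trial(suggest())))
                launched += 1
    return {"n_trials": len(history), "best": max(history)}


result = asyncio.run(eager_bayesopt())
```

Contrast this with the batched dynamic-workflow version earlier in the thread, which must wait for the slowest trial in each batch before the sampler can update.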