Padma Priya M
11/29/2022, 2:57 PM
from sklearn.datasets import load_breast_cancer
from flytekit import Resources, task, workflow
from flytekitplugins.ray import HeadNodeConfig, RayJobConfig, WorkerNodeConfig
from xgboost_ray import RayDMatrix, RayParams, train
import ray
from ray import tune

# ray.init()
# ray.init("auto", ignore_reinit_error=True)

# Ray cluster requested from Flyte: one head node plus the worker group below.
ray_config = RayJobConfig(
    head_node_config=HeadNodeConfig(ray_start_params={"log-color": "True"}),
    worker_node_config=[WorkerNodeConfig(group_name="ray-group", replicas=2)],
)

num_actors = 4
num_cpus_per_actor = 1

# Resources used by the xgboost_ray training actors inside each Tune trial.
ray_params = RayParams(num_actors=num_actors, cpus_per_actor=num_cpus_per_actor)


def train_model(config):
    train_x, train_y = load_breast_cancer(return_X_y=True)
    train_set = RayDMatrix(train_x, train_y)
    evals_result = {}
    bst = train(
        params=config,
        dtrain=train_set,
        evals_result=evals_result,
        evals=[(train_set, "train")],
        verbose_eval=False,
        ray_params=ray_params,
    )
    bst.save_model("model.xgb")


@task(task_config=ray_config, limits=Resources(mem="2000Mi", cpu="1"))
def train_model_task() -> dict:
    config = {
        "tree_method": "approx",
        "objective": "binary:logistic",
        "eval_metric": ["logloss", "error"],
        "eta": tune.loguniform(1e-4, 1e-1),
        "subsample": tune.uniform(0.5, 1.0),
        "max_depth": tune.randint(1, 9),
    }
    analysis = tune.run(
        train_model,
        config=config,
        metric="train-error",
        mode="min",
        num_samples=4,
        resources_per_trial=ray_params.get_tune_resources(),
    )
    return analysis.best_config


@workflow
def train_model_wf() -> dict:
    return train_model_task()
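A quick local sanity check, assuming ray and xgboost_ray are installed in the local environment and the machine has enough CPUs for a trial:
# Calling the @workflow function directly executes the task in-process via
# flytekit's local execution, which is handy before registering on a cluster.
if __name__ == "__main__":
    print(train_model_wf())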
Ketan (kumare3)
Samhita Alla
@task(task_config=ray_config, limits=Resources(mem="2000Mi", cpu="1", ephemeral_storage="500Mi"))
Padma Priya M
12/01/2022, 10:27 AM
from sklearn.datasets import load_breast_cancer
from flytekit import Resources, task, workflow
from flytekitplugins.ray import HeadNodeConfig, RayJobConfig, WorkerNodeConfig
from xgboost_ray import RayDMatrix, RayParams, train
import ray
from ray import tune

# ray.shutdown()
# ray.init()
# ray.init("auto", ignore_reinit_error=True)

# Ray cluster requested from Flyte: one head node plus three workers.
ray_config = RayJobConfig(
    head_node_config=HeadNodeConfig(ray_start_params={"log-color": "True"}),
    worker_node_config=[WorkerNodeConfig(group_name="ray-group", replicas=3)],
)

num_actors = 2
num_cpus_per_actor = 1

# Resources used by the xgboost_ray training actors inside each Tune trial.
ray_params = RayParams(num_actors=num_actors, cpus_per_actor=num_cpus_per_actor)


def train_model(config):
    train_x, train_y = load_breast_cancer(return_X_y=True)
    train_set = RayDMatrix(train_x, train_y)
    evals_result = {}
    bst = train(
        params=config,
        dtrain=train_set,
        evals_result=evals_result,
        evals=[(train_set, "train")],
        verbose_eval=False,
        ray_params=ray_params,
    )
    bst.save_model("model.xgb")


# @task(limits=Resources(mem="2000Mi", cpu="1"))
@task(task_config=ray_config, limits=Resources(mem="3000Mi", cpu="1", ephemeral_storage="3000Mi"))
def train_model_task() -> dict:
    config = {
        "tree_method": "approx",
        "objective": "binary:logistic",
        "eval_metric": ["logloss", "error"],
        "eta": tune.loguniform(1e-4, 1e-1),
        "subsample": tune.uniform(0.5, 1.0),
        "max_depth": tune.randint(1, 9),
    }
    analysis = tune.run(
        train_model,
        config=config,
        metric="train-error",
        mode="min",
        num_samples=4,
        max_concurrent_trials=1,
        resources_per_trial=ray_params.get_tune_resources(),
    )
    return analysis.best_config


@workflow
def train_model_wf() -> dict:
    return train_model_task()
Still getting this error even when we specify an ephemeral_storage value. Do you have any suggested limits for CPU and memory?
Samhita Alla
Padma Priya M
12/01/2022, 1:38 PM
Samhita Alla
kubectl -n flyte edit cm flyte-admin-base-config
is the command, but I’m not very sure. Let me know if this doesn’t work.
Padma Priya M
12/01/2022, 1:47 PM
Samhita Alla
Padma Priya M
12/02/2022, 4:46 AM
Samhita Alla
Padma Priya M
12/02/2022, 5:05 AM
@task(task_config=ray_config, limits=Resources(mem="5000Mi", cpu="1", ephemeral_storage="3000Mi"))
Samhita Alla
get_tune_resources().
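A minimal sketch of where that per-trial CPU number comes from, assuming xgboost_ray's RayParams as used in the snippets above; the Tune reservation is derived from RayParams rather than from the task's requests/limits:
from xgboost_ray import RayParams

# With num_actors=2 and cpus_per_actor=1, each Tune trial reserves roughly
# 2 CPUs, which would match the "requested cpu is 2" message seen here.
ray_params = RayParams(num_actors=2, cpus_per_actor=1)
print(ray_params.get_tune_resources())  # inspect the per-trial reservation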
Padma Priya M
12/05/2022, 10:31 AM
ray_config = RayJobConfig(
    head_node_config=HeadNodeConfig(ray_start_params={"log-color": "True"}),
    worker_node_config=[WorkerNodeConfig(group_name="ray-group", replicas=3)],
)
Do we have any way to specify the number of CPUs in the Ray cluster config? Something like this?
ray_config = RayJobConfig(
    head_node_config=HeadNodeConfig(ray_start_params={"log-color": "True"}),
    worker_node_config=[WorkerNodeConfig(group_name="ray-group", replicas=3)],
    num_cpus=4,
)
Samhita Alla
RayParams
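A sketch of the split being suggested here, assuming per-actor CPUs are set on the xgboost_ray side via RayParams while the cluster's overall capacity comes from the worker replica count and the pod resources on the Flyte task (rather than a num_cpus field on RayJobConfig):
from flytekitplugins.ray import HeadNodeConfig, RayJobConfig, WorkerNodeConfig
from xgboost_ray import RayParams

# Cluster shape comes from the worker group; per-actor CPUs from RayParams.
ray_config = RayJobConfig(
    head_node_config=HeadNodeConfig(ray_start_params={"log-color": "True"}),
    worker_node_config=[WorkerNodeConfig(group_name="ray-group", replicas=3)],
)
ray_params = RayParams(num_actors=2, cpus_per_actor=2)  # CPUs per training actor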
Padma Priya M
12/05/2022, 1:31 PM
cpus_per_actor. Is there any config to be changed to increase the CPU of the Ray cluster as a whole? Because even when I increased cpus_per_actor, the requested CPU is still 2, and it shows the warning that there are only 2 CPUs in the cluster.
Samhita Alla
init(), but in this case, since Flyte initializes the cluster, how can a user modify those values?
Kevin Su
12/05/2022, 6:50 PM
Set limit and request in the @task. Like https://github.com/flyteorg/flytesnacks/blob/a3b97943563cfc952b5683525763578685a93[…]694/cookbook/integrations/kubernetes/ray_example/ray_example.py
Padma Priya M
12/06/2022, 4:31 AM
@task(task_config=ray_config, requests=Resources(mem="5000Mi", cpu="5", ephemeral_storage="1000Mi"), limits=Resources(mem="7000Mi", cpu="9", ephemeral_storage="2000Mi"))
I have requested 5 CPUs, but when it executes, it shows the requested CPUs as 2 only.
Samhita Alla
limits?
Padma Priya M
12/06/2022, 4:48 AM
Samhita Alla
requests isn’t assigning the requested resources to the cluster.
Padma Priya M
12/06/2022, 1:59 PM
Samhita Alla
Kevin Su
12/10/2022, 3:00 AM
Padma Priya M
12/12/2022, 5:12 AM
@task(task_config=ray_config, requests=Resources(mem="5000Mi", cpu="5"), limits=Resources(mem="7000Mi", cpu="9"))
These are the requested resources.
Kevin Su
12/12/2022, 8:22 AM
Padma Priya M
12/12/2022, 9:25 AM
Kevin Su
12/13/2022, 8:45 PM