adorable-engineer-57446
08/31/2023, 12:01 PM
freezing-airport-6809
freezing-airport-6809
cool-lifeguard-49380
08/31/2023, 2:16 PM
freezing-airport-6809
freezing-airport-6809
adorable-engineer-57446
08/31/2023, 2:30 PM
adorable-engineer-57446
08/31/2023, 2:31 PM
freezing-airport-6809
freezing-airport-6809
cool-lifeguard-49380
08/31/2023, 2:36 PM
adorable-engineer-57446
08/31/2023, 2:37 PM
pyflyte run --remote -p workflow -d dev --service-account default mnist.py horovod_training_wf
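For anyone scripting this submission, here is a minimal sketch that assembles the same `pyflyte run` invocation as an argument list. It only uses the flags that appear in the command above; the helper name `pyflyte_run_argv` is hypothetical, and the sketch builds the command without executing it.

```python
import shlex
from typing import List


def pyflyte_run_argv(project: str, domain: str, service_account: str,
                     script: str, workflow: str) -> List[str]:
    """Build the argv for a remote `pyflyte run`, mirroring the command in the thread."""
    return [
        "pyflyte", "run", "--remote",
        "-p", project,                         # Flyte project
        "-d", domain,                          # Flyte domain
        "--service-account", service_account,  # Kubernetes service account for the execution
        script,
        workflow,
    ]


argv = pyflyte_run_argv("workflow", "dev", "default", "mnist.py", "horovod_training_wf")
print(shlex.join(argv))
```

Passing the list to `subprocess.run(argv, check=True)` would launch it, assuming `pyflyte` is on the PATH and configured for the remote cluster.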
cool-lifeguard-49380
08/31/2023, 2:38 PM
adorable-engineer-57446
08/31/2023, 2:38 PM
--service-account flag. The service account to use is default.
adorable-engineer-57446
08/31/2023, 2:38 PM
cool-lifeguard-49380
08/31/2023, 2:38 PM
adorable-engineer-57446
08/31/2023, 2:39 PM
cool-lifeguard-49380
08/31/2023, 2:39 PM
cool-lifeguard-49380
08/31/2023, 2:40 PM
freezing-airport-6809
cool-lifeguard-49380
08/31/2023, 2:41 PM
cool-lifeguard-49380
08/31/2023, 2:41 PM
cool-lifeguard-49380
08/31/2023, 2:42 PM
SparkApplication?
adorable-engineer-57446
08/31/2023, 2:42 PM
pyflyte
adorable-engineer-57446
08/31/2023, 2:43 PM
cool-lifeguard-49380
08/31/2023, 2:43 PM
kubectl -n <namespace name> get serviceaccount default -o yaml
cool-lifeguard-49380
08/31/2023, 2:43 PM
adorable-engineer-57446
08/31/2023, 2:45 PM
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    iam.gke.io/gcp-service-account: flyte-workflow-dev-sa@<redacted>
  creationTimestamp: "2023-08-03T07:15:09Z"
  name: default
  namespace: dev
  resourceVersion: "676089"
  uid: dd7f67ce-2530-4c59-883d-2e08bebeebbb
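To check this kind of manifest in a script, a minimal sketch follows that reads the `iam.gke.io/gcp-service-account` annotation from a ServiceAccount manifest. It assumes the manifest was fetched as JSON (for example with `-o json` instead of `-o yaml`) so that only the standard library is needed; the function name `bound_gcp_service_account` is hypothetical, and the sample input copies the values posted above, including the redaction.

```python
import json
from typing import Optional

# Annotation key shown in the manifest above.
GKE_SA_ANNOTATION = "iam.gke.io/gcp-service-account"


def bound_gcp_service_account(manifest: dict) -> Optional[str]:
    """Return the GCP service account named in the annotation, or None if it is absent."""
    annotations = manifest.get("metadata", {}).get("annotations") or {}
    return annotations.get(GKE_SA_ANNOTATION)


# Sample input mirroring the manifest from the thread.
sample = json.loads("""
{
  "apiVersion": "v1",
  "kind": "ServiceAccount",
  "metadata": {
    "annotations": {
      "iam.gke.io/gcp-service-account": "flyte-workflow-dev-sa@<redacted>"
    },
    "name": "default",
    "namespace": "dev"
  }
}
""")

print(bound_gcp_service_account(sample))
```

A `None` result would mean the Kubernetes service account carries no such annotation in that namespace.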
cool-lifeguard-49380
08/31/2023, 2:45 PM
adorable-engineer-57446
08/31/2023, 2:45 PM
default service account is the one that we want to use
cool-lifeguard-49380
08/31/2023, 2:45 PM
adorable-engineer-57446
08/31/2023, 2:45 PM
adorable-engineer-57446
08/31/2023, 2:45 PM
adorable-engineer-57446
08/31/2023, 2:45 PM
cool-lifeguard-49380
08/31/2023, 2:45 PM
cool-lifeguard-49380
08/31/2023, 2:46 PM
cool-lifeguard-49380
08/31/2023, 2:46 PM
cool-lifeguard-49380
08/31/2023, 2:46 PM
adorable-engineer-57446
08/31/2023, 2:46 PM
cool-lifeguard-49380
08/31/2023, 2:46 PM
freezing-airport-6809
cool-lifeguard-49380
08/31/2023, 2:47 PM
freezing-airport-6809
freezing-airport-6809
adorable-engineer-57446
08/31/2023, 2:50 PM
adorable-engineer-57446
08/31/2023, 2:50 PM
@task(
    container_image="europe-west4-docker.pkg.dev/<redacted>/flyte/mpi-mnist:latest",
    task_config=MPIJob(
        launcher=Launcher(
            replicas=1,
        ),
        worker=Worker(
            replicas=2,
        ),
    ),
    retries=3,
    requests=Resources(cpu="1", mem="1000Mi"),
    limits=Resources(cpu="2", mem="4000Mi")
)
adorable-engineer-57446
08/31/2023, 2:51 PM
adorable-engineer-57446
08/31/2023, 2:54 PM
freezing-airport-6809
cool-lifeguard-49380
08/31/2023, 3:02 PM
cool-lifeguard-49380
08/31/2023, 3:04 PM
cool-lifeguard-49380
08/31/2023, 3:04 PM
@Dimss consider trying the v2 controller, which doesn’t depend on ServiceAccounts 🙂
cool-lifeguard-49380
08/31/2023, 3:04 PM
adorable-engineer-57446
08/31/2023, 3:08 PM
cool-lifeguard-49380
08/31/2023, 3:09 PM
adorable-engineer-57446
08/31/2023, 3:10 PM
https://github.com/kubeflow/training-operator
adorable-engineer-57446
08/31/2023, 3:11 PM
adorable-engineer-57446
08/31/2023, 3:11 PM
adorable-engineer-57446
08/31/2023, 3:12 PM
cool-lifeguard-49380
08/31/2023, 3:13 PM
cool-lifeguard-49380
08/31/2023, 3:13 PM
freezing-airport-6809
freezing-airport-6809
freezing-airport-6809
elegant-australia-91422
08/31/2023, 4:11 PM
limited-raincoat-94253
08/31/2023, 4:25 PM
cool-lifeguard-49380
08/31/2023, 5:28 PM
limited-raincoat-94253
08/31/2023, 5:38 PM
adorable-engineer-57446
09/01/2023, 2:09 PM
adorable-engineer-57446
09/01/2023, 2:10 PM
kubectl exec. It then connects to the sidecar container instead of the main container where the worker is located
freezing-airport-6809
freezing-airport-6809
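On the `kubectl exec` point: when a pod has more than one container and no container is named, `kubectl exec` falls back to a default container, which may be the sidecar. The `-c` flag selects a container explicitly. A minimal sketch of building such a call is below; the pod and container names are hypothetical placeholders, and whether the launcher's own exec call can be changed this way in this setup is an assumption, not something confirmed in the thread.

```python
from typing import List


def kubectl_exec_argv(namespace: str, pod: str, container: str,
                      command: List[str]) -> List[str]:
    """Build a `kubectl exec` argv that targets one named container in the pod."""
    return [
        "kubectl", "exec",
        "-n", namespace,
        pod,
        "-c", container,  # explicit container, so no default container is picked
        "--",             # everything after this runs inside the container
        *command,
    ]


# Hypothetical names, for illustration only.
argv = kubectl_exec_argv("dev", "example-worker-0", "main", ["hostname"])
print(" ".join(argv))
```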
adorable-engineer-57446
09/04/2023, 7:56 AM
adorable-engineer-57446
09/04/2023, 7:56 AM
freezing-airport-6809
adorable-engineer-57446
09/05/2023, 7:38 AM