# ask-the-community
a
Hello, what's the difference between `flytectl demo start` and `flytectl sandbox start`? For context, I am NOT using Docker Desktop. I am using Docker Engine + Colima, so my Docker host is at a different path. To change it, I ran the following commands, as per @Marti Jorda Roca's suggestion:
export DOCKER_HOST= .......
flytectl config init
When I ran `flytectl sandbox start`, it worked perfectly fine. But when I ran `flytectl demo start` instead, my workflow failed. Any idea what's happening under the hood? I also tried to use my own Docker image when running `flytectl sandbox start`, but it turns out it did not spin up a local registry. Any workaround?
y
we really need to replace the old command. you should use `demo start`. @jeev should we rename?
`sandbox start` brings up a version of flyte where all the backend components are deployed separately. we realized a while back that for most production installations that is not necessary, so we combined them.
also the `demo start` ux is better
j
@Albert Wibowo what was the failure in the workflow?
a
Well, it just keeps running forever. When I looked at the Kubernetes logs, the only thing I saw was:
tar: Removing leading `/' from member names
From time to time, it fails with SIGKILL (9), which usually happens because the machine is running out of memory. But I used the same machine when running `sandbox start`.
When the workflow failed, this was the error:
CalledProcessError: Command '['pyflyte-execute', '--inputs',
's3://my-s3-bucket/metadata/propeller/flytesnacks-development-ffe430eb7d4f34ccba6f/n0/data/inputs.pb',
'--output-prefix',
's3://my-s3-bucket/metadata/propeller/flytesnacks-development-ffe430eb7d4f34ccba6f/n0/data/0',
'--raw-output-data-prefix', 's3://my-s3-bucket/data/5h/ffe430eb7d4f34ccba6f-n0-0',
'--checkpoint-path', 's3://my-s3-bucket/data/5h/ffe430eb7d4f34ccba6f-n0-0/_flytecheckpoints',
'--prev-checkpoint', '""', '--dynamic-addl-distro',
's3://my-s3-bucket/flytesnacks/development/LUGCKJRICUAR26UPBIJ7DJZVAE======/script_mode.tar.gz',
'--dynamic-dest-dir', '/root', '--resolver',
'flytekit.core.python_auto_container.default_task_resolver', '--',
'task-module', 'workflow', 'task-name', 'get_data']' died with <Signals.SIGKILL: 9>.
j
can you give docker more memory?
a
I can try. I will be back.
@jeev, it does not work. I still got the same error. I gave docker 12 GB of memory with 6 CPUs.
j
is the memory usage unexpected?
@Albert Wibowo can you describe the failed pod and see what the resource requests/limits were?
it might just be getting oomkilled because it was exceeding the task's limit
a
Yeah, it was unexpected because my workflow is quite simple. It just loads a dataset from scikit-learn and then visualises the data using Flyte Deck. The weird thing is that if I use Docker Desktop instead, with the same CPUs and memory dedicated to Docker, it works just fine. Also, I am new to Docker + Kubernetes. How do I describe the failed pods?
So, in summary:
What works for me
• Docker Engine + Colima + flytectl sandbox start
• Docker Desktop + flytectl demo start
What does not work for me
• Docker Engine + Colima + flytectl demo start
In all of these cases, I used the same workflow, CPUs, and memory.
j
`kubectl get pods -A` to list all the pods
`kubectl describe pod <podname> -n <namespace>` to describe it
thats pretty interesting though. why would a task come up and then get oomkilled?
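For anyone who would rather not read through every `kubectl describe` output by hand, here is a minimal sketch that scans the JSON form of the pod list for containers terminated with reason `OOMKilled`. The helper names (`load_pods`, `oom_killed`) and the sample data are made up for illustration; only the `kubectl get pods -A -o json` command and the pod status fields come from Kubernetes itself.

```python
import json
import subprocess


def load_pods():
    """Return the parsed output of `kubectl get pods -A -o json`."""
    out = subprocess.run(
        ["kubectl", "get", "pods", "-A", "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def oom_killed(pod_list):
    """Yield (namespace, pod, container) for containers terminated as OOMKilled."""
    for pod in pod_list.get("items", []):
        meta = pod["metadata"]
        for cs in pod.get("status", {}).get("containerStatuses", []):
            # Check the current state and the previous one (set after a restart).
            for key in ("state", "lastState"):
                term = cs.get(key, {}).get("terminated")
                if term and term.get("reason") == "OOMKilled":
                    yield meta["namespace"], meta["name"], cs["name"]
                    break


# Hypothetical sample shaped like kubectl's JSON output (names are invented).
sample = {
    "items": [
        {
            "metadata": {"namespace": "flytesnacks-development", "name": "example-n0-0"},
            "status": {"containerStatuses": [
                {"name": "main", "state": {"terminated": {"reason": "OOMKilled", "exitCode": 137}}},
            ]},
        },
        {
            "metadata": {"namespace": "flyte", "name": "example-backend"},
            "status": {"containerStatuses": [
                {"name": "backend", "state": {"running": {}}, "lastState": {}},
            ]},
        },
    ]
}

hits = list(oom_killed(sample))
```

Against a live cluster, the call would be `list(oom_killed(load_pods()))`; any pod it reports is the one worth passing to `kubectl describe pod`.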
a
You are right, one of the pods is getting OOMKilled. And it looks like the pod is not picking up the new CPU and memory sizes.
Exit Code:  1
   Started:   Thu, 15 Jun 2023 17:12:07 +0100
   Finished:   Thu, 15 Jun 2023 17:12:08 +0100
  Ready:     False
  Restart Count: 0
  Limits:
   cpu:   2
   memory: 200Mi
  Requests:
   cpu:   2
   memory: 200Mi
  Environment:
j
200Mi seems like very little
is it the same in `flytectl sandbox start`?
might be a config issue
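To put the `200Mi` figure in perspective, here is a rough sketch that converts Kubernetes-style quantity strings into plain numbers. `parse_quantity` is a hypothetical helper, not part of kubectl or Flyte, and it covers only the common suffixes (no exponent notation).

```python
from decimal import Decimal

# Binary and decimal suffixes from the Kubernetes quantity notation.
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
    "m": Decimal("0.001"),  # milli, as in cpu: 500m
}


def parse_quantity(text):
    """Convert a quantity such as '200Mi', '1Gi', '500m' or '2' to a Decimal."""
    text = str(text).strip()
    # Try two-character suffixes before one-character ones.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * _SUFFIXES[suffix]
    return Decimal(text)


pod_memory = parse_quantity("200Mi")             # limit seen on the failing pod
fraction_of_gib = pod_memory / parse_quantity("1Gi")
pod_cpu = parse_quantity("2")                    # cpu limit seen on the failing pod
```

Here `pod_memory` comes out as 209715200 bytes, a little under a fifth of a gibibyte, which a Python task importing scikit-learn could plausibly exceed.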
y
isn’t this the default that’s set in admin?
j
ah i always forget this
try writing this to `~/.flyte/sandbox/config.yaml` and then run `flytectl demo reload`:
task_resources:
  defaults:
    cpu: "0"
    memory: "0"
  limits:
    cpu: "10"
a
alright will try it
I don't think I can reload when I edit the config file directly:
albertwibowo@Alberts-MacBook-Pro test-aw-flyte % flytectl demo reload
Error: 
strict mode is on but received keys [map[task_resources:{}]] to decode with no config assigned to receive them: failed strict mode check
ERRO[0000] 
strict mode is on but received keys [map[task_resources:{}]] to decode with no config assigned to receive them: failed strict mode check src="main.go:13"
j
where did you write the file to?
a
I tried writing it directly to both config.yaml and config-sandbox.yaml. Both gave me the same error when running `flytectl demo reload`. One other thing I tried was these steps:
1. flytectl demo start
2. modify config-sandbox.yaml
3. export SANDBOX_CONFIG = ~/.../config-sandbox.yaml
4. run my workflow
This also did not work.
y
we really need to get rid of the defaults in there.
can you try adding a project/domain level override?
domain: development
project: flytesnacks
defaults:
  cpu: "2"
  memory: "1Gi"
limits:
  cpu: "2"
  memory: "2Gi"
write that to a file, then run:
flytectl update task-resource-attribute --attrFile yourfile.yaml
a
Yup, sure, I'll try it. Thank you so much for being patient.
Yes, it finally worked! Thanks @Yee @jeev. I guess I have to do this every time then? How come, when I use Docker Desktop to increase the CPU and memory, Flyte is able to update the config automatically?
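A toy model of why the override helped, assuming a project/domain attribute simply shadows the platform-wide default when both are set. This is a simplification for illustration and not Flyte's actual resolution code; the "platform" values are the ones observed on the failing pod, and treating them as platform defaults is itself an assumption.

```python
from collections import ChainMap

# Requests observed on the failing pod earlier in this thread.
platform_defaults = {"cpu": "2", "memory": "200Mi"}

# The `defaults` block of the project/domain override suggested above.
project_domain_override = {"cpu": "2", "memory": "1Gi"}


def effective_defaults(override, platform):
    """Earlier mappings win: the override shadows the platform-wide value."""
    return dict(ChainMap(override, platform))


before = effective_defaults({}, platform_defaults)
after = effective_defaults(project_domain_override, platform_defaults)
```

With no override, `before["memory"]` stays at `200Mi`; once the attribute file is applied, `after["memory"]` is `1Gi`, which matches the behaviour reported here.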
y
@Eduardo Apolinario (eapolinario) can we fix this?
We just need to extract out the relevant parts of the pr and merge it
e
Sure thing.
Still not clear why this works in the case of the old sandbox. Do we have other overrides in that case?
j
need to test
`flytectl sandbox start` is running with this config:
task_resources:
  defaults:
    cpu: 500m
    memory: 1Gi
  limits:
    cpu: 2
    memory: 4Gi
    gpu: 5
that would explain
haha looked at sandbox-lite and realized that was the wrong one too!!
`flytectl sandbox start` is using this build.
@Albert Wibowo: you can just run with this:
> less -F ~/.flyte/sandbox/config.yaml
task_resources:
  defaults:
    cpu: 500m
    memory: 1Gi
  limits:
    cpu: 2
    memory: 4Gi
then:
flytectl demo reload
that will persist for as long as your config is at that path
those are probably more reasonable defaults for `flytectl demo start` too though. wdyt @Yee @Eduardo Apolinario (eapolinario)