# ask-the-community
j
hey everyone, i'm trying to run ray on the flyte sandbox deployment and am running into the following error. it looks like it's basically an OOM error, but i'm not sure how to increase the default size of /dev/shm. what's the best approach to bumping this default in the sandbox deployment?
/usr/local/lib/python3.8/dist-packages/paramiko/transport.py:236: CryptographyDeprecationWarning: Blowfish has been deprecated
  "class": algorithms.Blowfish,
2022-09-14 18:10:43,190	WARNING services.py:1882 -- WARNING: The object store is using /tmp instead of /dev/shm because /dev/shm has only 67108864 bytes available. This will harm performance! You may be able to free up space by deleting files in /dev/shm. If you are inside a Docker container, you can increase /dev/shm size by passing '--shm-size=0.10gb' to 'docker run' (or add it to the run_options list in a Ray cluster config). Make sure to set this to more than 30% of available RAM.
2022-09-14 18:10:46,611	INFO worker.py:1509 -- Started a local Ray instance. View the dashboard at http://127.0.0.1:8265
[2022-09-14 18:10:57,864 E 1 1] core_worker.cc:149: Failed to register worker 01000000ffffffffffffffffffffffffffffffffffffffffffffffff to Raylet. IOError: [RayletClient] Unable to register worker with raylet. No such file or directory
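(for reference: the 67108864 bytes in the warning is 64 MiB, which is Docker's default /dev/shm size. a minimal sketch for checking what a container actually gets, using only the Python standard library; the helper names are made up for illustration and are not part of Ray or Flyte)

```python
import shutil

# Value reported in the Ray warning above.
REPORTED_SHM_BYTES = 67108864


def bytes_to_mib(n):
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return n / (1024 * 1024)


def shm_total_bytes(path="/dev/shm"):
    """Return the total size in bytes of the filesystem mounted at `path`."""
    return shutil.disk_usage(path).total


if __name__ == "__main__":
    print(f"reported in the log: {bytes_to_mib(REPORTED_SHM_BYTES):.0f} MiB")
    print(f"this machine: {bytes_to_mib(shm_total_bytes()):.0f} MiB")
```

running this inside the task container would show whether a larger shared-memory size has actually taken effect.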
k
Running Ray on the sandbox may require too much memory
But you should definitely increase docker memory
j
oh okay, so this should be done via the base Dockerfile instead of via flytectl?
oh or do you mean via Docker Desktop?
k
Ya
j
is there a way to do it that addresses the error here:
Error: rpc error: code = InvalidArgument desc = Requested MEMORY default [3500Mi] is greater than current limit set in the platform configuration [1Gi]. Please contact Flyte Admins to change these limits or consult the configuration
i'm not sure how to interact with flyteadmin, is this doable in the sandbox?
my guess is that i should be able to bump the memory in config-sandbox.yaml, but i'm not seeing a full list of parameters for the config anywhere
e
@James Evers,
$ kubectl -n flyte edit cm flyte-admin-base-config
...
  task_resource_defaults.yaml: |
    task_resources:
      defaults:
        cpu: 100m
        memory: 500Mi
        storage: 500Mi
      limits:
        cpu: 2
        gpu: 1
        memory: 1Gi  # <- change this value
        storage: 20Mi
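(to pick a new limit, it helps to compare the two quantities from the error message directly. a rough sketch, assuming Kubernetes-style memory suffixes; `parse_memory` and `fits_within_limit` are hypothetical helpers for illustration, not Flyte or Kubernetes APIs, and the 4Gi value is only an example)

```python
import re

# Binary (Ki, Mi, ...) and decimal (k, M, ...) suffixes used for memory quantities.
# CPU-style suffixes such as "m" are deliberately not handled here.
_UNITS = {
    "": 1,
    "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4,
    "k": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4,
}


def parse_memory(quantity):
    """Convert a quantity string like '3500Mi' or '1Gi' to a byte count."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([A-Za-z]*)", quantity.strip())
    if match is None or match.group(2) not in _UNITS:
        raise ValueError(f"unsupported memory quantity: {quantity!r}")
    return int(float(match.group(1)) * _UNITS[match.group(2)])


def fits_within_limit(requested, limit):
    """Return True if the requested memory does not exceed the limit."""
    return parse_memory(requested) <= parse_memory(limit)


if __name__ == "__main__":
    # Values from the error message: the request exceeds the 1Gi limit.
    print(fits_within_limit("3500Mi", "1Gi"))
    # An example limit large enough for the 3500Mi request.
    print(fits_within_limit("3500Mi", "4Gi"))
```

any limit at or above 3500Mi would clear this particular check, as long as the sandbox's Docker memory allocation can actually back it.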
j
ahh! perfect, thank you!
as a general rule, is most of the cluster resource management done via kubectl rather than flytectl?
e
correct, at least for these cluster-wide limits, that's the case. flytectl is more about interacting with a flyte deployment, more specifically with its api (via flyteadmin)
j
okay, makes sense. one more super newbie question: is this the full list of parameters for the deployment config?
e
No, the full list is in the Helm charts we publish (you can take a look at them here). We have documentation on the different types of deployment we support at https://docs.flyte.org/en/latest/deployment/index.html (notice the sections on Helm charts in each deployment guide)
j
okay great, thanks for the help!
e
sure thing and don't hesitate to reach out in case you encounter an issue.
k
@James Evers we are working on some deployment streamlining to help configure Flyte faster. Stay tuned, and please provide feedback or file issues
j
sounds good! looking forward to it