# ask-the-community
z
Hi team, is there any limit on the number of tasks a given workflow can have scheduled? I first created a workflow with 100 nodes, then created 10 executions of this workflow, yet the number of parallel tasks plateaus at 300. Another question: even when I set maxWorkflowNodes in the admin configuration, the setting does not take effect, and it prevents me from registering workflows with more than 100 nodes. Are either of these behaviors expected?
k
Hmm, you have to restart admin after changing the settings
Also, if you make the workflow really large, we should start offloading the workflows (this is supported via a flag)
z
Of course flyteadmin gets restarted. I asked this question specifically because I noticed 100 gets hard-coded in the config initialization. After removing that one line, the setting kicks in. Is this expected?
What is workflow offloading? And how many nodes would be considered large for a single workflow?
k
Offloading is for really large workflows - maybe 10k+ nodes or, most importantly, a size of more than 1MB. Eventually we will start offloading all workflows
But the 100-node limit sounds odd; you have to set it explicitly, else it will default to 100
That’s the hard-coding - but to confirm, cc @Eduardo Apolinario (eapolinario)
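For illustration, the explicit override k describes might look like this in the flyteadmin config (a sketch; the `registration` section name follows the configmap shared later in this thread, and the value 1000 is just an example):

```yaml
registration:
  # default is 100; set explicitly to register larger workflows,
  # then restart flyteadmin for the change to take effect
  maxWorkflowNodes: 1000
```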
z
```yaml
apiVersion: v1
data:
  cluster_resources.yaml: |
    cluster_resources:
      customData:
      - production:
        - projectQuotaCpu:
            value: "50000"
        - projectQuotaMemory:
            value: 4000Gi
      - development:
        - projectQuotaCpu:
            value: "40000"
        - projectQuotaMemory:
            value: 30000Gi
      refresh: 5m
      refreshInterval: 5m
      standaloneDeployment: false
      templatePath: /etc/flyte/clusterresource/templates
  db.yaml: |
    database:
      dbname: flyteadmin
      host: aaa
      passwordPath: /etc/db/pass.txt
      port: 5432
      username: flyte
  domain.yaml: |
    domains:
    - id: development
      name: development
    - id: production
      name: production
  logger.yaml: |
    logger:
      level: 5
      show-source: true
  registration.yaml: |
    maxWorkflowNodes: 0
```
For reference, here is the configmap I used for the admin service. I set maxWorkflowNodes in the registration section.
Are there any docs on offloading that we can read to see what’s going on under the hood?
k
It puts the workflow definition in S3 and then uses caching in the engine to keep it in memory
But I will share more info - it was added by Spotify and Union. cc @Dan Rammer (hamersaw)
z
The relevant question is that I am trying to schedule several workflows that add up to 1000 tasks in total. However, the max number of pods I can create is 300. I’ve tried setting workers, maxParallelism, and kube-client-config; none of these boosts the max pod count. Do you have any idea what I am missing here?
actually, it looks like it’s the project quota?
let me give it a quick try
k
Ohh ya, please delete the project quota
We are removing it in the default case
You can add it back if you want that control
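Concretely, that would mean dropping the `projectQuotaCpu`/`projectQuotaMemory` entries from the `cluster_resources` section of the configmap above (a sketch; whether empty per-domain lists are accepted may depend on your cluster-resource templates):

```yaml
cluster_resources:
  customData:
  # quota entries removed so pods are no longer capped per project/domain
  - production: []
  - development: []
  refresh: 5m
  refreshInterval: 5m
  standaloneDeployment: false
  templatePath: /etc/flyte/clusterresource/templates
```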
d
@Zhiyi Li the docs for CRD offloading are here