Dan Rammer (hamersaw)  06/27/2023, 6:58 PM

Mick Jermsurawong  06/27/2023, 8:42 PM
06/27/2023, 8:42 PM@task(cache=True, cache_serialize=True, cache_version="v5")
def sql_from_file(...)
...
@dynamic
def evaluate(params1, params2, params3, params4) -> Dict:
....
for p1 in params1:
for p2 in params2:
for p3 in params3:
for p4 in params4:
res = sql_from_file(p1, p2, p3, p4)
paramsX
is a large listetcdserver: request is too large
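The size blow-up follows directly from the fan-out: the nested loops create one node per parameter combination, and every node is recorded in the FlyteWorkflow CRD stored in etcd. A quick sketch with hypothetical list sizes (the thread only says each list is large):

```python
from itertools import product

# Hypothetical sizes -- the thread only says each paramsX is a large list.
params1, params2, params3, params4 = range(10), range(10), range(5), range(5)

# The @dynamic workflow launches one sql_from_file task per combination,
# so the CRD must track this many nodes:
num_nodes = sum(1 for _ in product(params1, params2, params3, params4))
print(num_nodes)  # 10 * 10 * 5 * 5 = 2500
```

Even modest per-list sizes multiply quickly, and any per-node error message multiplies with them.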
Dan Rammer (hamersaw)
```
kubectl describe flyteworkflows.flyte.lyft.com <workflow name> -n <namespace>
```
1.4k - it doesn't seem like a very large workflow.

Mick Jermsurawong
Thanks Dan for the check there! I did inspect the CRD, and the problem is that there are many failing tasks, and the CRD does record
```
Message: Traceback
```
with a large stacktrace (we delegated to a JVM process run on the Flyte pod). For a small example of 100 tasks, our CRD spans 14k lines.

Dan Rammer (hamersaw)
06/28/2023, 4:08 PM
The `Message` field is very large - does this look like the correct field? There is a maxSize of 1024 being set on that; I just want to make sure it's being correctly applied. Would it help if that maxSize was configurable? I suspect that dropping it to 256 would help a lot here.

Mick Jermsurawong  06/28/2023, 4:47 PM
The `Error` block:
```
dn119:
  Task Node Status:
    P State: ...
    Phase: 8
    Psv: 1
    Upd At: 2023-06-23T16:45:44.229049302Z
  Dynamic Node Status:
    Error:
      Code: USER:Unknown
      Kind: USER
      Message: Traceback (most recent call last):
        File "/app/src/python/flyte/project_balance/balance_backtest/py_balance_backtest.binary.runfiles/third_party
        ...
        Failed to execute Spark job.
        Using JVM launcher from spark_runner.sh script...
```
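Dan mentions a maxSize of 1024 applied to the message. As an illustration only (this is not FlytePropeller's actual truncation code), a cap like that could keep the head and tail of a traceback so the most useful lines survive:

```python
def truncate_message(msg: str, max_size: int = 1024) -> str:
    """Cap an error message at max_size characters, keeping the start and
    end of the traceback. A sketch, not FlytePropeller's real logic."""
    if len(msg) <= max_size:
        return msg
    marker = "\n... [truncated] ...\n"
    keep = max_size - len(marker)
    head = msg[: keep // 2]
    tail = msg[-(keep - keep // 2):]
    return head + marker + tail

# Demo with a synthetic traceback far over a 256-character cap:
long_trace = "Traceback (most recent call last):\n" + "x" * 5000
short = truncate_message(long_trace, 256)
print(len(short))  # 256
```

Keeping both ends matters for stacktraces: the exception type is usually at one end and the failing frame at the other.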
Dan Rammer (hamersaw)  06/28/2023, 5:44 PM
... the `Error` field in the events (here). So the only way to currently get the full message is to view it in the CR; however, I do believe the max value in the event is 100kb, so it is still quite large.
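Since the full message currently lives only in the CR, one way to spot the offending fields is to fetch the workflow (e.g. `kubectl get flyteworkflow <name> -o json`), parse it, and rank every `message` string by size. This helper is hypothetical, not part of Flyte:

```python
def largest_messages(obj, path="", acc=None):
    """Recursively collect (path, size) for every string field named
    'message' in a parsed CRD status dict. Hypothetical helper."""
    if acc is None:
        acc = []
    if isinstance(obj, dict):
        for k, v in obj.items():
            p = f"{path}.{k}" if path else k
            if k.lower() == "message" and isinstance(v, str):
                acc.append((p, len(v)))
            else:
                largest_messages(v, p, acc)
    elif isinstance(obj, list):
        for i, v in enumerate(obj):
            largest_messages(v, f"{path}[{i}]", acc)
    return sorted(acc, key=lambda t: t[1], reverse=True)

# Tiny stand-in for the parsed output of `kubectl get ... -o json`:
status = {
    "status": {
        "dn119": {"error": {"message": "Traceback ..." + "x" * 500}},
        "dn120": {"error": {"message": "ok"}},
    }
}
top = largest_messages(status)
print(top[0][0])  # status.dn119.error.message
```

Ranking by size makes it easy to confirm whether a handful of failing nodes (like `dn119` here) account for most of the CRD's bulk.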