# flyte-support
r
So I'm testing our `flyte-binary` `v1.12.0` deployment and it seems that it's using a lot of memory before being evicted by k8s - see attached screenshot of GCS CPU and memory usage of the deployment. Could this be related to this issue? What could we do to track down the reason for the memory usage? Details below.
What I'm doing is submitting the workflow `sleep_more_minutes` about 300 times using `pyflyte run --remote -p flytesnacks -d development testing.py sleep_more_minutes`:
```python
from flytekit import task, workflow
import time

@task
def sleep_a_minute(seconds: int = 60) -> int:
    # Sleep for the given number of seconds and pass the value along.
    time.sleep(seconds)
    return seconds

@workflow
def sleep_ten_minutes():
    # Chain ten one-minute sleeps; each call becomes one task node.
    seconds = sleep_a_minute()
    seconds = sleep_a_minute(seconds=seconds)
    seconds = sleep_a_minute(seconds=seconds)
    seconds = sleep_a_minute(seconds=seconds)
    seconds = sleep_a_minute(seconds=seconds)
    seconds = sleep_a_minute(seconds=seconds)
    seconds = sleep_a_minute(seconds=seconds)
    seconds = sleep_a_minute(seconds=seconds)
    seconds = sleep_a_minute(seconds=seconds)
    seconds = sleep_a_minute(seconds=seconds)


@workflow
def sleep_more_minutes():
    # Twenty sub-workflow calls, i.e. roughly 200 task nodes per execution.
    sleep_ten_minutes()
    sleep_ten_minutes()
    sleep_ten_minutes()
    sleep_ten_minutes()
    sleep_ten_minutes()
    sleep_ten_minutes()
    sleep_ten_minutes()
    sleep_ten_minutes()
    sleep_ten_minutes()
    sleep_ten_minutes()
    sleep_ten_minutes()
    sleep_ten_minutes()
    sleep_ten_minutes()
    sleep_ten_minutes()
    sleep_ten_minutes()
    sleep_ten_minutes()
    sleep_ten_minutes()
    sleep_ten_minutes()
    sleep_ten_minutes()
    sleep_ten_minutes()
```
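For reference, the ~300 submissions were just repeated invocations of the same command - a rough, illustrative driver sketch (not the exact script we ran) that shells out to the `pyflyte run` call quoted above:
```python
# Illustrative driver: launch ~300 remote executions of sleep_more_minutes
# by repeatedly invoking the same pyflyte command as above.
import subprocess

CMD = [
    "pyflyte", "run", "--remote",
    "-p", "flytesnacks", "-d", "development",
    "testing.py", "sleep_more_minutes",
]

for _ in range(300):
    # Each invocation registers the workflow (if needed) and starts one execution.
    subprocess.run(CMD, check=True)
```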
f
Is this the backend process?
That should not happen - Flyte can run in tiny amounts of memory
r
Yes, that is the `flyte-backend-flyte-binary` deployment.
FWIW: I re-ran the test with `inline.plugins.k8s.inject-finalizer: false` because it was suggested in this slack thread.
I also downgraded flyte-binary to 1.11.0 just to see if it would make a difference, but the behaviour is unfortunately the same. The first memory peak is with 1.12.0 and the second one with 1.11.0.
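For anyone who wants to try the same override: this is roughly where it goes in the flyte-binary Helm values - a minimal sketch assuming the chart's `configuration.inline` passthrough (adjust to your chart version):
```yaml
# Hypothetical flyte-binary values.yaml excerpt: everything under
# configuration.inline is merged into the single binary's Flyte config.
configuration:
  inline:
    plugins:
      k8s:
        # Don't add the Flyte finalizer to task pods.
        inject-finalizer: false
```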
Might be interesting for somebody else: setting `flyteadmin.useOffloadedWorkflowClosure=true` (found here) means these memory overflows have no real impact on flyte workflow execution. So right now our instance keeps getting evicted every ~20 minutes, but at least our workflows are chugging along...
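In case it helps, this is roughly how we set it - a sketch assuming the same `configuration.inline` passthrough in the flyte-binary values:
```yaml
# Hypothetical values.yaml excerpt: offload the compiled workflow closure
# to blob storage instead of keeping it inline in admin's stored records.
configuration:
  inline:
    flyteadmin:
      useOffloadedWorkflowClosure: true
```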
f
This is interesting. I guess this is because it's using the blobstore cache in single binary as its unified config
r
My issue seems extremely similar to https://github.com/flyteorg/flyte/issues/3991
v
@ripe-smartphone-56353, this has been a perennial issue that has affected single-binary since its inception. The summary of my investigation can be found at https://github.com/flyteorg/flyte/issues/3991#issuecomment-2317974771 (with a corresponding fix).
r
Nice, thanks so much. We'll probably switch to `flyte-core` anyway, but this will buy us some time if it works as expected. Can we expect this to be released anytime soon? I guess we can use the docker image that is being built by GitHub Actions anyway 👍
๐Ÿ‘ 1