acoustic-carpenter-78188
03/10/2024, 11:35 AM
# main.py
import subprocess
import typing
# copy-paste from <https://github.com/flyteorg/flytekit/blob/f16ac4910043a56de235d8dc1383996b6ddd13ef/flytekit/extras/tasks/shell.py#L102-L123>
def _run_script(script) -> typing.Tuple[int, str, str]:
    process = subprocess.Popen(script, stdout=subprocess.PIPE, stderr=subprocess.PIPE, bufsize=0, shell=True, text=True)
    out = ""
    for line in process.stdout:
        print(line)
        out += line
    code = process.wait()
    return code, out, process.stderr.read()

print(_run_script("python error_creator.py"))

# error_creator.py
import sys
for i in range(200000):
    sys.stderr.write("This is an error message\n")

print("This is the output of the program")

Notice that running python error_creator.py on its own finishes instantly, but running python main.py hangs.
If you reduce the number of iterations in error_creator.py to 2000, you'll see that python main.py finishes instantly too.
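A rough back-of-the-envelope check of why 2000 iterations works and 200000 does not. This is only a sketch: it assumes the default Linux pipe capacity of 65536 bytes (see pipe(7)), which differs on other platforms, and the names PIPE_CAPACITY, stderr_bytes and fits_in_pipe are made up for illustration.

```python
# ASSUMPTION: default pipe capacity on Linux; other systems may use a different size.
PIPE_CAPACITY = 65536

# The exact line error_creator.py writes to stderr on each iteration.
MESSAGE = "This is an error message\n"


def stderr_bytes(iterations: int) -> int:
    """Total bytes written to stderr for a given iteration count."""
    return iterations * len(MESSAGE.encode())


def fits_in_pipe(iterations: int) -> bool:
    """True if all stderr output fits in the pipe without anyone reading it."""
    return stderr_bytes(iterations) <= PIPE_CAPACITY


for n in (2000, 200000):
    print(n, stderr_bytes(n), fits_in_pipe(n))
```

With 2000 iterations the child writes 50000 bytes, which fits in the pipe, so it can exit even though nobody reads stderr. With 200000 iterations it writes about 5 MB, so it blocks on the write once the pipe is full, while main.py is still waiting for stdout to close.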
I originally found this issue in a production Flyte deployment, and used strace and lldb to verify that the issue is caused by the child's stderr pipe filling up. Manually reading from the pipe got rid of the deadlock in my case. I ran a command like: cat /proc/12484/fd/2
flyteorg/flyte
acoustic-carpenter-78188
03/10/2024, 11:35 AM