Hey all -- currently working on attaching a sampli...
# flyte-support
r
Hey all -- currently working on attaching a sampling Python call stack profiler as a sidecar to a Flyte task. This entails:
• Creating a new PodSpec within `flytekit` that asserts a shared process namespace for the `primary` and `profiler` containers in the pod and adds some additional capabilities
  ◦ This ensures any sampling profiler can read the task Python process's `/proc/$pid/maps`
  ◦ Easy enough, that part is done!
• Knowing which `PID` belongs to the user task being run. This would entail e.g. `pgrep -f <entrypoint_for_task>`
My problem arises in that last bullet point -- what process name should I be attempting to get the `PID` of?
• I first, naively, tried `pgrep -f pyflyte-execute`, given that this is the entrypoint into the lifecycle of a task. This likely spins off separate processes for the user code itself, though, and as such does not work
Once `dispatch_execute` is called somewhere within `pyflyte-execute`, what would be the appropriate process to grep for? Would it be simpler to attempt a different approach, such as a custom `PythonFunctionTask` that dumps the task object's PID to a shared volume?
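For reference, a minimal sketch of the pod layout described in the first bullet, written as a plain Kubernetes manifest dict rather than with flytekit or the Kubernetes client. The image names are hypothetical, and `SYS_PTRACE` is an assumption about which capability the message refers to; the thread does not say. `shareProcessNamespace` and `securityContext.capabilities.add` are standard PodSpec fields.

```python
import json


def build_profiled_pod_spec(primary_image: str, profiler_image: str) -> dict:
    """Return a Kubernetes PodSpec (as a plain dict) with a profiler sidecar."""
    return {
        # Every container in the pod can see the processes of the others.
        "shareProcessNamespace": True,
        "containers": [
            # Container that runs the Flyte task itself.
            {"name": "primary", "image": primary_image},
            {
                # Sidecar that hosts the sampling profiler.
                "name": "profiler",
                "image": profiler_image,
                # Assumption: ptrace access is the extra capability needed
                # to inspect another process's memory maps.
                "securityContext": {"capabilities": {"add": ["SYS_PTRACE"]}},
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical image names, for illustration only.
    spec = build_profiled_pod_spec("my-task:latest", "my-profiler:latest")
    print(json.dumps(spec, indent=2))
```

In flytekit the same fields would presumably be set on a `V1PodSpec` from the Kubernetes Python client and attached to the task; the dict form is used here only to keep the sketch dependency-free.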
f
If you are fast registering, it should only be the child process of PID 1
r
In practice, we rarely fast register. Will run `pstree` in the container and see what I can see
f
My preference would be to attach the profiler at the `pyflyte-execute` entrypoint
r
The spawned process appears to be consistently PID 7 (grepping for `pyflyte-execute`)
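Rather than relying on the PID always being 7, the sidecar could look the process up by command line. Below is a rough Python equivalent of `pgrep -f`, assuming a Linux `/proc` filesystem that the sidecar can read through the shared process namespace. The function name `find_pids` is made up for this sketch, and it does plain substring matching where `pgrep -f` matches a regular expression.

```python
from pathlib import Path
from typing import List


def find_pids(pattern: str, proc_root: str = "/proc") -> List[int]:
    """Return PIDs whose full command line contains `pattern`."""
    root = Path(proc_root)
    if not root.is_dir():
        return []  # no procfs here (e.g. not running on Linux)
    matches = []
    for entry in root.iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            raw = (entry / "cmdline").read_bytes()
        except OSError:
            continue  # process exited or is not readable
        # Arguments in /proc/<pid>/cmdline are separated by NUL bytes.
        cmdline = raw.replace(b"\0", b" ").decode(errors="replace").strip()
        if pattern in cmdline:
            matches.append(int(entry.name))
    return sorted(matches)


if __name__ == "__main__":
    print(find_pids("pyflyte-execute"))
```

If more than one PID comes back (for example a parent and a subprocess in the fast-register case), the caller still has to decide which one to attach to.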
h
> Once `dispatch_execute` is called somewhere within `pyflyte-execute`
Since you're not fast registering, you'll find that there's only a single process in the invocation of `dispatch_execute`. Keep in mind that we _do_ subprocess in the case of fast register. Can we take a few steps back and talk a bit about your use case? Double-clicking on Ketan's suggestion, why can't you invoke the profiler programmatically as part of the entrypoint setup?
r
Got it - thank you for the details. The use case is just identifying our hot paths in individual tasks with a sampling profiler, and (eventually) rendering them out as flamegraphs in a Deck. We are currently using Austin to sample the task process frame stack, injecting a container w/ Austin into our (rather extended) internal `task` decorator's custom `PodSpec`, and... yeah that is more or less it. We could attach the profiler programmatically at the entrypoint, though I don't necessarily see the advantages other than filtering out some extra noise from the samples. By working at the `task`/`PodSpec` level we can offer multiple different internally wrapped tasks that our researchers can pick and choose from -- `@mem_profile_task`, etc.
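A task-level wrapper also gives a way around the PID lookup: the wrapped function can record its own PID on a volume that both containers mount, which is the alternative floated at the start of the thread. The sketch below is one hedged way to do that. The decorator name `write_pid_file` and the directory argument are hypothetical, and it assumes the decorator sits directly on the user function, beneath whatever task decorator is in use.

```python
import functools
import os
from pathlib import Path


def write_pid_file(pid_dir: str):
    """Decorator factory: record the PID of the process running the function."""

    def decorator(fn):
        @functools.wraps(fn)  # keep the wrapped function's name and signature
        def wrapper(*args, **kwargs):
            pid_path = Path(pid_dir) / f"{fn.__name__}.pid"
            pid_path.parent.mkdir(parents=True, exist_ok=True)
            # The sidecar can poll this file to learn which PID to sample.
            pid_path.write_text(str(os.getpid()))
            try:
                return fn(*args, **kwargs)
            finally:
                # Removing the file tells the sidecar the task has finished.
                pid_path.unlink(missing_ok=True)

        return wrapper

    return decorator
```

Usage would look like `@write_pid_file("/shared/pids")` on the function body, where `/shared/pids` stands in for a path on a volume mounted into both the primary and profiler containers.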
c
Michael, FYI we're about to land https://github.com/flyteorg/flytekit/pull/2875, which brings in memray as an option for profiling. This works at the task level (so we don't need to worry about figuring out PIDs). Enabling this will be as easy as adding the `@memray_profiling` decorator. Let me know if this fits your use case.
r
Awesome! We added memray to our task wrappers internally already, in an almost equivalent fashion; the only difference is we have profiling as a task arg/flag and we materialize the flame graph by reaching into the Memray internals. The sampling profiler here within a sidecar “worked,” but we have decided for now not to worry about looking for PIDs or the like and are sticking with memray.