Hi Flyte team! I'm trying to use the papermill pl...
# ask-the-community
g
Hi Flyte team! I'm trying to use the papermill plugin on a Kubernetes cluster that uses Istio. For that, I've also installed the pod plugin (to deal with the sidecar issue). For regular Python tasks, the task type changes to `sidecar` and the pod terminates as expected with the pod plugin. However, when running a workflow with a papermill notebook task, the task type changes to `nb-sidecar`. After that, the pod for this task cannot be terminated, and I don't know why. It looks like the same issue we see when launching a workflow without the proper pod plugin settings. A code snippet for the task:
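import os
import pathlib

from flytekit import kwtypes
from flytekitplugins.papermill import NotebookTask
from flytekitplugins.pod import Pod
from kubernetes.client.models import V1Container, V1PodSpec


def generate_pod_spec_for_task():
    # The primary container is the one Flyte watches to decide the task is done.
    primary_container = V1Container(name="primary")
    pod_spec = V1PodSpec(containers=[primary_container])

    return pod_spec


# Notebook task that runs nb-simple.ipynb (located next to this file) inside a pod.
nb = NotebookTask(
    name="simple-nb",
    task_config=Pod(pod_spec=generate_pod_spec_for_task(), primary_container_name="primary"),
    notebook_path=os.path.join(
        pathlib.Path(__file__).parent.absolute(), "nb-simple.ipynb"
    ),
    inputs=kwtypes(v=float),
    outputs=kwtypes(square=float),
)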
The flytepropeller pod logs the following message:
{"json":{"exec_id":"ahkm5jd76h8l2gn46zt6","node":"n1","ns":"flyteexamples-development","res_ver":"209788243","routine":"worker-2","tasktype":"nb-sidecar","wf":"flyteexamples:development:flyte.workflows.simple.nb_to_python_wf"},"level":"warning","msg":"No plugin found for Handler-type [nb-sidecar], defaulting to [container]","ts":"2022-edited"}
Does anyone have insights into what could be going on?
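For readers triaging similar logs, here is a minimal sketch of pulling the task type out of a propeller log line like the one above and flagging the plugin fallback. The function name `find_plugin_fallback` is hypothetical, and the regular expression assumes the exact warning wording shown in this thread.

```python
import json
import re
from typing import Optional, Tuple

# Assumption: the warning text matches the wording of the log line in this thread.
FALLBACK_RE = re.compile(
    r"No plugin found for Handler-type \[(?P<task_type>[^\]]+)\], "
    r"defaulting to \[(?P<plugin>[^\]]+)\]"
)


def find_plugin_fallback(log_line: str) -> Optional[Tuple[str, str]]:
    """Return (task_type, fallback_plugin) if the line reports a plugin fallback."""
    try:
        record = json.loads(log_line)
    except json.JSONDecodeError:
        return None  # not a JSON log line
    if not isinstance(record, dict):
        return None
    match = FALLBACK_RE.search(record.get("msg", ""))
    if match is None:
        return None
    return match.group("task_type"), match.group("plugin")


# Shortened copy of the log line from the thread.
line = (
    '{"json":{"node":"n1","tasktype":"nb-sidecar"},"level":"warning",'
    '"msg":"No plugin found for Handler-type [nb-sidecar], '
    'defaulting to [container]"}'
)
print(find_plugin_fallback(line))
```

For the sample line this reports that `nb-sidecar` fell back to the `container` plugin, which is the symptom discussed in the rest of the thread.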
k
Could you share your plugin config in the propeller config map?
g
Hi! Thanks for the quick response @Kevin Su! The values are the same as the ones you shared, in flyte-propeller-config, enabled_plugins.yaml:
tasks:
  task-plugins:
    default-for-task-types:
      container: container
      container_array: k8s-array
      sidecar: sidecar
    enabled-plugins: 
      - container
      - sidecar
      - k8s-array
There is nothing specific to the notebook task or the papermill plugin...
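To make the fallback concrete, here is a simplified model of the lookup, not flytepropeller's actual code. It assumes the task type is looked up in `default-for-task-types` and that anything unmapped falls back to `container`, which matches the warning in the log above. The function `resolve_plugin` is hypothetical.

```python
# Mirrors the default-for-task-types section of the config above.
DEFAULT_FOR_TASK_TYPES = {
    "container": "container",
    "container_array": "k8s-array",
    "sidecar": "sidecar",
}


def resolve_plugin(task_type: str, fallback: str = "container") -> str:
    """Simplified, assumed model of task-type-to-plugin resolution."""
    return DEFAULT_FOR_TASK_TYPES.get(task_type, fallback)


for task_type in ("sidecar", "nb-sidecar", "nb-python-task"):
    print(f"{task_type} -> {resolve_plugin(task_type)}")
```

This model only covers the fallback step. It does not capture how plugins register the task types they handle, so it should not be read as evidence that adding a mapping to the config map is enough.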
k
cc @Calvin Leather, mind taking a look? Did you run into the same issue before?
👀 1
c
I haven't seen this issue in particular, but we also aren't running Istio (the only other things we have on the nodes with the nb task pods are daemon set pods). This "cannot be terminated" behaviour only happens with notebook tasks, right? And by "the task cannot be terminated", do you mean the underlying pod can't be deleted using kubectl etc., or that you can't terminate the task's execution from the GUI?
Need to check this, but I think the handler normally defaults to [container] in our non-sidecar executions as well (i.e., there is no propeller plugin for nb execution).
Right, in our executions of the nb task, without the sidecar plugin, we see `No plugin found for Handler-type [nb-python-task], defaulting to [container]`, because the nb tasks just use the default container plugin; they don't need any special plugin in flytepropeller (all nb execution code lives in flytekit). I wonder if it needs to default to `sidecar` in this case instead of just `container`? Happy to investigate that line of thought more later today, or with some more info on the observed behaviour.
g
Hi @Calvin Leather! Thanks for your time! Yes, this behavior only occurs with the notebook task. Running locally, the workflow is fine and gives the right outputs. Running on the cluster, however, the sidecar keeps running forever. In the GUI, the notebook task status is "unknown" and I have to terminate the workflow from the GUI. To experiment, I added a regular Python task (with the `task_config` for the pod plugin) just before the notebook task in the same workflow. The regular Python task executes, and once it ends Kubernetes automatically terminates its pod, as expected. The next task is the notebook task, but once it is running it gets stuck and the process never finishes. I tried a naive approach of editing the config map to map the `nb-sidecar` task type to `sidecar`. It didn't work, but I suspect it just doesn't work like that. I don't know how to change the default task type to `sidecar` instead of `container`... Maybe it's worth trying that to see what happens...
👀 1
c
Cool, thanks for the description! I haven't touched the `sidecar` task type all that much, but presumably propeller has some logic to tear down the sidecar for those tasks, and that logic isn't being used here. I suspect there is a bug and we need to tweak something so that propeller uses a sidecar-aware plugin. Happy to help continue debugging, but I'll be slower since I'm new to this part of the code. I can think of an inelegant workaround if you need this quickly: use a plain Python task and call the Python papermill library in it to run your notebook.
👍 1
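A minimal sketch of the workaround suggested above, assuming the `papermill` package is installed in the task image. The helper names `output_path_for` and `run_notebook` and the `-out` suffix are hypothetical; `papermill.execute_notebook` is the library's entry point for running a notebook with injected parameters.

```python
from pathlib import Path


def output_path_for(notebook_path: str) -> str:
    """Derive a path for the executed copy, e.g. nb-simple.ipynb -> nb-simple-out.ipynb."""
    path = Path(notebook_path)
    return str(path.with_name(f"{path.stem}-out{path.suffix}"))


def run_notebook(notebook_path: str, v: float) -> str:
    """Execute the notebook with papermill and return the executed notebook's path."""
    # Imported lazily so this module loads even where papermill is not installed.
    import papermill as pm

    executed_path = output_path_for(notebook_path)
    # The notebook is assumed to have a cell tagged "parameters" that defines `v`.
    pm.execute_notebook(notebook_path, executed_path, parameters={"v": v})
    return executed_path
```

In a workflow, `run_notebook` would be called from a regular flytekit `@task` that uses the same `Pod` task_config as the other Python tasks, so the task type stays `sidecar` rather than `nb-sidecar`. Reading values such as `square` back out of the executed notebook is left out of this sketch.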
g
Thanks! I'll try this workaround for now 🙂
k
@Guilherme this looks like Flyte is unable to detect the primary container's exit. The k8s sidecar blocks pod termination, so Flyte has to intervene, but we need to see why that did not work here.
Do you have example code I can try?
Ohh, never mind, I see the bug.
We should not use `nb-sidecar` as the task type; it should be `sidecar` or `pod`. Can you share your code snippet?
cc @Kevin Su, can you check this?
👀 1
k
Okay, it works after I removed `nb-` from the task type.
👍 1
k
Ya the task type is not useful
g
Thanks @Ketan (kumare3) and @Kevin Su! I'll try some alternatives!
k
You don’t need alternatives. Kevin is putting in a change
👍 1
g
Thank you all! 😀
k
@Guilherme We’ve fixed it. Feel free to try it.
pip install git+https://github.com/flyteorg/flytekit@papermill-bug
https://github.com/flyteorg/flytekit/pull/1143