# ask-the-community
r
I have two questions: 1. How can I monitor the stdout and stderr while the task is running? 2. Does pyflyte register not use the docker image when it's packaging the workflow? We have some custom dependencies for the workflow to execute that aren't present in the development environment.
f
regarding viewing the logs: you need to write the logs to a provider (e.g. your cloud provider's log storage, or something separate), then you can configure links to view the logs: https://docs.flyte.org/projects/cookbook/en/latest/auto/deployment/configure_logging_links.html I for example have Grafana Loki set up and can then view live logs via that link
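For reference, the log-link config Felix mentions lives in flytepropeller's task-log plugin settings. A minimal sketch, assuming a flyte-core Helm deployment and a Grafana URL; the exact values path, Grafana query string, and available template variables should be verified against your chart version:

```yaml
# Hypothetical Helm values snippet (flyte-core); key names follow the
# flytepropeller task-log template config, but verify against your version.
configmap:
  task_logs:
    plugins:
      logs:
        templates:
          - displayName: "Grafana Loki"
            templateUris:
              # {{ .namespace }} and {{ .podName }} are substituted per task pod;
              # the query-string format is deployment-specific.
              - "https://grafana.example.com/explore?var-namespace={{ .namespace }}&var-pod={{ .podName }}"
```

Once applied, each task in the Flyte console gets a "Grafana Loki" link pointing at that pod's logs.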
could also be a link to a kubernetes dashboard if you don't need persistence
r
Well my hope was to stream the stdout / stderr out eventually
I can probably find a way to jerry-rig stuff but I was hoping that I wouldn’t have to figure out some of these pieces (I’m still new to Kubernetes)
f
stream to where? I assume you want to see this live from tasks executed in a remote kubernetes cluster?
r
@Felix Ruess you got it right ! I wanted to watch the stdout / stderr live. Logs are fine for now (I’ll look at how to set it up). But I’d eventually want to be able to see the stdout live on the dashboard itself
f
yeah, so for me the simplest was to deploy a kubernetes dashboard and configure the link to it, so you can just click on it in the flyte console... But that only works until the pod is deleted... if you want to view logs of pods that were already cleaned up, you need something else to persist the logs... I like Grafana/Loki... Where are you running your cluster? managed in AWS or so, or locally?
r
I’ve streamed out the std out from processes onto dashboards before, I’m just trying to make flyte become the manager for running the processes
Right now locally, I’ll be moving it to AWS soon
Well as soon as I get it working 😄
f
so Flyte does not do this for you out-of-the-box... But you should set up something to stream the logs to, so you can view them live or later. Usually that is the cloud provider's log storage; then configure Flyte to provide a link to that.
r
Yeah, I can adapt what I already have to do this. I’ll just need to navigate how the flyte cluster can access an external service and a more elegant way to capture all the stdout / stderr in the image
f
or you use something cloud provider agnostic.. e.g. I deployed the Grafana Agent in my local cluster, which "streams" logs (and metrics) to Grafana Cloud (or my local Loki/prometheus) and I can live view that and also query it later after the pods are already deleted in k8s
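Felix's Grafana Agent setup could look roughly like the sketch below. This assumes the agent's static-mode config (which embeds a promtail-compatible logs section); field names and the push URL are illustrative and should be checked against your agent version:

```yaml
# Rough sketch of a Grafana Agent static-mode config that ships pod logs
# to Loki (Grafana Cloud or a local Loki). Verify field names for your version.
logs:
  configs:
    - name: default
      clients:
        # Loki push endpoint -- replace with your Grafana Cloud / local URL
        - url: https://loki.example.com/loki/api/v1/push
      positions:
        filename: /tmp/positions.yaml
      scrape_configs:
        - job_name: kubernetes-pods
          kubernetes_sd_configs:
            - role: pod  # discover and tail logs from all pods in the cluster
```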
r
Grafana could work
y
wrt 2. - yeah, you’ll have to specify the image if you don’t want the default. Use the --image flag.
thanks @Felix Ruess!
r
Huh, odd. I don’t think it’s working, but maybe I’m doing something wrong. I’ll run the docker image and see if it works
@Yee It seems like it isn’t packaging it inside of the docker container. Here’s the command I’m using:
pyflyte register workflows --image localhost:30000/retsynth:cc6bf85d7bbadba83419f40003cc9a16f962fda7
However, it looks like it’s using the local python installation, judging from the log. Also, this package isn’t present on my dev machine, but it is present in the image. Here’s the rest of the error dump:
Running pyflyte register from /root/sandbox/retsynth-history-fix/retsynth_new with images ImageConfig(default_image=Image(name='default', fqn='localhost:30000/retsynth', tag='cc6bf85d7bbadba83419f40003cc9a16f962fda7'), images=[Image(name='default', fqn='localhost:30000/retsynth', tag='cc6bf85d7bbadba83419f40003cc9a16f962fda7')]) and image destination folder /root on 1 package(s) ('/root/sandbox/retsynth-history-fix/retsynth_new/workflows',)
Registering against localhost:30080
Detected Root /root/sandbox/retsynth-history-fix/retsynth_new, using this to create deployable package...
No output path provided, using a temporary directory at /tmp/tmpyq__xk3m instead
Computed version is k_CYhVf7JigGDT-ocAa7qg==
Loading packages ['workflows'] under source root /root/sandbox/retsynth-history-fix/retsynth_new
Traceback (most recent call last):
  File "/root/.pyenv/versions/3.8.0/bin/pyflyte", line 8, in <module>
    sys.exit(main())
  File "/root/.pyenv/versions/3.8.0/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/root/.pyenv/versions/3.8.0/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/root/.pyenv/versions/3.8.0/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/root/.pyenv/versions/3.8.0/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/root/.pyenv/versions/3.8.0/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/root/.pyenv/versions/3.8.0/lib/python3.8/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/root/.pyenv/versions/3.8.0/lib/python3.8/site-packages/flytekit/clis/sdk_in_container/register.py", line 184, in register
    raise e
  File "/root/.pyenv/versions/3.8.0/lib/python3.8/site-packages/flytekit/clis/sdk_in_container/register.py", line 168, in register
    repo.register(
  File "/root/.pyenv/versions/3.8.0/lib/python3.8/site-packages/flytekit/tools/repo.py", line 255, in register
    serializable_entities = load_packages_and_modules(
  File "/root/.pyenv/versions/3.8.0/lib/python3.8/site-packages/flytekit/tools/repo.py", line 179, in load_packages_and_modules
    registrable_entities = serialize(pkgs_and_modules, ss, str(project_root), options)
  File "/root/.pyenv/versions/3.8.0/lib/python3.8/site-packages/flytekit/tools/repo.py", line 46, in serialize
    module_loader.just_load_modules(pkgs=pkgs)
  File "/root/.pyenv/versions/3.8.0/lib/python3.8/site-packages/flytekit/tools/module_loader.py", line 33, in just_load_modules
    importlib.import_module(name)
  File "/root/.pyenv/versions/3.8.0/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
    from tqdm import tqdm
ModuleNotFoundError: No module named 'tqdm'
I also tried an interactive session on the docker image I pointed to here and it was working fine for me
I could basically run the workflow and it had no import errors
y
we used to run all these from within the container. and you still can if you want to. just make sure you can hit the admin host and you’re good. (you’ll still need to provide the image argument)
but basically we found that users typically had a local dev environment set up with all their dependencies
and it was much easier for them to register from a local virtualenv than starting a docker image, mounting in their code (or rebuilding the image every time the code changed)
r
So the image provides the runtime, but to generate the workflow package, the local (or whatever) packaging environment is also supposed to have access to all the modules / dependencies at a Python level?
y
yes… again this is optional.
you can do everything from within the image if you want to
but most users find that cumbersome
r
If so thats fine, I do have a devcontainer that isolates the project dependencies
But I’m guessing it won’t have access to the sandbox cluster. If I use package, would that be equivalent? And would I be able to deploy the package after?
y
can you elaborate?
what do you mean by “it”?
and which package are you referring to?
r
Sorry I was thinking out loud. So can I package the workflow in one step and then deploy that in a separate step.
y
yes for sure.
but this packaging step… you can run it in a container, or not in a container.
that is your choice, but i think running it locally is easier.
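The two-step package-then-register flow Yee confirms here might look roughly like this; the project, domain, and version values are placeholders:

```shell
# Step 1: serialize the workflows into an archive, stamping in the task image.
# Run this wherever the Python dependencies are importable (e.g. the devcontainer).
pyflyte --pkgs workflows package \
  --image localhost:30000/retsynth:cc6bf85d7bbadba83419f40003cc9a16f962fda7 \
  --output flyte-package.tgz

# Step 2: register the archive against the cluster. flytectl only needs network
# access to flyteadmin, not the Python dependencies. Project/domain/version
# here are illustrative.
flytectl register files \
  --project flytesnacks --domain development \
  --archive flyte-package.tgz --version v0.1.0
```

Since step 2 needs no Python environment, it can run from anywhere that can reach the cluster.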
r
Yup got it
y
if you run it from within a container image you can still access the sandbox (might have to run it with host networking but it should be possible)
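A sketch of that container route, using the image and endpoints from earlier in this thread; the mount path and working directory are assumptions:

```shell
# Run pyflyte register inside the task image itself, so registration sees the
# same dependencies as the runtime. --network host lets the container reach
# the sandbox admin endpoint on localhost (e.g. localhost:30080 as used above).
docker run --rm --network host \
  -v "$PWD":/root/wf -w /root/wf \
  localhost:30000/retsynth:cc6bf85d7bbadba83419f40003cc9a16f962fda7 \
  pyflyte register workflows \
    --image localhost:30000/retsynth:cc6bf85d7bbadba83419f40003cc9a16f962fda7
```

Note you still pass --image even though you are already inside it: the flag tells Flyte which image the tasks should run with on the cluster, independently of where registration happens.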
r
I think I’m making some progress on this thanks !
f
How can I monitor the std out and std err while the task is running ?
Do kubectl get pods --namespace flytesnacks-development (or whichever namespace your task is running in; if you don’t know, you can do kubectl get pods --all-namespaces). Find the name of the pod running your task. Then, kubectl --namespace flytesnacks-development logs -f <pod name> (replacing <pod name> with the actual pod name). This streams the logs of your task.