Dan Corbiani
02/14/2023, 4:10 PM
flytectl demo start
2. Built the container with the tag localhost:30000/custom_container:v1
3. Exported the config: export FLYTECTL_CONFIG=/home/user/.flyte/config-sandbox.yaml
4. Tried to run the workflow: pyflyte run --remote workflows/mvp.py mvp --input_s3_path="s3://data/path"
The error I get is that it cannot pull the image. Am I missing something very basic here?
Alex Papanicolaou
02/14/2023, 4:58 PM
Rezwan Abir
02/14/2023, 9:43 PM
serialization_settings = SerializationSettings(
    project="flytesnacks",
    domain=self.settings.flyte_domain,
    env=None,
    image_config=ImageConfig(
        default_image=Image(
            name="custom_container_task",
            fqn="cr.flyte.org/flyteorg/flytekit:py3.10-1.3.2",
            tag="image",
        )
    )
)
Rezwan Abir
02/14/2023, 9:44 PM
containers with unready status: [f7592492fe4da4f14b2f-n0-0]|Failed to apply default image tag "cr.flyte.org/flyteorg/flytekit:py3.10-1.3.2:image": couldn't parse image reference "cr.flyte.org/flyteorg/flytekit:py3.10-1.3.2:image": invalid reference format
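The rejected reference ends in ":py3.10-1.3.2:image", which suggests the tag value was appended to an fqn that already carried a tag. Below is a minimal sketch of splitting a full reference into the two parts. `split_image_reference` is a hypothetical helper, not part of flytekit, and the "latest" fallback is an assumption borrowed from the usual Docker default.

```python
from typing import Tuple


def split_image_reference(reference: str) -> Tuple[str, str]:
    """Split 'registry/repo:tag' into (repo, tag). Hypothetical helper."""
    name, sep, tag = reference.rpartition(":")
    # No colon at all, or the last colon belongs to a registry port
    # (e.g. localhost:30000/img), so there is no explicit tag.
    if not sep or "/" in tag:
        return reference, "latest"  # assumed default tag
    return name, tag


fqn, tag = split_image_reference("cr.flyte.org/flyteorg/flytekit:py3.10-1.3.2")
print(fqn)
print(tag)
```

The two values could then be passed separately as `fqn=` and `tag=` to `Image`, so the composed reference carries only one tag.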
Vinícius Sosnowski
02/15/2023, 1:12 AM
container_cpu_usage_seconds_total and kube_pod_container_resource_limits_memory_bytes. I am trying to monitor these metrics for all pods that are generated by Flyte executions. It is kind of working, but sometimes an execution is simply not scraped by Prometheus: the metric is empty. I have a workflow that runs daily, and today's execution's metrics are there, but yesterday's are not. As it's the same workflow and it has not changed since yesterday, I really don't know what could be happening. Can anyone help me?
This is the Prometheus config file regarding kubelet:
kubelet:
  enabled: true
  namespace: kube-system
  serviceMonitor:
    interval: ""
    proxyUrl: ""
    https: true
    cAdvisor: true
    probes: true
    resource: false
    resourcePath: "/metrics/resource/v1alpha1"
    cAdvisorMetricRelabelings: []
    probesMetricRelabelings: []
    cAdvisorRelabelings:
      - sourceLabels: [__metrics_path__]
        targetLabel: metrics_path
    probesRelabelings:
      - sourceLabels: [__metrics_path__]
        targetLabel: metrics_path
    resourceRelabelings:
      - sourceLabels: [__metrics_path__]
        targetLabel: metrics_path
    metricRelabelings: []
    relabelings:
      - sourceLabels: [__metrics_path__]
        targetLabel: metrics_path
Rezwan Abir
02/15/2023, 6:34 AM
Seung-Woo Lee
02/15/2023, 7:57 AM
@task(secret_requests=[Secret(group="SOME_SECRET", key="DATA")])
def some_task() -> None:
    ...
    os.environ["SOME_SECRET_DATA"] = flytekit.current_context().secrets.get("SOME_SECRET", "DATA")
    ...
It works fine on remote because the k8s secrets are already set. However, it doesn't work locally with the command pyflyte run …. When I use SecretsManager under an if __name__ == "__main__" clause and run with python pipeline.py, it works. But I want to make this pipeline work with the pyflyte run … command.
honnix
02/15/2023, 9:01 AM
flytectl sandbox and flytectl demo? The documentation looks very similar.
Kamakshi Muthukrishnan
02/15/2023, 12:28 PM
Klemens Kasseroller
02/15/2023, 12:38 PM
channels:
  - conda-forge
dependencies:
  - python =3.7
  - flytekit==1.2.7
The error message looks like this:
conda-forge/linux-64 Using cache
conda-forge/noarch Using cache
error libmamba Could not solve for environment specs
Encountered problems while solving:
- package flytekit-1.2.7-pyhd8ed1ab_0 requires python >=3.8, but none of the providers can be installed
The environment can't be solved, aborting the operation
critical libmamba Could not solve for environment specs
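The solver output states that the flytekit 1.2.7 build requires python >=3.8, which cannot be satisfied together with the python =3.7 pin. Here is a minimal sketch of that comparison. The helper names are hypothetical, and the parsing only handles plain dotted numeric versions, not the full conda match-spec syntax.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '3.8' into (3, 8). Only plain dotted integers are handled."""
    return tuple(int(part) for part in text.split("."))


def satisfies_minimum(pinned: str, minimum: str) -> bool:
    """Return True when the pinned version is at least the required minimum."""
    return parse_version(pinned) >= parse_version(minimum)


# The pin from the environment file versus the requirement in the error.
print(satisfies_minimum("3.7", "3.8"))
# A pin that meets the stated requirement.
print(satisfies_minimum("3.8", "3.8"))
```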
Aswanth Krishnan
02/15/2023, 1:21 PM
Ed Fincham
02/15/2023, 3:42 PM
pyflyte run -p testflyte --remote example.py training_workflow --hyperparameters '{"C": 0.1}'
I get a 403 error. There's a signed URL, but it is rejected by the metadata bucket. The cluster itself has a flyte service account with read/write access to the bucket, but the above is all happening locally. Any ideas on how I can debug this? I'm currently a bit stumped!
Thanks a lot 🙂
Taylor Stout
02/15/2023, 6:06 PM
terminated with exit code (137). Reason [OOMKilled]. pod failures when trying to run simple hello world tasks. We have flyte-binary deployed via its helm chart. I see the task pod is being spun up with a default memory limit of 200mb. I'm having trouble tracking down how to set the default pod resource spec to adjust the memory limit of the pod.
Greg Dungca
02/15/2023, 8:25 PM
Bryan Weber
02/15/2023, 8:31 PM
pyflyte package + flytectl register flow to push the tasks to my flyte cluster. The tasks run fine, and the image does not need to include the actual task code.
Now for reusability, I’d like to break my task package into several modules and define a workflow in a different module, thus I need to import the tasks in the workflow module. Is there a way to make this work with the protobuf registration? I’m getting errors that the task module could not be found, which makes sense since the image doesn’t contain that code. It’d be nice not to have to rebuild my docker image every time a workflow/task changes, since most of the “business logic” code (upon which the tasks depend) is fairly stable and doesn’t require updates in the image too often.
Brandon Segal
02/15/2023, 10:58 PM
@dynamic workflows. If I create a dynamic workflow that invokes a set of tasks and one of those tasks fails but all the others succeed, would it try to invoke the same set of tasks if I retrigger that same dynamic workflow?
My situation is the following:
• I want to compile a dbt project to read all the nodes in the dbt DAG and translate them to flyte tasks
• I do not want to reprocess earlier nodes in the DAG if one of the nodes fails.
Yubo Wang
02/16/2023, 12:56 AM
Fhuad Balogun
02/16/2023, 8:04 AM
Pod failed. No message received from kubernetes.
Ena Škopelja
02/16/2023, 9:03 AM
Ed Fincham
02/16/2023, 1:26 PM
defaultIamRole (and ideally `projectQuotaCpu`/`projectQuotaMemory`, if that's still possible) to the pods that run in dev/stag/prod namespaces across all projects. To do this, I'm using the configuration.inline.cluster_resources entry (see prod values).
When I deploy the chart with these values, I can see that these exist under 010-inline-config.yaml in the flyte-backend-flyte-binary-config configmap, but I don't think these values are picked up by the namespaces. For instance, I would expect the default sa in each namespace (dev/stag/prod) to have a role-arn annotation, but it doesn't.
Have I missed something very obvious?
Derek Yu
02/16/2023, 2:20 PM
datacatalog when task caching is enabled?
I'm seeing Failed to retrieve artifact for get artifact request dataset because of this error
err: missing entity of type Tag with identifier
using datacatalog version: cr.flyte.org/flyteorg/datacatalog-release:v1.3.0
Any insights into what the missing tag could mean would be great. Thanks! 🙏
Taylor Stout
02/16/2023, 3:06 PM
Evan Sadler
02/16/2023, 3:49 PM
from dataclasses import dataclass
from dataclasses_json import dataclass_json
from flytekit import workflow, dynamic, task

@dataclass_json
@dataclass
class Params:
    color: str

@task
def print_color(c: str):
    print(c)

@dynamic
def wf(params: Params) -> Params:
    print_color(c=params.color)
    return inner
Victor Gustavo da Silva Oliveira
02/16/2023, 8:17 PM
Greg Gydush
02/17/2023, 12:13 AM
Dan Corbiani
02/17/2023, 1:12 AM
pyflyte image run. When I try to run the container without the extracted pex files, I get a module not found error. This is expected because our dependencies aren't part of the container. The important part is the execution command works as expected.
When I try to run the command with our pex files, it says the workflow files aren't found. It's as if something within the python environment is broken or that it is not extracting the pb file correctly. This is the error:
[1/1] currentAttempt done. Last Error: USER::Pod failed. No message received from kubernetes.
[f728f923515794b7cb68-n0-0] terminated with exit code (1). Reason [Error]. Message:
zen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'workflows'
Traceback (most recent call last):
  File "/opt/venv/bin/pyflyte-fast-execute", line 8, in <module>
    sys.exit(fast_execute_task_cmd())
  File "/opt/venv/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/opt/venv/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/opt/venv/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/venv/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/opt/venv/lib/python3.9/site-packages/flytekit/bin/entrypoint.py", line 513, in fast_execute_task_cmd
    subprocess.run(cmd, check=True)
  File "/usr/local/lib/python3.9/subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['pyflyte-execute', '--inputs', 's3://my-s3-bucket/metadata/propeller/flytesnacks-development-f728f923515794b7cb68/n0/data/inputs.pb', '--output-prefix', 's3://my-s3-bucket/metadata/propeller/flytesnacks-development-f728f923515794b7cb68/n0/data/0', '--raw-output-data-prefix', 's3://my-s3-bucket/data/ox/f728f923515794b7cb68-n0-0', '--checkpoint-path', 's3://my-s3-bucket/data/ox/f728f923515794b7cb68-n0-0/_flytecheckpoints', '--prev-checkpoint', '""', '--dynamic-addl-distro', 's3://my-s3-bucket/flytesnacks/development/KRKCIUHF3ZIBQ6OBQ6FCX2PXRQ======/scriptmode.tar.gz', '--dynamic-dest-dir', '/root', '--resolver', 'flytekit.core.python_auto_container.default_task_resolver', '--', 'task-module', 'workflows.cmi_mvp', 'task-name', 'base_mission_sim_etl']' returned non-zero exit status 1.
Our docker buildfile looks like the following:
FROM python:3.9-slim-buster as dependencies
WORKDIR /root
ENV VENV /opt/venv
ENV LANG C.UTF-8
ENV LC_ALL C.UTF-8
ENV PYTHONPATH /root
COPY src.cmi_orchestration/binary-deps.pex /binary-deps.pex
RUN PEX_TOOLS=1 /usr/local/bin/python /binary-deps.pex venv --scope=deps --compile /opt/venv
FROM python:3.9-slim-buster as sources
WORKDIR /root
ENV VENV /opt/venv
ENV LANG C.UTF-8
ENV LC_ALL C.UTF-8
ENV PYTHONPATH /root
COPY src.cmi_orchestration/binary-srcs.pex /binary-srcs.pex
RUN PEX_TOOLS=1 /usr/local/bin/python /binary-srcs.pex venv --scope=srcs --compile /opt/venv
FROM python:3.9-slim-buster as local-dev
WORKDIR /root
COPY --from=dependencies /opt/venv /opt/venv
COPY --from=sources /opt/venv /opt/venv
ENV VENV /opt/venv
ENV LANG C.UTF-8
ENV LC_ALL C.UTF-8
ENV PYTHONPATH /root
RUN apt-get update && apt-get install -y build-essential
RUN pip3 install awscli
ENV VENV /opt/venv
RUN python3 -m venv ${VENV}
ENV PATH="${VENV}/bin:$PATH"
ENV ENV_FOR_DYNACONF cluster
ARG tag
ENV FLYTE_INTERNAL_IMAGE $tag
Does anyone have any tips on what I can do to debug what is going on?
My assumption is the pb file is fine. If I run the same pyflyte run command against a version of the container that doesn't contain the copy command, flyte seems to find the workflow files. I thought something might be getting corrupted with flyte so we tried installing it again later in the process with the awscli. That didn't have any impact. A pip freeze within the container shows the same set of requirements in both of the containers.
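One way to narrow down a "No module named 'workflows'" failure is to ask the interpreter where it would load the package from. The sketch below uses only the standard library. `describe_module` is a hypothetical helper, and running it with /opt/venv/bin/python inside each container variant is an assumption based on the paths in the traceback.

```python
import importlib.util
import sys


def describe_module(name: str) -> str:
    """Report where Python would load top-level module `name` from."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return f"{name}: not importable"
    # Namespace packages have no single origin, so fall back to their search locations.
    location = spec.origin or list(spec.submodule_search_locations or [])
    return f"{name}: {location}"


# Show the import search path, then check the package the task needs.
print("\n".join(sys.path))
print(describe_module("workflows"))
```

Comparing the output between the working and failing containers shows whether /root is still on the search path and whether another `workflows` location shadows the downloaded code.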
If I install the libraries normally it does work as expected. We were hoping to use the pex files as it significantly reduces the size of our containers and makes our deployment process easier.
SeungTaeKim
02/17/2023, 8:52 AM
@task(image=image1)
def task_a():
    import some_package_only_in_1
    return some_package_only_in_1()

@task(image=image2)
def task_n():
    import some_package_only_in_2
    return some_package_only_in_2()
The modules some_package_only_in_1 and some_package_only_in_2 have dependency conflicts, so I am trying to build a separate container for each task.
For example, I will run the whole workflow with a container named workflow_image, which contains neither some_package_only_in_1 nor some_package_only_in_2.
Does serializing the source code raise a runtime error because both modules are missing?
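As plain Python behaviour, an import placed inside a function body is resolved only when the function is called, not when the module that defines it is loaded. The sketch below shows this without flytekit. Whether serialization ever runs a task body is not confirmed here, so treat this as an illustration of deferred imports only. `call_fails_with_missing_module` is a hypothetical helper.

```python
import sys


def task_a():
    # Resolved only when the function body runs, not at definition time.
    import some_package_only_in_1  # hypothetical package name from the question
    return some_package_only_in_1


def call_fails_with_missing_module(fn) -> bool:
    """Return True if calling `fn` raises ModuleNotFoundError."""
    try:
        fn()
    except ModuleNotFoundError:
        return True
    return False


# Defining task_a did not import the package.
print("some_package_only_in_1" in sys.modules)
# Calling it in an environment without the package is what fails.
print(call_fails_with_missing_module(task_a))
```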
Thank you!
Derek Yu
02/17/2023, 2:06 PM
task to succeed ✅ but never run/schedule a pod ❌? Or know where to start troubleshooting? 🕵️‍♂️
-- More details --
The task in question has caching disabled, and is a map task.
Have searched all flyte component logs for the executions and the only errors I see are these warning ⚠️ messages
Failed to fetch override values when assigning task resource default values
Failed to fetch override values when assigning execution queue
Trying to disable the overrides didn't help either. And the task repeatedly succeeds without ever running the task.
cc: @Heidi Hurst
Alexey Kharlamov
02/17/2023, 3:34 PM
Volker Lorrmann
02/17/2023, 4:07 PM
parallel_requests is my own library, which uses aiohttp and asyncio to run requests in parallel.
from parallel_requests import parallel_requests
from flytekit import task, workflow

@task
def download(urls: list) -> list:
    # return requests.get(url, headers={"user-agent": "my-user-agent"}).json()
    return parallel_requests(
        urls=urls
    )

@workflow
def run():
    urls = ["https://query2.finance.yahoo.com/v7/finance/quote?symbols=AAPL"] * 10
    res = download(urls=urls)
    print(res[0])

if __name__ == "__main__":
    run()
Running the workflow with pyflyte run test_flyte_with_asyncio.py run gives me the following error:
Traceback (most recent call last):
  File "/root/mambaforge/envs/flyte/lib/python3.10/site-packages/flytekit/core/type_engine.py", line 914, in get_literal_type
    sub_type = TypeEngine.to_literal_type(self.get_sub_type(t))
  File "/root/mambaforge/envs/flyte/lib/python3.10/site-packages/flytekit/core/type_engine.py", line 907, in get_sub_type
    raise ValueError("Only generic univariate typing.List[T] type is supported.")
ValueError: Only generic univariate typing.List[T] type is supported.
What am I doing wrong here?
Thanks
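The error text says only typing.List[T] with a single element type is supported, and the task above is annotated with bare list, which carries no element type. The sketch below uses a hypothetical `element_type` helper and the standard typing module. It is a stand-in that shows the difference between the two annotations, not flytekit's own check.

```python
from typing import List, get_args


def element_type(annotation):
    """Return the single element type of a list annotation, or None if it has none."""
    args = get_args(annotation)
    return args[0] if len(args) == 1 else None


# A bare list has no element type for a type engine to inspect.
print(element_type(list))
# A parameterised List[str] exposes exactly one element type.
print(element_type(List[str]))
```

Under that reading, annotating the input as `urls: List[str]` would give the type engine an element type. The right element type for the return value depends on what parallel_requests returns, which the post does not show.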