Xin Shi
03/17/2023, 7:41 PM
Anindya Saha
03/17/2023, 10:24 PM
E0317 22:24:01.121900 71 pod_workers.go:951] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"f154ee96e6c1a4fed852-n0-0\" with ImagePullBackOff: \"Back-off pulling image \\\"cr.flyte.org/flyteorg/flytekit:py3.10-1.2.11\\\"\"" pod="flytesnacks-development/f154ee96e6c1a4fed852-n0-0" podUID=5952591c-fde8-47bd-a9c4-2885a01915a7
How do I resolve this?
Ryo M
03/18/2023, 1:19 PM
Harmen van Rossum
03/19/2023, 6:23 PM
.with_overrides(container_image='image:version') and using a decorator for the task, but both approaches don’t seem to do anything (just the default image is used). What’s the appropriate way to do this? Thanks!
Ryo M
03/20/2023, 5:30 AM
OOMKilled error, and I found many similar questions in this channel. According to those discussions, the cause is a resource shortage, so I set @task(limits=Resources(storage="20Gi", ephemeral_storage="20Gi", mem="40Gi", cpu="6")), which should be enough for the failing task. But I still get the same error. I do not use a GPU. Are there any other possible causes?
[1/1] currentAttempt done. Last Error: USER::Pod failed. No message received from kubernetes.
[f52215af773e248b0af3-n2-0] terminated with exit code (1). Reason [OOMKilled]. Message:
asctime": "2023-03-20 05:09:46,909", "name": "flytekit", "levelname": "WARNING", "message": "FlyteSchema is deprecated, use Structured Dataset instead."}
tar: Removing leading `/' from member names
{"asctime": "2023-03-20 05:09:48,309", "name": "flytekit", "levelname": "WARNING", "message": "FlyteSchema is deprecated, use Structured Dataset instead."}
...
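An OOMKilled reason generally means the container used more memory than its limit allows, so it can help to check how much memory the task body really needs. A minimal sketch, assuming the task logic can be run locally as a plain Python function: the standard-library tracemalloc module reports the peak of Python-level allocations. The helper name peak_memory_mib and the stand-in function build_rows are hypothetical, and tracemalloc does not see memory allocated outside the Python allocator, so the number is best read as a lower bound.

```python
import tracemalloc


def peak_memory_mib(fn, *args, **kwargs):
    """Run fn and return (result, peak traced memory in MiB)."""
    tracemalloc.start()
    try:
        result = fn(*args, **kwargs)
        # get_traced_memory() returns (current, peak) in bytes
        _, peak_bytes = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak_bytes / (1024 * 1024)


def build_rows(n):
    # Hypothetical stand-in for the real task body: allocates n blocks of 1 KiB.
    return [bytes(1024) for _ in range(n)]


rows, peak_mib = peak_memory_mib(build_rows, 2000)
print(f"rows={len(rows)} peak={peak_mib:.2f} MiB")
```

If the measured peak is close to or above the configured memory limit, the limit itself is the likely culprit. If it is far below, it may be worth checking whether the limit requested in the decorator is the one actually applied to the pod.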
Nan Qin
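03/20/2023, 3:23 PM
import torch
from flytekit import task, workflow
from flytekit.types.file import FlyteFile

@task
def t1() -> FlyteFile:
    # do something and save checkpoint to path/to/file
    return FlyteFile("path/to/file")

@task
def t2(input: FlyteFile) -> float:
    checkpoint = torch.load(input)
    # do something
    return 1.0

@workflow
def wf() -> float:
    t1_output = t1()
    # pass task inputs by keyword inside a workflow
    return t2(input=t1_output)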
When executed, t1's output is stored in S3. In t2 I get the following error:
[Errno 2] No such file or directory: '/tmp/flytemlfm4mk1/local_flytekit/0569821759669848f705b8863309a4bb/checkpoint-epoch0001.pth'
It seems Flyte downloaded the file from S3 to a local temp folder, but I'm not sure why the file does not exist. Any ideas?
George Horrell
03/20/2023, 11:01 PM
pyflyte vs flytectl question: is it possible to package and register from a flyte-package.tgz as two separate steps, all within pyflyte? Or is it necessary to pass the flyte-package.tgz file to flytectl after packaging in pyflyte?
seunggs
03/21/2023, 2:10 AM
Ben Rosand
03/21/2023, 11:45 AM
Eduardo Matus
03/21/2023, 2:06 PM
workflow_notifications:
  enabled: true
  config:
    notifications:
      type: aws
      region: "{{ .Values.userSettings.accountRegion }}"
      publisher:
        topicName: "arn:aws:sns:{{ .Values.userSettings.accountRegion }}:{{ .Values.userSettings.accountNumber }}:flyte_notification"
      processor:
        queueName: flyte_notification
        accountId: "{{ .Values.userSettings.accountNumber }}"
      emailer:
        subject: "Notice: Execution \"{{ name }}\" has {{ phase }} in \"{{ domain }}\"."
        #/{{ `{{` }} domain {{ `}}` }}/{{ `{{` }} launch_plan.name {{ `}}` }} has '{{ `{{` }} phase {{ `}}` }}'"
        sender: "flyte-notification@mydomain.com"
Aditya Sharma
03/21/2023, 2:40 PM
Martin Tschechne
03/21/2023, 3:41 PM
None values in conditionals? I know about is_true(), is_false() and is_(), but checking for None seems not to be supported yet? Any help appreciated 🙏 Thanks!
Ketan (kumare3)
Visak
03/21/2023, 11:01 PM
Jay Ganbat
03/21/2023, 11:10 PM
create_node to create dependencies or use chaining Flyte entity
seunggs
03/22/2023, 3:47 AM
Error: rpc error: code = Unimplemented desc = unexpected HTTP status code received from server: 404 (Not Found); transport: received unexpected content-type "text/plain; charset=utf-8"
Maybe this error is unrelated? Not sure.
seunggs
03/22/2023, 3:48 AM
mykyta luzan
03/22/2023, 10:07 AM
secrets/common/flyte but it’s not clear to me what the SECRET_GROUP is here. Is it common/flyte?
Ferdinand von den Eichen
03/22/2023, 2:16 PM
Broder Peters
03/22/2023, 2:59 PM
Visak
03/22/2023, 5:11 PM
array size > max allowed. requested [13947]. allowed [5000].
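The numbers in this error imply that the input list would have to be split into at least three pieces to stay under the cap. A minimal sketch of that batching in plain Python, assuming the work can be divided into independent batches (for example, one map task per batch). The helper name chunk is hypothetical, and the limit of 5000 is taken from the error message rather than from any documented default.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

MAX_ARRAY_SIZE = 5000  # "allowed" value from the error message


def chunk(items: Sequence[T], size: int = MAX_ARRAY_SIZE) -> List[List[T]]:
    """Split items into consecutive batches of at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


inputs = list(range(13947))  # "requested" value from the error message
batches = chunk(inputs)
print([len(b) for b in batches])  # [5000, 5000, 3947]
```

Each batch could then be handed to its own map task. Whether that fits depends on the workflow, and the platform-side limit may also be adjustable by whoever administers the cluster.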
How can I address this? Is there a way I can increase this size or is there a workaround to the size limit? Appreciate any help!
Tim Sheiner
03/22/2023, 8:49 PM
[1/1] currentAttempt done. Last Error: USER::Pod failed. No message received from kubernetes.
[ass5zvhclkmtr9hd55ks-n0-0] terminated with exit code (1). Reason [Error]. Message:
trap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/root/workflows/ex_analytics.py", line 4, in <module>
    import pycountry
ModuleNotFoundError: No module named 'pycountry'
Traceback (most recent call last):
  File "/usr/local/bin/pyflyte-fast-execute", line 8, in <module>
    sys.exit(fast_execute_task_cmd())
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/flytekit/bin/entrypoint.py", line 513, in fast_execute_task_cmd
    subprocess.run(cmd, check=True)
  File "/usr/local/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['pyflyte-execute', '--inputs', 's3://my-s3-bucket/metadata/propeller/flytesnacks-development-ass5zvhclkmtr9hd55ks/n0/data/inputs.pb', '--output-prefix', 's3://my-s3-bucket/metadata/propeller/flytesnacks-development-ass5zvhclkmtr9hd55ks/n0/data/0', '--raw-output-data-prefix', 's3://my-s3-bucket/data/7x/ass5zvhclkmtr9hd55ks-n0-0', '--checkpoint-path', 's3://my-s3-bucket/data/7x/ass5zvhclkmtr9hd55ks-n0-0/_flytecheckpoints', '--prev-checkpoint', '""', '--dynamic-addl-distro', 's3://my-s3-bucket/flytesnacks/development/TPFR2TGNNPMEAIZGLYXBVJ46S4======/scriptmode.tar.gz', '--dynamic-dest-dir', '/root', '--resolver', 'flytekit.core.python_auto_container.default_task_resolver', '--', 'task-module', 'workflows.ex_analytics', 'task-name', 'clean_data']' returned non-zero exit status 1.
.
Does anybody understand this?
Eduardo Matus
03/23/2023, 12:13 AM
Init Containers:
  flytescheduler-check:
    Container ID: docker://a8fd892170ade6a0395de645ec513dd810c8d6af23aee1b6cefda29fd3c09aac
    Image: cr.flyte.org/flyteorg/flytescheduler-release:v1.2.0-b1
    Image ID: docker-pullable://cr.flyte.org/flyteorg/flytescheduler-release@sha256:324a09a2a7ccd3dd50f334ac83ef183f83af2840c87807c562e506055a48e0ba
    Port: <none>
    Host Port: <none>
    Command:
      flytescheduler
      precheck
      --config
      /etc/flyte/config/*.yaml
    State: Waiting
      Reason: CrashLoopBackOff
    Last State: Terminated
      Reason: Error
      Exit Code: 2
      Started: Wed, 22 Mar 2023 21:04:50 -0300
      Finished: Wed, 22 Mar 2023 21:04:54 -0300
    Ready: False
    Restart Count: 7
    Environment: <none>
    Mounts:
      /etc/db from db-pass (rw)
      /etc/flyte/config from config-volume (rw)
      /etc/secrets/ from auth (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-wxvbg (ro)
Can anyone give me some directions?
seunggs
03/23/2023, 2:07 AM
pending state. Is this expected behavior?
seunggs
03/23/2023, 2:08 AM
Leiqing
03/23/2023, 8:36 AM
flytekit to list projects, but it only gives me active projects, whereas with flytectl I’m getting all projects by default and can filter for active projects with --filter.fieldSelector "state=1"
from flytekit.clients.friendly import SynchronousFlyteClient
from flytekit.configuration import Config
from flytekit.models.filters import Filter
client = SynchronousFlyteClient(Config.auto().platform)
print(client.list_projects_paginated())
How can I get all projects with flytekit?
Visak
03/23/2023, 3:34 PM
Sabrina Lui
03/23/2023, 6:22 PM
map_task task with a min_success_ratio < 1.0 as input to another task. However, as the failed outputs are not filtered out of the results, flytekit is unable to convert them to inputs properly, resulting in errors like this and this. Since this happens in type engine code between tasks, even an intermediate task to filter out None values would fail. This blocks us from using map_task effectively in our workflows.
Here is a minimal repro example: https://gist.github.com/sabrinalui/e5478b9557dcf370c84d5e57758b4c87
It seems min_success_ratio doesn't work locally, but I attached the output when running with inputs [true,true,false] from our endpoint below.
What would the lift be to (optionally) filter out failed tasks from the map_task output?
Frank Shen
03/23/2023, 10:32 PM
Rahul Mehta
03/23/2023, 10:50 PM