Masa Nakamura
01/24/2023, 8:22 PM
Error: Get "https://127.0.0.1:30086/api/v1/nodes": dial tcp 127.0.0.1:30086: connect: connection refused
{"json":{},"level":"error","msg":"Get \"https://127.0.0.1:30086/api/v1/nodes\": dial tcp 127.0.0.1:30086: connect: connection refused","ts":"2023-01-24T12:21:32-08:00"}
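A "connection refused" error means nothing accepted the TCP connection on that address. A minimal sketch for checking this from Python, assuming the sandbox is expected to expose the Kubernetes API on 127.0.0.1:30086 as in the error above. `port_is_open` is a hypothetical helper name, not part of flytectl:

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Host and port are taken from the error message above.
    print(port_is_open("127.0.0.1", 30086))
```

If this prints `False`, the sandbox container is probably not running or not publishing that port, which would be worth checking before looking at flytectl itself.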
Am I missing something?
• Windows WSL2 with native filesystem.
• flytekit 1.3.1
• flytectl 0.6.26
Rahul Mehta
01/24/2023, 10:12 PM
Pontus Wistbacka
01/25/2023, 9:02 AM
Björn
01/25/2023, 7:45 PM
Evan Sadler
01/25/2023, 9:34 PM
Masa Nakamura
01/25/2023, 11:44 PM
`verstack` so I included it in requirements.txt. Here is an outline of what I did:
$ pyflyte init my_first_wf
$ ./bin/flytectl sandbox start --source .
Then I wrote some code under the workflows directory. I tested it locally with the python command and it worked without any problem, because verstack is installed in the venv.
Next I built an image using the bundled docker_build.sh like this:
$ sh docker_build.sh -r localhost:30000
$ pyflyte --pkgs workflows.my_workflow package --image localhost:30000/my_first_wf:fbc629e54b46c12d6396d39ea9d3f37307ba961d
At this point, when I launch a container from localhost:30000/my_first_wf:fbc629e54b46c12d6396d39ea9d3f37307ba961d, the Python environment recognizes verstack.
$ docker run -it localhost:30000/my_first_wf:fbc629e54b46c12d6396d39ea9d3f37307ba961d sh
# python
>>> import verstack
(no problem)
However, when I try to run my workflow inside the demo cluster, it fails because Python can't find the verstack library.
$ pyflyte run --remote workflows/my_workflow.py train --filename '$VVIX.csv' --test_size 0.25 --output_path model.model
The import statement fails because verstack can't be found. Part of the log is below.
Pod failed. No message received from kubernetes.
[fd4d75a833c134b5f9e1-n0-0] terminated with exit code (1). Reason [Error]. Message:
l>", line 850, in exec_module
File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
File "/root/workflows/my_workflow.py", line 11, in <module>
from verstack import LGBMTuner
ModuleNotFoundError: No module named 'verstack'
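One way to check which environment the pod actually runs is to log the interpreter path and whether the module resolves before the failing import. A minimal sketch using only the standard library. `describe_module` is a hypothetical helper, and calling it near the top of workflows/my_workflow.py is an assumption about where it would be useful:

```python
import importlib.util
import sys


def describe_module(name: str) -> str:
    """Report where a top-level module would be imported from, or that it is missing."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return f"{name}: NOT FOUND (interpreter: {sys.executable})"
    return f"{name}: {spec.origin} (interpreter: {sys.executable})"


if __name__ == "__main__":
    # "verstack" is the module named in the traceback above.
    print(describe_module("verstack"))
```

Comparing this output between the local `docker run` session and the pod log would show whether both use the same interpreter and site-packages.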
Am I actually launching the pod from the image I built, or am I launching a different version of it?
Seung-Woo Lee
01/26/2023, 6:35 AM
failed at Node[n0]. RuntimeExecutionError: failed during plugin execution, caused by: failed to execute handle for plugin [sidecar]: [BadTaskSpecification] invalid TaskSpecification [<nil>], Err: [nil Struct Object passed]
How can I handle it?
Mina Fahimi
01/26/2023, 6:37 AM
pyflyte -c flyte_cluster_config.yml run --remote .\workflows\example.py wf --name minaexample
but I get the following error
status = StatusCode.UNIMPLEMENTED
details = ""
debug_error_string = "UNKNOWN:Error received from peer ipv4:10.32.16.101:80 {created_time:"2023-01-26T06:45:08.745870339+00:00", grpc_status:12, grpc_message:""}"
Any ideas what the problem could be?
Eli Bixby
01/26/2023, 11:18 AM
`FlyteDirectory` inputs to a `ContainerTask`? It seems like the download step doesn't correctly handle directories?
With `inputs=dict(path=FlyteDirectory)` we're getting
Failed to download from ref [gs://cradle-bio-pipelines/i9/f7aac4450113a425b905-n0-0/path]
where that path does exist.
Seth Baer
01/26/2023, 3:45 PM
Byron Hsu
01/26/2023, 5:38 PM
Choenden Kyirong
01/26/2023, 6:14 PM
`generate_normal_df`, is getting stuck on `running` and has been `queued` for a while now. This was done using the local demo Flyte cluster.
Running it locally (`pyflyte run example.py wf --n 500 --mean 42 --sigma 2`), the workflow executes fine. However, running it on the Flyte cluster (`pyflyte run --remote example.py wf --n 500 --mean 42 --sigma 2`) doesn’t seem to work properly on my end. Any ideas?
Adedeji Ayinde
01/26/2023, 6:42 PM
TypeError: Failed to convert return value for var o0 for function ml_project_1.ohe.estimator with error <class 'pyarrow.lib.ArrowInvalid'>: ('Could not convert SparseVector(5, {2: 1.0}) with type SparseVector: did not recognize Python value type when inferring an Arrow data type', 'Conversion failed for column encoded_col_feat1 with type object')
Masa Nakamura
01/26/2023, 6:46 PM
while getopts a:r:v:h flag
should be
while getopts p:r:v:h flag
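The optstring lists the accepted single-letter flags, and a trailing colon means the flag takes an argument. Python's standard `getopt` module uses the same optstring syntax, so the difference between the two strings can be sketched as follows, assuming the script is invoked with a `-p` flag (the value `my_project` is made up for illustration):

```python
import getopt

# Hypothetical invocation: -p is the flag missing from the original optstring.
args = ["-p", "my_project", "-r", "localhost:30000"]


def parse(optstring: str):
    """Parse `args` with the given optstring; return None if a flag is rejected."""
    try:
        opts, _rest = getopt.getopt(args, optstring)
        return dict(opts)
    except getopt.GetoptError:
        return None


print(parse("a:r:v:h"))  # -p is not declared, so parsing fails
print(parse("p:r:v:h"))  # -p and -r are both accepted with their values
```

The shell builtin reports an unknown option instead of raising an exception, but the optstring rule is the same: a flag that is not listed is never handed to the script's option handling.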
I made some modifications, so I will send a PR later.
Sabrina Lui
01/26/2023, 11:29 PM
`flytectl` question: Is it possible to use `flytectl create execution` to execute a workflow from an execution spec YAML (not relaunch or recover an existing run)? The docs only describe retrieving the execution spec of a task using `get tasks --execFile`, which isn't a supported option for `get workflow`.
I tried using `get workflow-execution-config --attrFile`, but this returns more of a workflow definition than an execution spec.
Sam Eckert
01/27/2023, 1:19 AM
`flyte_library` (taking cues from @Rahul Mehta’s talk!). On run, that rule
1. Creates a py3_image with the workflow file, as well as pulling in the `aws`, `flytectl`, and `pyflyte-*` CLIs. I wanted to keep things hermetic within the bazel env, so I didn't create a base image with `awscli`/`pyflyte` pre-installed.
2. We add the `FLYTE_INTERNAL_IMAGE` tag, and then push this image to ECR. I'm still not 100% sure what `FLYTE_INTERNAL_IMAGE` does, but followed the examples I could find.
3. We then have a genrule which runs `docker run` using the image we just created, and calls a custom register script which wraps `pyflyte register` to register the workflow, and uses `flytectl` to enable/optionally execute any launchplans registered alongside the workflow.
Registration works correctly as far as I can tell. The objects are created and viewable in the Console, but all tasks fail with:
[1/1] currentAttempt done. Last Error: UNKNOWN::Outputs not generated by task execution
I can see the pod starting, pulling the correct image, and the `pyflyte-fast-execute` command exiting successfully via `kubectl`. No logs are created before the script exits, so I'm having a bit of trouble identifying the issue. Weirder still, the exact same `pyflyte-fast-execute` command runs fine if I run it in a docker container locally.
Rahul Mehta
01/27/2023, 2:05 AM
{
  "json": {
    "exec_id": "run-name",
    "ns": "flytetester-development",
    "routine": "worker-11"
  },
  "level": "warning",
  "msg": "Workflow not found in cache.",
  "ts": "2023-01-26T17:37:00Z"
}
Are there any other locations to check, or can I make the logging more verbose to see the cache keys that are being used for each execution?
Ketan (kumare3)
Saravanan Arumugam
01/27/2023, 9:08 AM
Andrew Korzhuev
01/27/2023, 10:08 AM
Fabio Grätz
01/28/2023, 8:37 AM
Sam Eckert
01/29/2023, 12:16 AM
`-d staging` are still being registered to the `development` domain. I feel like I'm missing something basic. This workflow has been running in `development`. I'm thinking that maybe workflows/launchplans can't be re-registered in a new domain?
CMD: /usr/bin/pyflyte register -p flytetester -d staging -i <image> atomwise/examples/flyte/
Some output:
Running pyflyte register from /app/atomwise/examples/flyte/canonicalize_smiles_py_image.binary.runfiles/__main__ with images ImageConfig(default_image=Image(name='default', fqn='<account>/flyte_canonicalize_smiles', tag='<tag>'), images=[Image(name='default', fqn='<account>/flyte_canonicalize_smiles', tag='<tag>')]) and image destination folder staging on 1 package(s) ('/app/atomwise/examples/flyte/canonicalize_smiles_py_image.binary.runfiles/__main__/atomwise/exmples/flyte',)
Leiqing
01/29/2023, 1:01 PM
  File "/home/dev/conda_dev/devenv/Linux/envs/devenv-3.8-c/lib/python3.8/site-packages/flytekit/remote/remote.py", line 847, in execute
    return self.execute_remote_task_lp(
  File "/home/dev/conda_dev/devenv/Linux/envs/devenv-3.8-c/lib/python3.8/site-packages/flytekit/remote/remote.py", line 924, in execute_remote_task_lp
    return self._execute(
  File "/home/dev/conda_dev/devenv/Linux/envs/devenv-3.8-c/lib/python3.8/site-packages/flytekit/remote/remote.py", line 715, in _execute
    type_hints[k] = TypeEngine.guess_python_type(input_flyte_type_map[k].type)
  File "/home/dev/conda_dev/devenv/Linux/envs/devenv-3.8-c/lib/python3.8/site-packages/flytekit/core/type_engine.py", line 856, in guess_python_type
    return transformer.guess_python_type(flyte_type)
  File "/home/dev/conda_dev/devenv/Linux/envs/devenv-3.8-c/lib/python3.8/site-packages/flytekit/core/type_engine.py", line 1125, in guess_python_type
    return typing.Union[tuple(TypeEngine.guess_python_type(v.type) for v in literal_type.union_type.variants)]
  File "/home/dev/conda_dev/devenv/Linux/envs/devenv-3.8-c/lib/python3.8/site-packages/flytekit/core/type_engine.py", line 1125, in <genexpr>
    return typing.Union[tuple(TypeEngine.guess_python_type(v.type) for v in literal_type.union_type.variants)]
AttributeError: 'LiteralType' object has no attribute 'type'
The fix seems to be changing `v.type` to just `v`.
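The traceback suggests that each entry in `union_type.variants` is already a `LiteralType`, so there is no `.type` attribute to unwrap. A minimal sketch of that shape using stand-in dataclasses. These are simplified, hypothetical classes for illustration, not flytekit's real models:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UnionType:
    # Each variant is itself a LiteralType, not a wrapper with a .type field.
    variants: List["LiteralType"] = field(default_factory=list)


@dataclass
class LiteralType:
    simple: Optional[str] = None
    union_type: Optional[UnionType] = None


def guess(lt: LiteralType) -> str:
    """Toy stand-in for guess_python_type: each variant is passed as-is."""
    if lt.union_type is not None:
        return "Union[" + ", ".join(guess(v) for v in lt.union_type.variants) + "]"
    return lt.simple or "unknown"


union = LiteralType(union_type=UnionType([LiteralType("int"), LiteralType("str")]))
print(guess(union))
```

In this toy model, writing `guess(v.type)` instead of `guess(v)` raises the same kind of `AttributeError` shown in the traceback.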
Eli Bixby
01/30/2023, 11:59 AM
`ContainerTask` with a toleration or affinity? It looks like I may have to define a custom `Task` that overrides the pod spec construction method right now?
Andrew Korzhuev
01/30/2023, 2:14 PM
`flyte-core` through Helm on EKS, when `workflow_notifications` are enabled it produces ill-formatted YAML.
Steps to reproduce, `test-values.yaml`:
userSettings:
  accountRegion: us-east-1
  accountNumber: 123123123
  notifications:
    topicName: topic-name
    queueName: queue-name
workflow_notifications:
  enabled: true
  config:
    notifications:
      type: "aws"
      region: "{{ .Values.userSettings.accountRegion }}"
      publisher:
        topicName: "arn:aws:sns:{{ .Values.userSettings.accountRegion }}:{{ .Values.userSettings.accountNumber }}:{{ .Values.userSettings.notifications.topicName }}"
      processor:
        queueName: "{{ .Values.userSettings.notifications.queueName }}"
        accountId: "{{ .Values.userSettings.accountNumber }}"
      emailer:
        subject: "Flyte: {{ project }}/{{ domain }}/{{ launch_plan.name }} has '{{ phase }}'"
        sender: "{{ .Values.userSettings.notifications.sender }}"
        body: |
          "Execution {{ workflow.project }}/{{ workflow.domain }}/{{ workflow.name }}/{{ name }} has {{ phase }}.
          Details: https://flyte.example.com/console/projects/{{ project }}/domains/{{ domain }}/executions/{{ name }}.
          {{ error }}"
Then run template:
helm template admin flyteorg/flyte-core --version v1.2.1 --values test-values.yaml | grep -A 4 "notifications.yaml"
Which outputs:
notifications.yaml: |
  notifications:
    type: aws
    region:us-east-1
    publisher:
Per the YAML spec, the colon after a mapping key must be followed by a space, so `region:us-east-1` breaks the service deployment and should be `region: us-east-1` instead. A simple workaround is to put an extra space inside the templated string: `region: " {{ .Values.userSettings.accountRegion }}"`.
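A quick way to catch this class of problem before deploying is to scan the rendered template for keys whose colon is not followed by whitespace. The following is a heuristic sketch, not a YAML parser. `lines_missing_space_after_colon` is a hypothetical helper, and the sample text is the rendered output quoted above:

```python
import re
from typing import List

# A line that starts with a key-like token, then a colon directly followed by
# a non-space character (URL schemes such as "https://" are skipped).
_MISSING_SPACE = re.compile(r"^\s*[A-Za-z_][\w.-]*:[^\s/]")


def lines_missing_space_after_colon(rendered: str) -> List[str]:
    """Return stripped lines that look like `key:value` without a space."""
    return [line.strip() for line in rendered.splitlines() if _MISSING_SPACE.match(line)]


rendered = """\
notifications.yaml: |
  notifications:
    type: aws
    region:us-east-1
    publisher:
"""

print(lines_missing_space_after_colon(rendered))
```

Piping the output of `helm template` through a check like this would flag the `region:us-east-1` line. A real YAML parser gives a stricter answer, at the cost of an extra dependency.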
Vinícius Sosnowski
01/30/2023, 3:47 PM
Victor Gustavo da Silva Oliveira
01/30/2023, 6:22 PM
Masa Nakamura
01/30/2023, 7:22 PM
`flytekit/clis/sdk_in_container/init.py`, there is a call to https://github.com/flyteorg/flytekit-python-template.git. Does anybody know if this repository is public or not? I guess it's public but hidden? I would like to make a PR for this repo.
Sam Eckert
01/30/2023, 11:21 PM
import logging

from flytekit import task, workflow

@task
def test(x: int) -> int:
    return x + 1

@task
def echo_integer(value: int) -> int:
    logging.error(f"I've got value {value}")
    # Calls another task from inside a task; this is the call in question.
    return test(x=value + 5) + 1

@workflow
def wf(
    value: int,
):
    return echo_integer(value=value)
When run locally, we see type errors because the `echo_integer` task now returns a promise instead of an int. Interestingly, it works fine when we run remotely, with the caveat that the called task is not run as an isolated container. I would have expected it to fail remotely as well. Any pointers about what I'm missing in my understanding?
Frank Shen
01/30/2023, 11:58 PM