Nicholas LoFaso
12/10/2021, 2:33 AM
[…] @dynamic task from a map_task and am seeing a strange error. Is there a limitation I should be aware of?
[1/1] currentAttempt done. Last Error: UNKNOWN::failed to read data from dataDir [<gs://mybucket/metadata/propeller/nickflyte-dev-fc4174c9bef1a4c2caa5/n0/data/0/0/outputs.pb>]. Error: path:<gs://mybucket/metadata/propeller/nickflyte-dev-fc4174c9bef1a4c2caa5/n0/data/0/0/outputs.pb>: not found
The attached code fails on our GCS cluster. Other map_tasks that don’t call a dynamic task work fine.
Sharon Gong
12/13/2021, 2:30 PM
[3/3] currentAttempt done. Last Error: SYSTEM::Traceback (most recent call last):
  File "/home/jovyan/.local/lib/python3.7/site-packages/flytekit/common/exceptions/scopes.py", line 165, in system_entry_point
    return wrapped(*args, **kwargs)
  File "/home/jovyan/.local/lib/python3.7/site-packages/flytekit/core/base_task.py", line 514, in dispatch_execute
    ) from e
Message:
Failed to convert return value for var o0 for function onemodel.models.subscribers._flyte.subscribers_task with error <class 'AttributeError'>: '_TypedJasperSchema' object has no attribute '_remote_path'
Ben Konz
12/14/2021, 10:06 PM
[…] BranchNode and WorkflowNode in the local executor for tests? (https://github.com/flyteorg/flytekit-java/blob/3cf3ee4ac27a3e1bad231be2fa9a30b8afa[…]/src/main/java/org/flyte/localengine/ExecutionNodeCompiler.java)
It's difficult to write tests for our sub-workflows without support for these in the local executor.
Zach Palchick
12/16/2021, 12:25 AM
Haytham Abuelfutuh
12/16/2021, 8:02 PM
Paul Dittamo
12/17/2021, 9:22 PM
Sören Brunk
12/20/2021, 5:19 PM
E1220 16:10:36.391892 1 workers.go:102] error syncing 'mandant1-development/f3359d6b5cd941830000': Workflow[] failed. ErrorRecordingError: failed to publish event, caused by: EventSinkError: Error sending event, caused by [rpc error: code = Unauthenticated desc = token parse error [JWT_VERIFICATION_FAILED] Could not retrieve id token from metadata, caused by: rpc error: code = Unauthenticated desc = Request unauthenticated with IDToken]
It leaves the system in a weird state because there's no hard failure. flytepropeller is running after all, so our monitoring does not trigger an alarm, but no workflow execution is happening. After a flytepropeller restart, it suddenly starts working again.
So after some digging I found out why this happens in our setup: flyte-secret-auth is populated with .Values.secrets.adminOauthClientCredentials.clientSecret during installation, which is set to the placeholder foobar in values.yaml. We set the secret value dynamically, though, with a helm hook during installation because we need to fetch the real client-secret from Keycloak. That happens only after flytepropeller is deployed, and it seems that flytepropeller does not reload the secret on changes. Since flyte-secret-auth is managed by helm, this happens again on every helm upgrade.
I see mainly two (non-exclusive) ways to improve this behavior:
• Remove the default clientSecret and only create flyte-secret-auth via helm if the value is actually set. Only mount flyte-secret-auth if external auth is enabled. That would cause flytepropeller to fail to start until flyte-secret-auth is created by other means.
• Trigger a flytepropeller reload when flyte-secret-auth changes.
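To make the second option concrete, here is a minimal Python sketch of the change-detection idea only: hash the secret contents that were loaded at startup and compare against the current contents. This is not how flytepropeller behaves today, the function names and the `client_secret` key are hypothetical, and how the reload itself would be triggered is left open.

```python
import hashlib
from typing import Mapping


def secret_checksum(secret_data: Mapping[str, bytes]) -> str:
    """Return a stable SHA-256 digest over the key/value pairs of a secret."""
    digest = hashlib.sha256()
    for key in sorted(secret_data):
        # Length-prefix each field so ("ab", "c") and ("a", "bc") hash differently.
        for field in (key.encode("utf-8"), secret_data[key]):
            digest.update(len(field).to_bytes(8, "big"))
            digest.update(field)
    return digest.hexdigest()


def needs_reload(loaded_checksum: str, current_secret: Mapping[str, bytes]) -> bool:
    """True when the secret differs from what was loaded at startup."""
    return secret_checksum(current_secret) != loaded_checksum


# Hypothetical contents: the values.yaml placeholder vs. the value set later by the hook.
placeholder = {"client_secret": b"foobar"}
real = {"client_secret": b"value-fetched-from-keycloak"}

loaded = secret_checksum(placeholder)
print(needs_reload(loaded, placeholder))  # False
print(needs_reload(loaded, real))         # True
```

Comparing digests rather than raw values means the check can be logged or exposed without leaking the secret itself.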
Any thoughts on this? Happy to contribute here, but I'd like to discuss the best way forward with you first.
Jake Neyer
12/20/2021, 6:48 PM
flytectl register files --project chariot-sdk-test --domain development --archive out.tar.gz --version v2
---------------------------------------------------------------------- --------- ------------------------------------------------------------
| NAME (4) | STATUS | ADDITIONAL INFO |
---------------------------------------------------------------------- --------- ------------------------------------------------------------
| /tmp/register024275035/0_pm.nb.new.ipynb_1.pb | Success | Successfully registered file |
---------------------------------------------------------------------- --------- ------------------------------------------------------------
| /tmp/register024275035/1_new.ipynb_1.pb | Success | Successfully registered file |
---------------------------------------------------------------------- --------- ------------------------------------------------------------
| /tmp/register024275035/2_workflows.new_workflow.nb_to_python_wf_2.pb | Failed | Error registering file due to rpc error: code = |
| | | Internal desc = failed to compile workflow for |
| | | [resource_type:WORKFLOW project:"chariot-sdk-test" |
| | | domain:"development" |
| | | name:"workflows.new_workflow.nb_to_python_wf" version:"v2" |
| | | ] with err entry not found |
---------------------------------------------------------------------- --------- ------------------------------------------------------------
| /tmp/register024275035/3_workflows.new_workflow.nb_to_python_wf_3.pb | Failed | Error registering file due to rpc error: code = |
| | | NotFound desc = entry not found |
---------------------------------------------------------------------- --------- ------------------------------------------------------------
Any ideas why that might happen?
Jake Neyer
12/21/2021, 3:01 AM
Endre Karlson
12/21/2021, 10:40 AM
+-------------------------------------+---------+-----------+
| SERVICE | STATUS | NAMESPACE |
+-------------------------------------+---------+-----------+
| flyte-contour-contour-certgen-xggqx | Pending | flyte |
+-------------------------------------+---------+-----------+
^ it's just stuck there
Zach Palchick
12/21/2021, 11:12 PM
[…] {{ .inputs.my__input__data_class.input_1 }})
Nicholas LoFaso
12/22/2021, 3:57 PM
Avshalom Manevich
12/23/2021, 8:34 AM
Avshalom Manevich
12/23/2021, 4:17 PM
Ketan (kumare3)
12/25/2021, 10:58 PM
Alessandro Liparoti
12/31/2021, 2:13 PM
[…] outputWriter.Put(ctx, ioutils.NewInMemoryOutputReader(outputs, nil)) to store an outputs structure to be visible in console. I do this in the Status method (in this interface) whenever the job is completed. However, it seems the protobuf is not written (thus not visible). I followed the same approach for a core plugin and it worked. Am I doing something wrong here in the way I am using the webapi plugin?
Ketan (kumare3)
01/04/2022, 4:44 PM
Sonja Ericsson
01/07/2022, 4:55 PM
Michael Cheng Jan Kao
01/08/2022, 5:21 AM
Nicholas LoFaso
01/10/2022, 3:43 PM
[…] @tasks?
I’m configuring the root logger, but it seems like that code is only running at registration time, not execution time? See thread for example.
Sandra Youssef
01/10/2022, 9:53 PM
Haytham Abuelfutuh
01/12/2022, 3:00 PM
Ketan (kumare3)
01/13/2022, 12:30 AM
Eugene Cha
01/13/2022, 6:48 AM
Julien Bisconti
01/17/2022, 2:12 PM
[…] 1 - (All workflows transition time / all workflows duration)
We use this PromQL query:
1 -
( flyte:propeller:all:workflow:completion_latency_unlabeled_ms_sum
+ flyte:propeller:all:node:transition_latency_unlabeled_ms_sum
+ flyte:propeller:all:node:queueing_latency_unlabeled_ms_sum
+ flyte:propeller:all:workflow:acceptance_latency_unlabeled_ms_sum)
/ (flyte:propeller:all:workflow:failure_duration_unlabeled_ms_sum + flyte:propeller:all:workflow:success_duration_unlabeled_ms_sum)
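As a plain-arithmetic illustration of what the query above evaluates, here is a small Python sketch. The metric names are shortened from the query; every number is made up for the example and is not a measurement. It does not settle whether these are the right metrics to add together.

```python
def non_overhead_fraction(latency_sums_ms, duration_sums_ms):
    """Compute 1 - (sum of latency sums) / (sum of duration sums)."""
    total_latency = sum(latency_sums_ms.values())
    total_duration = sum(duration_sums_ms.values())
    if total_duration == 0:
        raise ValueError("total workflow duration is zero")
    return 1 - total_latency / total_duration


# Hypothetical values in milliseconds, for illustration only.
latencies = {
    "workflow:completion_latency": 2000,
    "node:transition_latency": 1500,
    "node:queueing_latency": 1000,
    "workflow:acceptance_latency": 500,
}
durations = {
    "workflow:failure_duration": 2000,
    "workflow:success_duration": 8000,
}

print(non_overhead_fraction(latencies, durations))  # 0.5
```

Note that the expression yields the share of total duration not attributed to the summed latencies; the overhead share itself is the ratio being subtracted from 1.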
The numbers we get seem a bit off (around 50%). Is the PromQL query correct? What do you recommend we use to measure that overhead time?
Quinn Romanek
01/19/2022, 5:31 PM
Muhammad Daniyal
01/23/2022, 11:10 AM
Muhammad Daniyal
01/23/2022, 11:12 AM
Chen Yuanyuan
01/24/2022, 2:57 AM
harsh patel
01/24/2022, 3:34 AM
harsh patel
01/24/2022, 3:34 AM
Ketan (kumare3)
01/24/2022, 3:51 AM