Nicholas LoFaso
12/10/2021, 2:33 AM
@dynamic task from a map_task and am seeing a strange error. Is there a limitation I should be aware of?
[1/1] currentAttempt done. Last Error: UNKNOWN::failed to read data from dataDir [<gs://mybucket/metadata/propeller/nickflyte-dev-fc4174c9bef1a4c2caa5/n0/data/0/0/outputs.pb>]. Error: path:<gs://mybucket/metadata/propeller/nickflyte-dev-fc4174c9bef1a4c2caa5/n0/data/0/0/outputs.pb>: not found
The attached code fails on our GCS cluster. Other map_tasks that don’t call a dynamic task work fine.
Sharon Gong
12/13/2021, 2:30 PM
[3/3] currentAttempt done. Last Error: SYSTEM::Traceback (most recent call last):
  File "/home/jovyan/.local/lib/python3.7/site-packages/flytekit/common/exceptions/scopes.py", line 165, in system_entry_point
    return wrapped(*args, **kwargs)
  File "/home/jovyan/.local/lib/python3.7/site-packages/flytekit/core/base_task.py", line 514, in dispatch_execute
    ) from e
Message:
Failed to convert return value for var o0 for function onemodel.models.subscribers._flyte.subscribers_task with error <class 'AttributeError'>: '_TypedJasperSchema' object has no attribute '_remote_path'
Ben Konz
12/14/2021, 10:06 PM
BranchNode and WorkflowNode in the local executor for tests? (https://github.com/flyteorg/flytekit-java/blob/3cf3ee4ac27a3e1bad231be2fa9a30b8afa[…]/src/main/java/org/flyte/localengine/ExecutionNodeCompiler.java)
It's difficult to write tests for our sub-workflows without support for these in the local executor.
Zach Palchick
12/16/2021, 12:25 AM
Haytham Abuelfutuh
Paul Dittamo
12/17/2021, 9:22 PM
Sören Brunk
12/20/2021, 5:19 PM
E1220 16:10:36.391892 1 workers.go:102] error syncing 'mandant1-development/f3359d6b5cd941830000': Workflow[] failed. ErrorRecordingError: failed to publish event, caused by: EventSinkError: Error sending event, caused by [rpc error: code = Unauthenticated desc = token parse error [JWT_VERIFICATION_FAILED] Could not retrieve id token from metadata, caused by: rpc error: code = Unauthenticated desc = Request unauthenticated with IDToken]
It leaves the system in a weird state because there's no hard failure. flytepropeller is still running, so our monitoring does not trigger an alarm, but no workflow execution is happening. After a flytepropeller restart, it suddenly starts working again.
So after some digging I found out why this happens in our setup: flyte-secret-auth is populated with .Values.secrets.adminOauthClientCredentials.clientSecret during installation, which is set to the placeholder foobar in values.yaml. We set the secret value dynamically, though, with a helm hook during installation because we need to fetch the real client-secret from Keycloak. That happens only after flytepropeller is deployed, and it seems that flytepropeller does not reload the secret on changes. Since flyte-secret-auth is managed by helm, this happens again on every helm upgrade.
I see mainly two (non-exclusive) ways to improve this behavior:
• Remove the default clientSecret and only create flyte-secret-auth via helm if the value is actually set. Only mount flyte-secret-auth if external auth is enabled. That would cause flytepropeller to fail to start until flyte-secret-auth is created by other means.
• Trigger a flytepropeller reload when flyte-secret-auth changes.
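To make the second option a bit more concrete, here is a minimal sketch of the general idea: detect that a mounted secret file has changed by comparing content hashes, then react to it. This is plain Python for illustration only and is not how flytepropeller handles secrets; the SecretWatcher class and the file path in the usage note are hypothetical.

```python
import hashlib
from pathlib import Path


class SecretWatcher:
    """Hypothetical helper: detects changes to a secret file via content hashes."""

    def __init__(self, path):
        self.path = Path(path)
        # Remember the digest seen at startup (None if the file does not exist yet).
        self._last_digest = self._digest()

    def _digest(self):
        # Hash the current file contents; a missing file is represented as None.
        if not self.path.exists():
            return None
        return hashlib.sha256(self.path.read_bytes()).hexdigest()

    def changed(self):
        """Return True exactly once for each change since the previous call."""
        current = self._digest()
        if current != self._last_digest:
            self._last_digest = current
            return True
        return False
```

A caller would poll watcher.changed() periodically (for example on a hypothetical mount path such as /etc/secrets/client_secret) and re-read the credentials or restart when it returns True.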
Any thoughts on this? Happy to contribute here but I'd like to discuss the best way forward with you first.
Jake Neyer
12/20/2021, 6:48 PM
flytectl register files --project chariot-sdk-test --domain development --archive out.tar.gz --version v2
---------------------------------------------------------------------- --------- ------------------------------------------------------------
| NAME (4) | STATUS | ADDITIONAL INFO |
---------------------------------------------------------------------- --------- ------------------------------------------------------------
| /tmp/register024275035/0_pm.nb.new.ipynb_1.pb | Success | Successfully registered file |
---------------------------------------------------------------------- --------- ------------------------------------------------------------
| /tmp/register024275035/1_new.ipynb_1.pb | Success | Successfully registered file |
---------------------------------------------------------------------- --------- ------------------------------------------------------------
| /tmp/register024275035/2_workflows.new_workflow.nb_to_python_wf_2.pb | Failed | Error registering file due to rpc error: code = |
| | | Internal desc = failed to compile workflow for |
| | | [resource_type:WORKFLOW project:"chariot-sdk-test" |
| | | domain:"development" |
| | | name:"workflows.new_workflow.nb_to_python_wf" version:"v2" |
| | | ] with err entry not found |
---------------------------------------------------------------------- --------- ------------------------------------------------------------
| /tmp/register024275035/3_workflows.new_workflow.nb_to_python_wf_3.pb | Failed | Error registering file due to rpc error: code = |
| | | NotFound desc = entry not found |
---------------------------------------------------------------------- --------- ------------------------------------------------------------
Any ideas why that might happen?
Jake Neyer
12/21/2021, 3:01 AM
Endre Karlson
12/21/2021, 10:40 AM
+-------------------------------------+---------+-----------+
| SERVICE | STATUS | NAMESPACE |
+-------------------------------------+---------+-----------+
| flyte-contour-contour-certgen-xggqx | Pending | flyte |
+-------------------------------------+---------+-----------+
^ it's just stuck there
Zach Palchick
12/21/2021, 11:12 PM
{{ .inputs.my__input__data_class.input_1 }})
Nicholas LoFaso
12/22/2021, 3:57 PM
Avshalom Manevich
12/23/2021, 8:34 AM
Avshalom Manevich
12/23/2021, 4:17 PM
Ketan (kumare3)
Alessandro Liparoti
12/31/2021, 2:13 PM
outputWriter.Put(ctx, ioutils.NewInMemoryOutputReader(outputs, nil)) to store an outputs structure to be visible in console. I do this in the Status method (in this interface) whenever the job is completed. However, it seems the protobuf is not written (and thus not visible). I followed the same approach for a core plugin and it worked. Am I doing something wrong here in the way I am using the webapi plugin?
Ketan (kumare3)
Sonja Ericsson
01/07/2022, 4:55 PM
Michael Cheng Jan Kao
01/08/2022, 5:21 AM
Nicholas LoFaso
01/10/2022, 3:43 PM
@tasks?
I’m configuring the root logger, but it seems like that code only runs at registration time, not at execution time? See thread for example.
Sandra Youssef
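For the logging question above, one thing that may be worth trying is to move the logger setup into the function body, so it runs every time the function is called rather than only when the module is loaded. The sketch below uses only the standard library; configure_logging and my_task are hypothetical names, and the flytekit task decorator is left out so the example runs on its own. Whether this resolves the behaviour described depends on how flytekit sets up its own loggers, which is not covered here.

```python
import logging


def configure_logging(level=logging.INFO):
    """Configure the root logger; safe to call more than once."""
    root = logging.getLogger()
    root.setLevel(level)
    # Only attach a handler if none exists, to avoid duplicate log lines.
    if not root.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        root.addHandler(handler)
    return root


def my_task(x: int) -> int:
    # Assumption: in a real project this function would carry the task decorator.
    # Logging is configured here, at call time, not at module import time.
    configure_logging(logging.DEBUG)
    logging.getLogger(__name__).debug("doubling %d", x)
    return x * 2
```

Calling my_task(3) configures the root logger at DEBUG level before the first log statement in the function is emitted.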
Haytham Abuelfutuh
Ketan (kumare3)
Eugene Cha
01/13/2022, 6:48 AM
Julien Bisconti
01/17/2022, 2:12 PM
1 - (All workflows transition time / all workflows duration)
We use this PromQL query:
1 -
( flyte:propeller:all:workflow:completion_latency_unlabeled_ms_sum
+ flyte:propeller:all:node:transition_latency_unlabeled_ms_sum
+ flyte:propeller:all:node:queueing_latency_unlabeled_ms_sum
+ flyte:propeller:all:workflow:acceptance_latency_unlabeled_ms_sum)
/ (flyte:propeller:all:workflow:failure_duration_unlabeled_ms_sum + flyte:propeller:all:workflow:success_duration_unlabeled_ms_sum)
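As a sanity check on the arithmetic, the expression above can be evaluated by hand for a set of sample values. In PromQL, as in Python, division binds tighter than subtraction, so the query computes 1 minus the ratio of the summed latencies to the summed durations. The function name and the numbers below are made up purely for illustration.

```python
def overhead_expression(completion_ms, transition_ms, queueing_ms, acceptance_ms,
                        failure_duration_ms, success_duration_ms):
    """Evaluate 1 - (sum of latency sums) / (sum of duration sums), as the query is written."""
    latency = completion_ms + transition_ms + queueing_ms + acceptance_ms
    duration = failure_duration_ms + success_duration_ms
    if duration == 0:
        raise ValueError("total duration is zero")
    # Same operator precedence as the query: the division happens before the subtraction.
    return 1 - latency / duration


# Hypothetical sample values: 500 ms of summed latencies over 1000 ms of summed durations.
result = overhead_expression(100, 200, 150, 50, 200, 800)
print(result)  # 0.5
```

Note that, read this way, the result is the share of total duration that is not attributed to the four latency terms; the share that is attributed to them is the ratio itself, without the leading "1 -".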
The numbers we get seem a bit off (around 50%). Is the PromQL query correct? What do you recommend we use to measure that overhead time?
Quinn Romanek
01/19/2022, 5:31 PM
Muhammad Daniyal
01/23/2022, 11:10 AM
Muhammad Daniyal
01/23/2022, 11:12 AM
Chen Yuanyuan
01/24/2022, 2:57 AM
harsh patel
01/24/2022, 3:34 AM