GitHub
11/01/2023, 6:51 PM
sandbox-bundled was missing a healthcheck and was incorrectly marked as ready before the listening port was even open. This causes CI to fail occasionally when it runs operations that depend on a ready sandbox state.
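To illustrate what "ready" means in the description above, here is a minimal sketch of a readiness probe that only reports success once the TCP port accepts connections. It is not the healthcheck added by the PR; the function name `wait_for_port` and its parameters are hypothetical and chosen for illustration only.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper: it only checks that the port is open, which is the
    minimum condition the PR description says was not being verified.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

A CI step could call something like `wait_for_port("localhost", 30080)` before running tests; the port here is taken from the sandbox endpoint quoted elsewhere in this log and may differ in other setups.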
Check all the applicable boxes
☐ I updated the documentation accordingly.
☑︎ All new and existing tests passed.
☑︎ All commits are signed-off.
Screenshots
Note to reviewers
flyteorg/flyte
GitHub Actions: build-and-push-sandbox-bundled-image
✅ 19 other checks have passed
19/20 successful checks
GitHub
11/01/2023, 7:36 PM
<https://github.com/flyteorg/flyte/tree/master|master>
by jeevb
<https://github.com/flyteorg/flyte/commit/5b32a42746d666077df80b0c2266cf287e964e12|5b32a427>
- Tune sandbox readiness checks to ensure that sandbox is fully accessible when marked as ready (#4348)
flyteorg/flyte
GitHub
11/01/2023, 7:49 PM
<https://github.com/flyteorg/flytectl/tree/master|master>
by jeevb
<https://github.com/flyteorg/flytectl/commit/422a8e73e7c5e09b12d5318d7f9c123c706f9953|422a8e73>
- Misc cleanups to aesthetics (#441)
flyteorg/flytectl
GitHub
11/01/2023, 8:11 PM
GitHub
11/01/2023, 8:15 PM
<https://github.com/flyteorg/flytectl/tree/master|master>
by eapolinario
<https://github.com/flyteorg/flytectl/commit/d2979452fef41b419429a6c8f6e99f6999aa0c08|d2979452>
- Increase sleep seconds when waiting for sandbox (#442)
flyteorg/flytectl
GitHub
11/01/2023, 8:37 PM
GitHub
11/01/2023, 8:37 PM
<https://github.com/flyteorg/homebrew-tap/tree/main|main>
by flyte-bot
<https://github.com/flyteorg/homebrew-tap/commit/9a9a1c37dc567289141708ad8381e93060fd8b5d|9a9a1c37>
- Brew formula update for flytectl version v0.7.9
flyteorg/homebrew-tap
GitHub
11/01/2023, 11:47 PM
<https://github.com/flyteorg/flytectl/tree/master|master>
by eapolinario
<https://github.com/flyteorg/flytectl/commit/1350bfa10018ab4b14990ceefe88b1caf49aba9e|1350bfa1>
- #minor Updated Sandbox config, with automated data configuration (#440)
flyteorg/flytectl
GitHub
11/02/2023, 12:07 AM
GitHub
11/02/2023, 12:07 AM
<https://github.com/flyteorg/homebrew-tap/tree/main|main>
by flyte-bot
<https://github.com/flyteorg/homebrew-tap/commit/cdfe41b3891b5cb7f77f6e175e735620364c3129|cdfe41b3>
- Brew formula update for flytectl version v0.8.0
flyteorg/homebrew-tap
GitHub
11/02/2023, 6:58 AM
useOffloadedWorkflowClosure flag but unfortunately it didn’t help resolve the etcd errors that we’ve been seeing from FlytePropeller. We restarted both the FlyteAdmin and FlytePropeller deployments after updating the configs, but that did not stop the existing executions from continuously throwing the etcd errors. In the end we had to delete the affected executions manually, one by one, to stop them from being scheduled endlessly.
We found some related issues and fixes from 2-3 years ago that no longer seem to work, so we’re wondering what might have changed since:
Related Issues
• #363
• #511
Related Fixes
• flyteorg/flytepropeller#240
In particular, it seems like this if condition is no longer working as expected:
flyte/flytepropeller/pkg/controller/workflowstore/passthrough.go
Line 99 in </flyteorg/flyte/commit/c6476cc178e576a4b7853d5d2adcb54081848974|c6476cc>
Expected behavior
Tasks should never end up in a state where they are retried endlessly and should be terminated with an error if there is an issue updating their state.
Additional context to reproduce
No response
Screenshots
Here are some images to illustrate the problem that I’ve described:
1. Affected workflow - this was first executed at 2.17 AM (SGT/HKT) on the 7th of October and failed at around 2.34 AM; it has, however, been restarted endlessly ever since
GitHub
11/02/2023, 7:18 AM
GitHub
11/02/2023, 8:37 AM
GitHub
11/02/2023, 10:04 AM
GitHub
11/02/2023, 10:57 AM
GitHub
11/02/2023, 11:37 AM
GitHub
11/02/2023, 5:19 PM
_InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.INVALID_ARGUMENT
    details = "missing project"
    debug_error_string = "UNKNOWN:Error received from peer ipv4:127.0.0.1:30080 {created_time:"2023-11-02T05:53:05.142633-05:00", grpc_status:3, grpc_message:"missing project"}"
There's no reference pointing to the FlyteRemote configuration. A manual search leads to the FlyteRemote page, but even adding the config explicitly like this:
remote = FlyteRemote(config=Config.for_endpoint(endpoint="localhost:30088", config_file="$HOME/.flyte/config.yaml"), default_project="flytesnacks", default_domain="development")
leads to an error:
FileNotFoundError: [Errno 2] No such file or directory: '$HOME/.flyte/config.yaml'
I'm sure there's something really simple I'm missing, but it isn't explained anywhere in the docs.
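The FileNotFoundError above is consistent with Python treating `$HOME` as a literal directory name: shell variables inside a string are not expanded by Python's file APIs. A minimal standard-library sketch of resolving such a path first follows. The helper name `resolve_config_path` is hypothetical, and how the resolved path is then handed to flytekit is deliberately left out, since that depends on the flytekit version in use.

```python
import os


def resolve_config_path(path: str) -> str:
    """Expand environment variables ($HOME) and a leading "~" in a path.

    Hypothetical helper for illustration; it does not check that the file exists.
    """
    return os.path.expanduser(os.path.expandvars(path))


# The raw string is passed to the OS unchanged, so "$HOME" stays in the path.
raw_path = "$HOME/.flyte/config.yaml"
resolved_path = resolve_config_path(raw_path)
```

On a POSIX system with `HOME` set, `resolved_path` no longer contains the `$HOME` placeholder, and the `~/.flyte/config.yaml` spelling resolves to the same location.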
Are you sure this issue hasn't been raised already?
☑︎ Yes
Have you read the Code of Conduct?
☑︎ Yes
flyteorg/flyte
GitHub
11/02/2023, 6:00 PM
GitHub
11/02/2023, 6:07 PM
<https://github.com/flyteorg/flyte/tree/master|master>
by eapolinario
<https://github.com/flyteorg/flyte/commit/5301da09de12c26008b321ad768680e33d014d96|5301da09>
- Chore: Ensure Stalebot doesn't close issues we've not yet triaged. (#4352)
flyteorg/flyte
GitHub
11/02/2023, 6:13 PM
GitHub
11/02/2023, 7:03 PM
<https://github.com/flyteorg/flyte/tree/master|master>
by kumare3
<https://github.com/flyteorg/flyte/commit/a1d182b3690555b784764a59e1268f55b2a083b8|a1d182b3>
- Do not automatically close stale issues (#4353)
flyteorg/flyte
GitHub
11/02/2023, 7:09 PM
InitError: Failed to create the collection: Prompt dismissed..
KeyringLocked: Failed to unlock the collection!
Expected behavior
I'm not trying to use any authentication, so I would expect it to ignore any exceptions about the system keyring. Alternatively, we could add AuthType.NONE so that all keyvault-related code can be avoided.
Additional context to reproduce
On my system I can reproduce with:
from flytekit.configuration import Config
from flytekit.remote import FlyteRemote
config = Config.for_sandbox()
remote = FlyteRemote(config=config)
remote.client
This happens when SSHing from one Linux system to another. Both run a desktop setup, so they probably have the GNOME keyring installed.
I tried all the different auth types, but they all failed for one reason or another (this is largely expected because we do not have any authentication configured on our Flyte deployment).
Full stack trace:
stack_trace.txt
Screenshots
No response
Are you sure this issue hasn't been raised already?
☑︎ Yes
Have you read the Code of Conduct?
☑︎ Yes
flyteorg/flyte
GitHub
11/02/2023, 8:15 PM
flyte-binary chart to configure Azure storage accounts. The stow version bump is needed in order to use Workload Identity.
This PR does not include an implementation of RemoteURLInterface, which would be required before using signed URLs in Azure. A potential fix is being discussed in Slack and should not be difficult, but I'm not confident that support for signed URLs is required.
A storage key or workload identity needs to be configured for this to work end-to-end with task pods and Azure Blob Storage via Stow. A workload identity can be defined in the PodTemplate like this example, and the service can be deployed using a values.yaml similar to here. Should that configuration be documented in some other way?
flyteorg/flyte
GitHub Actions: build-and-push-sandbox-bundled-image
GitHub Actions: sandbox-bundled-functional-tests
✅ 13 other checks have passed
13/15 successful checks
GitHub
11/02/2023, 9:23 PM
GitHub
11/02/2023, 9:23 PM
flytectl config init is:
admin:
  # For GRPC endpoints you might want to use dns:///flyte.myexample.com
  endpoint: dns:///localhost:30081
  authType: Pkce
  insecure: true
logger:
  show-source: true
  level:
But the config produced by flytectl demo start is:
admin:
  # For GRPC endpoints you might want to use dns:///flyte.myexample.com
  endpoint: localhost:30080
  authType: Pkce
  insecure: true
console:
  endpoint: http://localhost:30080
logger:
  show-source: true
  level: 0
This may be confusing for new users, who would expect the two configs to be consistent.
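To make the mismatch concrete, the two configs quoted above can be compared key by key. The following is a small sketch, not part of flytectl; the helper names `flatten` and `diff_configs` are hypothetical, and the dictionaries simply transcribe the two YAML snippets as pasted (including the blank `level` value in the first one).

```python
MISSING = "<missing>"


def flatten(config: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into {"section.key": value} pairs."""
    flat = {}
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def diff_configs(left: dict, right: dict) -> dict:
    """Return {key: (left_value, right_value)} for keys that differ or are absent."""
    a, b = flatten(left), flatten(right)
    return {
        key: (a.get(key, MISSING), b.get(key, MISSING))
        for key in sorted(a.keys() | b.keys())
        if a.get(key, MISSING) != b.get(key, MISSING)
    }


# Transcribed from the `flytectl config init` output quoted above.
config_init = {
    "admin": {"endpoint": "dns:///localhost:30081", "authType": "Pkce", "insecure": True},
    "logger": {"show-source": True, "level": None},  # value is blank in the pasted output
}

# Transcribed from the `flytectl demo start` output quoted above.
demo_start = {
    "admin": {"endpoint": "localhost:30080", "authType": "Pkce", "insecure": True},
    "console": {"endpoint": "http://localhost:30080"},
    "logger": {"show-source": True, "level": 0},
}

differences = diff_configs(config_init, demo_start)
```

With the values as pasted, `differences` covers three keys: `admin.endpoint`, `console.endpoint`, and `logger.level`.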
Provide a possible output or UX example
The output of flytectl config init should be the same as config-sandbox.yaml
Are you sure this issue hasn't been raised already?
☑︎ Yes
Have you read the Code of Conduct?
☑︎ Yes
flyteorg/flyte
GitHub
11/02/2023, 9:23 PM
ClusterCreating state for tasks would be helpful for people to know what's going on underneath. It is also much easier to debug if an operator fails to create a new cluster. In addition, cluster creation time and current task running time can be measured.
Goal: What should the final outcome look like, ideally?
Users will know Flyte is creating a cluster when the node is started.
Describe alternatives you've considered
NA
Propose: Link/Inline OR Additional context
No response
Are you sure this issue hasn't been raised already?
☑︎ Yes
Have you read the Code of Conduct?
☑︎ Yes
flyteorg/flyte
GitHub
11/02/2023, 9:23 PM
_if in the condition must be a promise, and can't use a native Python type in the if statement.
@dynamic
def d1() -> bool:
    a = t1()
    return (
        conditional("train_estimator")
        .if_(a == True)
        .then(t2())
        .else_()
        .then(t2())
    )

@dynamic
def d1(a: bool) -> bool:
    return (
        conditional("train_estimator")
        .if_(a == True)  # <- fails because `a` isn't a promise
        .then(t2())
        .else_()
        .then(t2())
    )
Goal: What should the final outcome look like, ideally?
Both promises and native Python types should be supported in conditions.
Describe alternatives you've considered
Users must create a task that converts the value to a promise, and then use that promise in the condition.
@task
def convert_to_promise(a: bool) -> bool:
    return a
Propose: Link/Inline OR Additional context
https://flyte-org.slack.com/archives/CREL4QVAQ/p1666110935856779
Are you sure this issue hasn't been raised already?
☑︎ Yes
Have you read the Code of Conduct?
☑︎ Yes
flyteorg/flyte
GitHub
11/02/2023, 9:23 PM
--config CLI flag is not respected by flytectl config init. Instead, the command always writes to ~/.flyte/config.yaml. Having this global flag present and yet ignored is confusing.
(Relatedly, could flytectl suppress describing global flags that are meaningless to a specific subcommand in its help output?)
Provide a possible output or UX example
flytectl config init --config my_config.yaml creates a file my_config.yaml in the current directory.
This feature would make the flytectl config init command more generally useful in scenarios where users want to store more than one config and reference the appropriate one in subsequent commands. In addition, flytectl config init requires user input if the default path already exists, which could be problematic in CI workflows.
Are you sure this issue hasn't been raised already?
☑︎ Yes
Have you read the Code of Conduct?
☑︎ Yes
flyteorg/flyte
GitHub
11/02/2023, 9:23 PM
numpy were deprecated in v1.20.0 and finally removed in v1.24.0 (check Deprecations in https://github.com/numpy/numpy/releases/tag/v1.20.0).
We should remove the mentions of those deprecated aliases from the FlyteSchemaTransformer.
What if we do not do this?
Users who install numpy>=1.24.0 will see this error in flytekit:
/tmp/tmp.pgqfAvNZOC/venv/lib/python3.10/site-packages/flytekit/types/schema/types.py:324: FutureWarning: In the future `np.bool` will be defined as the corresponding NumPy scalar. (This may have returned Python scalars in past versions.
  _np.bool: SchemaType.SchemaColumn.SchemaColumnType.BOOLEAN, # type: ignore
Traceback (most recent call last):
  File "/tmp/tmp.pgqfAvNZOC/venv/bin/pyflyte", line 5, in <module>
    from flytekit.clis.sdk_in_container.pyflyte import main
  File "/tmp/tmp.pgqfAvNZOC/venv/lib/python3.10/site-packages/flytekit/__init__.py", line 195, in <module>
    from flytekit.types import directory, file, numpy, schema
  File "/tmp/tmp.pgqfAvNZOC/venv/lib/python3.10/site-packages/flytekit/types/schema/__init__.py", line 1, in <module>
    from .types import (
  File "/tmp/tmp.pgqfAvNZOC/venv/lib/python3.10/site-packages/flytekit/types/schema/types.py", line 314, in <module>
    class FlyteSchemaTransformer(TypeTransformer[FlyteSchema]):
  File "/tmp/tmp.pgqfAvNZOC/venv/lib/python3.10/site-packages/flytekit/types/schema/types.py", line 324, in FlyteSchemaTransformer
    _np.bool: SchemaType.SchemaColumn.SchemaColumnType.BOOLEAN, # type: ignore
  File "/tmp/tmp.pgqfAvNZOC/venv/lib/python3.10/site-packages/numpy/__init__.py", line 284, in __getattr__
    raise AttributeError("module {!r} has no attribute "
AttributeError: module 'numpy' has no attribute 'bool'. Did you mean: 'bool_'?
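The traceback above fails on a dictionary key that references `np.bool`. The removed aliases (`np.bool`, `np.int`, `np.float`, `np.str`) were plain aliases of the Python builtins, so one possible direction is to key the mapping on the builtins directly. The sketch below illustrates that idea only; `ColumnType`, `SUPPORTED_TYPES`, and `column_type_for` are hypothetical stand-ins and not the actual flytekit code or the fix that was applied.

```python
from enum import Enum


class ColumnType(Enum):
    """Hypothetical stand-in for SchemaType.SchemaColumn.SchemaColumnType."""

    BOOLEAN = "boolean"
    INTEGER = "integer"
    FLOAT = "float"
    STRING = "string"


# Keyed on builtins instead of the removed NumPy aliases, so building this
# mapping no longer touches attributes that NumPy 1.24 deleted.
SUPPORTED_TYPES = {
    bool: ColumnType.BOOLEAN,
    int: ColumnType.INTEGER,
    float: ColumnType.FLOAT,
    str: ColumnType.STRING,
}


def column_type_for(python_type: type) -> ColumnType:
    """Look up the column type for a Python type, raising TypeError if unsupported."""
    try:
        return SUPPORTED_TYPES[python_type]
    except KeyError:
        raise TypeError(f"unsupported column type: {python_type!r}") from None
```

NumPy scalar types such as `np.bool_` (the name suggested in the error message) still exist and could be added as separate keys where needed.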
Related component(s)
flytekit
Are you sure this issue hasn't been raised already?
☑︎ Yes
Have you read the Code of Conduct?
☑︎ Yes
flyteorg/flyte
GitHub
11/02/2023, 9:23 PM