Samhita Alla
Samhita Alla
Jimmy Du
02/18/2023, 12:57 AM
Max retries exceeded with url: /my-s3-bucket/flytesnacks/development... 'Connection to 10.110.44.3 timed out
I'm able to create a new project through flytectl and see it in the flyte console.
I've installed the flyte-binary helm chart with the following overrides:
configuration:
  database:
    port: 5432
    dbname: flyteadmin
    host: 10.99.211.252
  storage:
    metadataContainer: my-s3-bucket
    userDataContainer: my-s3-bucket
    provider: s3
    providerConfig:
      s3:
        disableSSL: true
        v2Signing: false
        endpoint: http://10.110.44.3:9000
        authType: accesskey
        accessKey: minio
        secretKey: miniostorage
plugins:
  # All k8s plugins default configuration
  k8s:
    inject-finalizer: true
    default-env-vars:
      - AWS_METADATA_SERVICE_TIMEOUT: 5
      - AWS_METADATA_SERVICE_NUM_ATTEMPTS: 20
      - FLYTE_AWS_ENDPOINT: "http://10.110.44.3:9000"
      - FLYTE_AWS_ACCESS_KEY_ID: minio
      - FLYTE_AWS_SECRET_ACCESS_KEY: miniostorage
I'm able to connect the minio client CLI to localhost:9000 after port-forwarding the minio service (from 9000->9000). I'm also able to successfully curl http://10.110.44.3:9000 from the flyte-binary pod using kubectl exec.
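(Editor's sketch, not part of the original thread.) The curl check above was run from the flyte-binary pod. If the timeout is actually raised somewhere else, for example inside a task pod (an assumption, the message does not say where the error appears), the same reachability test can be repeated from that pod. The helper name endpoint_reachable is hypothetical; only the Python standard library is used.

```python
import socket
from urllib.parse import urlsplit


def endpoint_reachable(endpoint: str, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to the endpoint's host and port succeeds."""
    parts = urlsplit(endpoint)
    # Fall back to the scheme's default port when the URL does not name one.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((parts.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unreachable hosts.
        return False
```

Calling endpoint_reachable("http://10.110.44.3:9000") from a Python shell in the pod that reports the error would show whether that pod can open a TCP connection to the endpoint at all. It does not check credentials or bucket access.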
Would folks happen to know what might be happening here or what I could do to move forward in this investigation?
Mike Ossareh
02/20/2023, 7:08 PM
k8s-array as a named plugin, but I cannot find any docs on it. Does anyone have an idea of what this is?
Alex Papanicolaou
02/22/2023, 6:48 PM
Aleksander Lempinen
02/23/2023, 2:12 PM
Yee
Geoff Salmon
02/23/2023, 6:55 PM
Ketan (kumare3)
Ketan (kumare3)
João Lobo Guerra Neto
02/23/2023, 6:59 PM
Reda Oulbacha
02/24/2023, 10:04 AM
Jan Fiedler
03/01/2023, 9:30 PM
{"json":{},"level":"warning","msg":"stow configuration section missing, defaulting to legacy s3/minio connection config","ts":"2023-03-01T21:25:00Z"}
{"json":{},"level":"fatal","msg":"caught panic: entries is empty [goroutine 1 [running]:\nruntime/debug.Stack()\n\t/usr/local/go/src/runtime/debug/stack.go:24
Can anyone point me in the right direction on what might be missing here?
Yee
Alex Papanicolaou
03/05/2023, 5:51 AM
Jan Fiedler
03/07/2023, 12:52 PM
flyte-user-roles for accessing stuff in their account. What currently happens when following the official documentation is that, after exchanging secret, token and adjusting the cluster config on the control plane, the data-plane retrieves all the namespaces, quotas and Service Accounts from the control plane (which are created by the cluster_resource_manager, I guess). This leaves me with default Service Accounts in the data-plane for all the projects/domains where the flyte-user-role of the control plane is annotated. Obviously I want the flyte-user-roles of the data-planes in there, which are completely unused so far in my setup.
One way would be to just replace the default Service Account annotation with the correct flyte-user-role in the projects/domains where I need them. Is there a better or correct way of doing this?
Harry
03/13/2023, 11:01 PM
values.yaml listed below and I can’t seem to figure out how to get the right configuration into the flyte admin config.
# Helm command
helm install flyteorg/flyte-binary \
  --generate-name \
  --kube-context=<context> \
  --namespace flyte \
  --values flyte-binary/flyte-binary-eks-values.yaml
# flyte-binary-eks-values.yaml
configuration:
  database:
    password: <RD Password>
    host: <DB Host URI>
    dbname: app
  storage:
    metadataContainer: <bucket>
    userDataContainer: <bucket>
    provider: s3
    providerConfig:
      s3:
        region: "us-west-2"
        authType: "iam"
  logging:
    level: 1
    plugins:
      cloudwatch:
        enabled: true
        templateUri: |-
          https://console.aws.amazon.com/cloudwatch/home?region=us-west-2#logEventViewer:group=/eks/opta-development/cluster;stream=var.log.containers.{{ .podName }}_{{ .namespace }}_{{ .containerName }}-{{ .containerId }}.log
  inline:
    plugins:
      aws:
        batch:
          roleAnnotationKey: <Redacted>
        region: us-west-2
    tasks:
      task-plugins:
        enabled-plugins:
          - container
          - sidecar
          - aws_array
        default-for-task-types:
          - container_array: aws_array
          - aws-batch: aws_array
          - container: container
serviceAccount:
  create: true
  annotations:
    eks.amazonaws.com/role-arn: <Redacted>
# Where should this go?
configMaps:
  adminServer:
    flyteadmin:
      roleNameKey: <Redacted>
    queues:
      executionQueues:
        - dynamic: <JobQueueName>
          attributes:
            - default
      workflowConfigs:
        - tags:
            - default
And when I try to run a workflow with Batch tasks I get this error:
Workflow[flytesnacks:development:workflows.example.wf] failed. RuntimeExecutionError: max number of system retry attempts [11/10] exhausted. Last known status message: failed at Node[n0]. RuntimeExecutionError: failed during plugin execution, caused by: failed to execute handle for plugin [aws_array]: [BadTaskSpecification] config[dynamic_queue] is missing
Thanks for reading this far 🙂 🙏
Slackbot
03/16/2023, 9:28 AM
Choenden Kyirong
03/16/2023, 5:41 PM
flytectl demo start on a VSI (virtual server instance) and be able to reach the UI publicly using the public IP of the server? For example: <public_ip>:30080/console. I’ve tried this and the demo is running from within the server, but I can't seem to reach it from my own browser. Do I need to set up a reverse proxy or do something else?
I try to navigate to <public_ip>:30080/console and nothing ends up loading despite the demo/sandbox running properly. I also curled localhost:30089/console from inside the server and I get the HTML response back.
Any sort of help or feedback would be greatly appreciated, thanks!! 🙏
Brian Tang
03/17/2023, 9:11 AM
staging, production options (see screenshots). On clicking login, I get Not Found. I’ve been running a bunch of POCs over the past few weeks on a local cluster, and this hasn’t been an issue.
Anindya Saha
03/17/2023, 10:27 PM
"Error syncing pod, skipping" err="failed to \"StartContainer\" for \"f154ee96e6c1a4fed852-n0-0\" with ImagePullBackOff: \"Back-off pulling image \\\"cr.flyte.org/flyteorg/flytekit:py3.10-1.2.11\\\"\"" pod="flytesnacks-development/f154ee96e6c1a4fed852-n0-0" podUID=5952591c-fde8-47bd-a9c4-2885a01915a7
E0317 22:24:01.121900 71 pod_workers.go:951] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"f154ee96e6c1a4fed852-n0-0\" with ImagePullBackOff: \"Back-off pulling image \\\"cr.flyte.org/flyteorg/flytekit:py3.10-1.2.11\\\"\"" pod="flytesnacks-development/f154ee96e6c1a4fed852-n0-0" podUID=5952591c-fde8-47bd-a9c4-2885a01915a7
How do I resolve this?
Posted in #ask-the-community
Anindya Saha
03/18/2023, 7:37 PM
pyflyte run --remote whylogs_example wf
@task(container_image="ghcr.io/flyteorg/flytecookbook:whylogs_examples-latest")
def get_reference_data() -> pd.DataFrame:
    ...
@task(container_image="ghcr.io/flyteorg/flytecookbook:whylogs_examples-latest")
def get_target_data() -> pd.DataFrame:
    ...
@task(container_image="ghcr.io/flyteorg/flytecookbook:whylogs_examples-latest")
def create_profile_view(df: pd.DataFrame) -> DatasetProfileView:
    ...
@task(container_image="ghcr.io/flyteorg/flytecookbook:whylogs_examples-latest")
def constraints_report(profile_view: DatasetProfileView) -> bool:
    ...
However, get_reference_data and get_target_data should not need whylogs. They just work with pandas and scikit-learn. We should be able to run those tasks with the @task(container_image="ghcr.io/flyteorg/flytecookbook:core-latest") image. I did try that, but it fails. The k8s logs say:
File "/opt/venv/lib/python3.8/site-packages/flytekit/core/python_auto_container.py", line 279, in load_task
task_module = importlib.import_module(name=task_module) # type: ignore
...
File "/root/whylogs_example.py", line 17, in <module>
import whylogs as why
ModuleNotFoundError: No module named 'whylogs'
Traceback (most recent call last):
Every task container is trying to parse the entire whylogs_example.py file, and since flytecookbook:core-latest does not have whylogs, it fails. What is the best design pattern or strategy to follow in such cases? How can I make it work remotely?
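(Editor's sketch, not part of the original thread.) The traceback above fails at the module-level import, which runs in every container that loads the file. One general Python pattern is to defer an optional import into the function that needs it, so loading the module never touches the missing package. The example below is plain Python with no flytekit; the function names and the module name hypothetical_profiling_lib are made up for illustration and stand in for the tasks and for whylogs. Whether this is enough for a real Flyte task is an assumption: type annotations such as DatasetProfileView are normally written at module level, and they would likely still need to be importable, so moving tasks into separate modules per image may also be necessary.

```python
def load_values() -> list:
    """Stand-in for get_reference_data: needs no optional dependency."""
    return [1.0, 2.0, 3.0]


def profile_values(values: list) -> object:
    """Stand-in for create_profile_view: needs the optional dependency."""
    # Deferred import: resolved only when this function body runs,
    # not when the module is loaded by other containers.
    import hypothetical_profiling_lib  # hypothetical name, stands in for whylogs

    return hypothetical_profiling_lib.profile(values)
```

With this layout, a container without the optional package can still import the module and run load_values. Only a call to profile_values raises ModuleNotFoundError there.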
I read containerization/multi_images.html. That example has two methods, svm_trainer and svm_predictor, but both end up using the same image. All the examples I see in https://github.com/flyteorg/flytesnacks/tree/master/cookbook/case_studies/ml_training also have only one custom Dockerfile.
Is there a production-grade example workflow with tasks that use significantly different images? I'm looking for a complex reference workflow that covers how to organize the pieces together with a different custom image for each task.
João Campos
03/22/2023, 5:25 PM
Mike Ossareh
03/27/2023, 5:15 PM
flyte-binary helm chart is there a way to specify which projects you'd like automatically defined? I'm looking for something like initialProjects from the flyte-core helm chart
David Espejo (he/him)
03/27/2023, 5:34 PM
Jan Fiedler
04/11/2023, 9:11 AM
c6g.xlarge but I keep running into format exec errors. Anything particular to watch out for? Thanks!
R Bharath Kumar
04/17/2023, 2:04 PM
Amadeusz Lisiecki
04/19/2023, 9:04 AM
ingress:
  create: true
  commonAnnotations:
    kubernetes.io/ingress.class: internal
    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
  httpAnnotations:
    nginx.ingress.kubernetes.io/app-root: /console
  grpcAnnotations:
    nginx.ingress.kubernetes.io/backend-protocol: GRPC
  host: flyte-dev.internal.qa-knowledge-base-coin-dev.z-dn.net
Console works just fine but I get these errors from flytectl:
$ flytectl config init --host flyte-dev.internal.qa-knowledge-base-coin-dev.z-dn.net
...
$ flytectl get project
Error: Connection Info: [Endpoint: dns:///flyte-dev.internal.qa-knowledge-base-coin-dev.z-dn.net, InsecureConnection?: false, AuthMode: Pkce]: rpc error: code = Unknown desc = unexpected HTTP status code received from server: 464 (); malformed header: missing HTTP content-type
OR
$ flytectl config init --host flyte-dev.internal.qa-knowledge-base-coin-dev.z-dn.net --insecure
...
$ flytectl get project
Error: Connection Info: [Endpoint: dns:///flyte-dev.internal.qa-knowledge-base-coin-dev.z-dn.net, InsecureConnection?: true, AuthMode: Pkce]: rpc error: code = Unavailable desc = connection closed before server preface received
Any suggestions?
Shivay Lamba
04/21/2023, 4:03 AM
Frédéric Kaczynski
04/21/2023, 8:03 AM