David Espejo (he/him)
05/30/2023, 2:56 PM
403 errors (described here)
3. They're using the native GKE ingress controller, including the cloud.google.com/app-protocols: '{"grpc":"HTTP2"}' annotation. Not sure if that's enough for Ingress to route gRPC traffic.
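For reference, GKE reads the cloud.google.com/app-protocols annotation from the Service object, and the keys of its JSON value are Service port names. A minimal sketch of building such a manifest in Python follows; the Service name flyteadmin, the port name grpc, the port numbers, and both helper functions are illustrative assumptions, not details taken from the setup described above.

```python
import json

# Protocol values accepted by the annotation, as far as the GKE docs describe.
ALLOWED_PROTOCOLS = {"HTTP", "HTTPS", "HTTP2"}


def app_protocols_annotation(port_protocols):
    """Return the annotation dict mapping Service port names to protocols."""
    for port_name, protocol in port_protocols.items():
        if protocol not in ALLOWED_PROTOCOLS:
            raise ValueError(f"unsupported protocol {protocol!r} for port {port_name!r}")
    # Compact separators reproduce the '{"grpc":"HTTP2"}' form quoted above.
    value = json.dumps(port_protocols, separators=(",", ":"))
    return {"cloud.google.com/app-protocols": value}


def build_service(name, ports, port_protocols):
    """Build a Service manifest, checking that annotated port names exist."""
    declared = {port["name"] for port in ports}
    missing = set(port_protocols) - declared
    if missing:
        raise ValueError(f"annotation refers to unknown port names: {sorted(missing)}")
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": name,
            "annotations": app_protocols_annotation(port_protocols),
        },
        "spec": {"ports": ports},
    }


# Hypothetical values for illustration only.
service = build_service(
    "flyteadmin",
    [{"name": "grpc", "port": 81, "targetPort": 8089}],
    {"grpc": "HTTP2"},
)
print(json.dumps(service, indent=2))
```

The name check matters because an annotation key that does not match any port name in the Service has nothing to apply to. Whether the annotation alone is sufficient for gRPC routing also depends on the backend and health-check configuration, which this sketch does not cover.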
@Fabio Grätz I remember you're using Flyte on GCP, so I was wondering if you had ideas/recommendations from your experience
Thanks!
Fabio Grätz
05/30/2023, 3:00 PM
Fabio Grätz
05/30/2023, 3:00 PM
Fabio Grätz
05/30/2023, 3:01 PM
Fabio Grätz
05/30/2023, 3:02 PM
Fabio Grätz
05/30/2023, 3:03 PM
Fabio Grätz
05/30/2023, 3:05 PM
Fabio Grätz
05/30/2023, 3:06 PM
Fabio Grätz
05/30/2023, 3:06 PM
Ariel Kaspit
05/30/2023, 3:07 PM
Ariel Kaspit
05/30/2023, 3:08 PM
Fabio Grätz
05/30/2023, 3:09 PM
Fabio Grätz
05/30/2023, 3:11 PM
Fabio Grätz
05/30/2023, 3:12 PM
Fabio Grätz
05/30/2023, 3:51 PM
Ariel Kaspit
05/31/2023, 4:41 PM
separateGrpcIngress? I understand from the documentation that it is required for certain ingress controllers like nginx.
2. After deploying nginx (which uses our own TLS certificate, not a self-signed one from cert-manager), I'm getting 502 Bad Gateway errors when accessing the console (flyte.my.domain)… In the flyteadmin logs I see this authentication error: Failed to refresh tokens. Restarting login flow. Error: [TOKEN_REFRESH_FAILURE] Error refreshing token, caused by: oauth2: cannot fetch token: 400 Bad Request
(It's worth mentioning that I don't see any errors in Okta, where we configured the authorization server.)
3. Regarding access from the CLI: when trying to use flytectl, I'm still getting the same authentication error: PermissionDenied desc = unexpected HTTP status code received from server: 403 (Forbidden); malformed header: missing HTTP content-type
- Have you experienced this? BTW, I couldn't tell from your answer whether you're using flytectl/pyflyte?
Thanks in advance! And sorry for all the questions, I appreciate your help!
Ariel Kaspit
06/14/2023, 3:13 PM
Kevin Parasseril
07/14/2023, 8:25 PM
Fredrik Lyford
07/20/2023, 10:00 AM
svartalf
07/21/2023, 11:05 AM
flytectl problem which I can't solve for now; maybe someone will be able to help me with it?
$ HTTPS_PROXY=localhost:8080 flytectl --admin.insecureSkipVerify -d development -p flyteexamples get tasks
Error: Connection Info: [Endpoint: dns:///flyte.dexterenergyservices.com, InsecureConnection?: false, AuthMode: Pkce]: rpc error: code = Unavailable desc = unexpected HTTP status code received from server: 502 (Bad Gateway); transport: received unexpected content-type "text/html; charset=UTF-8"
I'm using mitmproxy to intercept the flytectl traffic (hence the HTTPS_PROXY env var and the --admin.insecureSkipVerify flag), and I can see that flytectl issues a request to the /flyteidl.service.AdminService/ListTasks endpoint.
It results in an HTTP 502 response from the GCP load balancer claiming that "server encountered a temporary error".
This is where I am stuck for now: I do have the flyteadmin Service running, and the corresponding flyteadmin Pods are green and pass the health checks, but I can't figure out how to confirm that flyteadmin is actually running and responding.
The best thing I've found so far is that if I request the /me endpoint (which is sent to the HTTP interface of the flyteadmin:80 service), it responds with the following:
{
  "error": "unknown service flyteidl.service.IdentityService",
  "code": 12,
  "message": "unknown service flyteidl.service.IdentityService"
}
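The code field in a body like this is a standard gRPC status code, so one way to read the response is to map the number to its name. A minimal sketch, using only the standard library; the helper name describe_gateway_error is hypothetical, and the body is copied from the response above.

```python
import json

# Canonical gRPC status codes and their names.
GRPC_STATUS_NAMES = {
    0: "OK",
    1: "CANCELLED",
    2: "UNKNOWN",
    3: "INVALID_ARGUMENT",
    4: "DEADLINE_EXCEEDED",
    5: "NOT_FOUND",
    6: "ALREADY_EXISTS",
    7: "PERMISSION_DENIED",
    8: "RESOURCE_EXHAUSTED",
    9: "FAILED_PRECONDITION",
    10: "ABORTED",
    11: "OUT_OF_RANGE",
    12: "UNIMPLEMENTED",
    13: "INTERNAL",
    14: "UNAVAILABLE",
    15: "DATA_LOSS",
    16: "UNAUTHENTICATED",
}


def describe_gateway_error(body):
    """Summarize a JSON error body by naming its gRPC status code."""
    payload = json.loads(body)
    code = payload.get("code")
    name = GRPC_STATUS_NAMES.get(code, "UNRECOGNIZED")
    return f"{name} ({code}): {payload.get('message', '')}"


response_body = """
{
  "error": "unknown service flyteidl.service.IdentityService",
  "code": 12,
  "message": "unknown service flyteidl.service.IdentityService"
}
"""
summary = describe_gateway_error(response_body)
print(summary)
```

Code 12 is UNIMPLEMENTED, which suggests the HTTP gateway did reach a gRPC server, but that server has no IdentityService registered. That is a reading of the status code only, not a confirmed diagnosis of this deployment.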
And that looks suspicious :) Maybe it's a well-known problem? I would appreciate any help with it.
Kevin Parasseril
07/23/2023, 9:46 AM
Chris Green
07/31/2023, 5:41 PM
Ariel Kaspit
08/07/2023, 2:24 PM
Haytham Amin
08/18/2023, 12:25 AM
[1/1] currentAttempt done. Last Error: USER::load_distribution(additional_distribution, dest_dir)
/usr/local/lib/python3.10/site-packages/flytekit/core/utils.py:295 in wrapper
❱ 295 return func(*args, **kwargs)
/usr/local/lib/python3.10/site-packages/flytekit/tools/fast_registration.py:113 in download_distribution
❱ 113 FlyteContextManager.current_context().file_access.get_data(additio
/usr/local/lib/python3.10/site-packages/flytekit/core/data_persistence.py:301 in get_data
❱ 301 raise FlyteAssertion(
FlyteAssertion: Failed to get data from gs://flyte-blob-storage/flytesnacks/development/MHEQCB3WRLH247GOTRY3CZQIWE======/fastea66e2416c583fdc6995e9e030c414bf.tar.gz to /root/ (recursive=False).
Original exception: 403, message='Forbidden', url=URL('https://storage.googleapis.com/download/storage/v1/b/flyte-blob-storage/o/flytesnacks%2Fdevelopment%2FMHEQCB3WRLH247GOTRY3CZQIWE======%2Ffastea66e2416c583fdc6995e9e030c414bf.tar.gz?alt=media')
The https link seems to include extra entries, which results in the 403 error.
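The extra pieces in the https link look like the Cloud Storage JSON API download form of the same object (/download/storage/v1/b/<bucket>/o/<object>?alt=media, with the object name percent-encoded). A minimal sketch that decodes the URL from the traceback back into a gs:// URI for comparison; the helper name gcs_media_url_to_gs_uri is hypothetical.

```python
from urllib.parse import unquote, urlsplit

# Path prefix of Cloud Storage JSON API media download URLs.
DOWNLOAD_PREFIX = "/download/storage/v1/b/"


def gcs_media_url_to_gs_uri(url):
    """Convert a JSON API download URL into the equivalent gs:// URI."""
    path = urlsplit(url).path  # still percent-encoded; the query is dropped
    if not path.startswith(DOWNLOAD_PREFIX):
        raise ValueError("not a Cloud Storage JSON API download URL")
    bucket, separator, encoded_object = path[len(DOWNLOAD_PREFIX):].partition("/o/")
    if not separator:
        raise ValueError("URL has no /o/<object> component")
    # The object name is one percent-encoded path segment (%2F is "/").
    return f"gs://{bucket}/{unquote(encoded_object)}"


# URL copied from the traceback above.
failing_url = (
    "https://storage.googleapis.com/download/storage/v1/b/flyte-blob-storage"
    "/o/flytesnacks%2Fdevelopment%2FMHEQCB3WRLH247GOTRY3CZQIWE======%2Ffastea66e2416c"
    "583fdc6995e9e030c414bf.tar.gz?alt=media"
)
decoded = gcs_media_url_to_gs_uri(failing_url)
print(decoded)
```

If the decoded URI matches the gs:// path in the error message, the URL itself is probably well formed, and the 403 more likely means the identity running the task lacks read access to the bucket. That is an assumption to verify against the bucket's IAM settings.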
Mark Waylonis
09/08/2023, 4:35 PM
Ashika UMAGILIYA
09/11/2023, 8:04 AM
configuration:
  database:
    username: flyte_user
    password: {password}
    host: 192.{db ip}
    port: 5432
    dbname: flyte
  storage:
    metadataContainer: flyte-poc-data
    userDataContainer: flyte-poc-data
    provider: gcs
    providerConfig:
      gcs:
        project: "{our GCP project id}"
serviceAccount:
  create: false
  name: dev01-flyte-poc-sa
helm install executed without any errors, but I only see Services and Deployments in k8s. There are NO pods created. Any idea?
helm install flyte-backend flyteorg/flyte-binary --namespace flyte --values gcp-values.yml
W0911 15:41:23.101181 33738 warnings.go:70] autopilot-default-resources-mutator:Autopilot updated Deployment flyte/flyte-backend-flyte-binary: defaulted unspecified resources for containers [wait-for-db, flyte] (see http://g.co/gke/autopilot-defaults)
NAME: flyte-backend
LAST DEPLOYED: Mon Sep 11 15:41:20 2023
NAMESPACE: flyte
STATUS: deployed
REVISION: 1
TEST SUITE: None
Ashika UMAGILIYA
09/13/2023, 1:02 AM
from flytekit import task, workflow, Resources


# Request one GPU and run the task in the kfpytorch cookbook image.
@task(
    requests=Resources(cpu="1", gpu="1", mem="1Gi"),
    container_image="ghcr.io/flyteorg/flytecookbook:kfpytorch-latest",
)
def say_hello() -> str:
    return "Hello, World!"


@workflow
def hello_world_wf_gpu() -> str:
    res = say_hello()
    return res


if __name__ == "__main__":
    print(f"Running hello_world_wf_gpu() {hello_world_wf_gpu()}")
When I run this workflow, I get the following error. Any idea how to fix this?