Is there a way to invalidate auth token? We have 2 separate environment and use google auth. Current...

Pradithya Aria Pura

almost 4 years ago
Is there a way to invalidate the auth token? We have 2 separate environments and use Google auth. Currently, if users switch environments, they run into authentication issues:
{"json":{"src":"viper.go:398"},"level":"debug","msg":"Config section [storage] updated. No update handler registered.","ts":"2022-01-05T11:44:53+08:00"}
{"json":{"src":"viper.go:398"},"level":"debug","msg":"Config section [root] updated. No update handler registered.","ts":"2022-01-05T11:44:53+08:00"}
{"json":{"src":"viper.go:400"},"level":"debug","msg":"Config section [admin] updated. Firing updated event.","ts":"2022-01-05T11:44:53+08:00"}
{"json":{"src":"auth_flow_orchestrator.go:37"},"level":"debug","msg":"got a response from the refresh grant for old expiry 2022-01-05 11:54:53.262787 +0800 +08 with new expiry 2022-01-05 11:54:53.262787 +0800 +08","ts":"2022-01-05T11:44:54+08:00"}
{"json":{"src":"client.go:54"},"level":"info","msg":"Initialized Admin client","ts":"2022-01-05T11:44:54+08:00"}
Launch plan plan_scorer_data_pipeline.workflows.launchplan.plan_scorer_pipeline_workflow_schedule failed to get updated due to rpc error: code = Unauthenticated desc = token parse error [JWT_VERIFICATION_FAILED] Could not retrieve id token from metadata, caused by: rpc error: code = Unauthenticated desc = Request unauthenticated with IDToken
Error: rpc error: code = Unauthenticated desc = token parse error [JWT_VERIFICATION_FAILED] Could not retrieve id token from metadata, caused by: rpc error: code = Unauthenticated desc = Request unauthenticated with IDToken
{"json":{"src":"main.go:13"},"level":"error","msg":"rpc error: code = Unauthenticated desc = token parse error [JWT_VERIFICATION_FAILED] Could not retrieve id token from metadata, caused by: rpc error: code = Unauthenticated desc = Request unauthenticated with IDToken","ts":"2022-01-05T11:44:54+08:00"}
Any workaround for this?
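One possible workaround, sketched here as an assumption rather than anything from the Flyte docs: keep a separate client config per environment and pass it explicitly, so credentials obtained for one deployment are never replayed against the other. The file names and endpoints below are hypothetical.
# Hypothetical per-environment configs; only admin.endpoint differs between them.
$ cat ~/.flyte/config-staging.yaml
admin:
  endpoint: dns:///flyte.staging.example.com
  authType: Pkce
  insecure: false

$ cat ~/.flyte/config-production.yaml
admin:
  endpoint: dns:///flyte.production.example.com
  authType: Pkce
  insecure: false

# Point each invocation at the environment it should authenticate against.
$ flytectl --config ~/.flyte/config-staging.yaml get project
$ flytectl --config ~/.flyte/config-production.yaml get project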
Hello, I was exploring on Kubernetes Spark job and i tried to implement it by following this <Docum...

Chandramoulee K V

almost 3 years ago
Hello, I was exploring the Kubernetes Spark job and tried to implement it by following this Documentation. This is done in an EKS setup. I have created a custom Docker image for Spark as specified in the documentation (the only thing I did was comment the following out in the Dockerfile:
# Copy the makefile targets to expose on the container. This makes it easier to register.
# Delete this after we update CI to not serialize inside the container
# COPY k8s_spark/sandbox.config /root
# Copy the actual code
# COPY k8s_spark/ /root/k8s_spark
# This tag is supplied by the build script and will be used to determine the version
# when registering tasks, workflows, and launch plans
# ARG tag
# ENV FLYTE_INTERNAL_IMAGE $tag
# Copy over the helper script that the SDK relies on
# RUN cp ${VENV}/bin/flytekit_venv /usr/local/bin/
# RUN chmod a+x /usr/local/bin/flytekit_venv
). I registered the sample PySpark workflow with the image and I am facing this issue:
failed
SYSTEM ERROR! Contact platform administrators.
When looking at the logs in AWS, I found a warning that it was unable to load the native-hadoop library. Could this be the cause of the issue? Any idea?
{"log":"22/11/24 07:03:54 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
","stream":"stderr","docker":{"container_id":"XXX"},"kubernetes":{"container_name":"YYY","namespace_name":"flytesnacks-development","pod_name":"ZZZ","pod_id":"AAA","namespace_id":"BBB","namespace_labels":{"kubernetes_io/metadata_name":"flytesnacks-development"}}}
Hi Community, is there any simple approach to verify the GRPC service of flyte admin works as expect...

Xuan Hu

almost 3 years ago
Hi Community, is there any simple approach to verify that the gRPC service of flyteadmin works as expected? I tried to deploy the `flyte-core` Helm chart on a self-hosted Kubernetes cluster, but I encounter a certificate problem when trying to register a workflow remotely. The service is deployed with the “Kubernetes Ingress Controller Fake Certificate”, and all the SSL/TLS-related settings should be at the template's default values. I roughly looked through them but did not find any obvious problem. BTW, the Flyte console seems to work fine. When I try to `flytectl register` with the client config `admin.insecure: false` (the default value from `flytectl config init`), it complains:
$ flytectl register files --project flytesnacks --domain development --archive flyte-package.tgz --version latest
 ------------------------------------------------------------------ -------- ----------------------------------------------------
| NAME                                                             | STATUS | ADDITIONAL INFO                                    |
 ------------------------------------------------------------------ -------- ----------------------------------------------------
| /tmp/register2617257857/0_flyte.workflows.example.say_hello_1.pb | Failed | Error registering file due to rpc error: code =    |
|                                                                  |        | Unavailable desc = connection error: desc =        |
|                                                                  |        | "transport: authentication handshake failed: x509: |
|                                                                  |        | "Kubernetes Ingress Controller Fake Certificate"   |
|                                                                  |        | certificate is not trusted"                        |
 ------------------------------------------------------------------ -------- ----------------------------------------------------
1 rows
Error: Connection Info: [Endpoint: dns:///flyte.XXX.com, InsecureConnection?: false, AuthMode: Pkce]: rpc error: code = Unavailable desc = connection error: desc = "transport: authentication handshake failed: x509: "Kubernetes Ingress Controller Fake Certificate" certificate is not trusted"
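As a side check, a quick way to see which certificate the ingress is actually presenting for that endpoint (plain openssl, nothing Flyte-specific; the hostname is the redacted one from above):
# Print the subject and issuer of the certificate served on port 443.
$ openssl s_client -connect flyte.XXX.com:443 -servername flyte.XXX.com </dev/null 2>/dev/null \
    | openssl x509 -noout -subject -issuer
# If this still shows "Kubernetes Ingress Controller Fake Certificate", the ingress has no valid
# TLS secret for that host, and the flytectl handshake will keep failing until a real certificate
# is configured (or its CA is added to the client's trust store).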
After changing the `insecure` config to `true`, the error message becomes:
$ flytectl register files --project flytesnacks --domain development --archive flyte-package.tgz --version latest
 ------------------------------------------------------------------ -------- ----------------------------------------------------
| NAME                                                             | STATUS | ADDITIONAL INFO                                    |
 ------------------------------------------------------------------ -------- ----------------------------------------------------
| /tmp/register3222452968/0_flyte.workflows.example.say_hello_1.pb | Failed | Error registering file due to rpc error: code =    |
|                                                                  |        | Unavailable desc = connection closed before server |
|                                                                  |        | preface received                                   |
 ------------------------------------------------------------------ -------- ----------------------------------------------------
1 rows
Error: Connection Info: [Endpoint: dns:///flyte.XXX.com, InsecureConnection?: true, AuthMode: Pkce]: rpc error: code = Unavailable desc = connection closed before server preface received
Actually, I am not sure whether the problem is caused by an inappropriate client config or by the server settings, so I suppose the first step is to check the gRPC service of flyteadmin directly. Just let me know if you have any comments. Thanks in advance.
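One low-level way to probe the flyteadmin gRPC service directly, bypassing the ingress and its TLS entirely, is to port-forward to the flyteadmin service and use grpcurl. This is only a sketch: the service port assumed below (81 for gRPC) should be verified against `kubectl get svc flyteadmin -n flyte`, and `grpcurl list` relies on server reflection, which flyteadmin enables as far as I know.
# Forward a local port to flyteadmin's gRPC port inside the cluster (81 is an assumption).
$ kubectl -n flyte port-forward svc/flyteadmin 8089:81

# In another terminal, list the gRPC services flyteadmin exposes (plaintext, since TLS is
# terminated at the ingress rather than at the pod).
$ grpcurl -plaintext localhost:8089 list

# If the AdminService is listed, the gRPC server itself is fine and the problem sits in the
# ingress/TLS layer, not in flyteadmin.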
`flyte-deps-contour-envoy` pod is stuck in a pending state when I try to deploy the sandbox env to a...

Matt Dupree

almost 3 years ago
The `flyte-deps-contour-envoy` pod is stuck in a Pending state when I try to deploy the sandbox env to a cloud Kubernetes cluster w/ 4 nodes. I'm just following the docs. I see that this has come up before here and here, but neither of the suggested solutions makes sense for me (viz., I don't want to deploy on kind, and I'm not running an nginx pod that would conflict with Contour/Envoy). Could I get some help? Here's the output of `k get pods -n flyte`:
○ → kubectl get pods -n flyte
NAME                                              READY   STATUS    RESTARTS   AGE
flyte-deps-contour-envoy-xp6x2                    0/2     Pending   0          51m
flyte-deps-contour-envoy-v2tnd                    0/2     Pending   0          51m
flyte-deps-contour-envoy-qjfp5                    0/2     Pending   0          51m
flyte-deps-contour-envoy-bz2xj                    0/2     Pending   0          51m
flyte-deps-kubernetes-dashboard-8b7d858b7-2gnk2   1/1     Running   0          51m
minio-7c99cbb7bd-bczp4                            1/1     Running   0          51m
postgres-7b7dd4b66-n2w8g                          1/1     Running   0          51m
flyte-deps-contour-contour-cd4d956d9-tz82c        1/1     Running   0          51m
syncresources-6fb7586cb-szrjx                     1/1     Running   0          49m
flytepropeller-585fb99968-7bc9c                   1/1     Running   0          49m
datacatalog-7875898bf8-zdd6n                      1/1     Running   0          49m
flyteconsole-5667f8f975-q5j7b                     1/1     Running   0          49m
flyte-pod-webhook-8669764d6-8xsjx                 1/1     Running   0          49m
flyteadmin-649d4df4b-sk9px                        1/1     Running   0          49m
flytescheduler-9bdf8bf84-frn9r                    1/1     Running   0          49m
And here’s the logs for one of the pending pods:
○ → kubectl describe pods flyte-deps-contour-envoy-xp6x2  -n flyte
Name:           flyte-deps-contour-envoy-xp6x2
Namespace:      flyte
Priority:       0
Node:           <none>
Labels:         app.kubernetes.io/component=envoy
                app.kubernetes.io/instance=flyte-deps
                app.kubernetes.io/managed-by=Helm
                app.kubernetes.io/name=contour
                controller-revision-hash=67bdb7bd55
                helm.sh/chart=contour-7.10.1
                pod-template-generation=1
Annotations:    <none>
Status:         Pending
IP:
IPs:            <none>
Controlled By:  DaemonSet/flyte-deps-contour-envoy
Init Containers:
  envoy-initconfig:
    Image:      docker.io/bitnami/contour:1.20.1-debian-10-r53
    Port:       <none>
    Host Port:  <none>
    Command:
      contour
    Args:
      bootstrap
      /config/envoy.json
      --xds-address=flyte-deps-contour
      --xds-port=8001
      --resources-dir=/config/resources
      --envoy-cafile=/certs/ca.crt
      --envoy-cert-file=/certs/tls.crt
      --envoy-key-file=/certs/tls.key
    Limits:
      cpu:     100m
      memory:  100Mi
    Requests:
      cpu:     10m
      memory:  50Mi
    Environment:
      CONTOUR_NAMESPACE:  flyte (v1:metadata.namespace)
    Mounts:
      /admin from envoy-admin (rw)
      /certs from envoycert (ro)
      /config from envoy-config (rw)
Containers:
  shutdown-manager:
    Image:      docker.io/bitnami/contour:1.20.1-debian-10-r53
    Port:       <none>
    Host Port:  <none>
    Command:
      contour
    Args:
      envoy
      shutdown-manager
    Liveness:     http-get http://:8090/healthz delay=120s timeout=5s period=20s #success=1 #failure=6
    Environment:  <none>
    Mounts:
      /admin from envoy-admin (rw)
  envoy:
    Image:       docker.io/bitnami/envoy:1.21.1-debian-10-r55
    Ports:       8080/TCP, 8443/TCP, 8002/TCP
    Host Ports:  80/TCP, 443/TCP, 0/TCP
    Command:
      envoy
    Args:
      -c
      /config/envoy.json
      --service-cluster $(CONTOUR_NAMESPACE)
      --service-node $(ENVOY_POD_NAME)
      --log-level info
    Limits:
      cpu:     100m
      memory:  100Mi
    Requests:
      cpu:      10m
      memory:   50Mi
    Liveness:   http-get http://:8002/ready delay=120s timeout=5s period=20s #success=1 #failure=6
    Readiness:  http-get http://:8002/ready delay=10s timeout=1s period=3s #success=1 #failure=3
    Environment:
      CONTOUR_NAMESPACE:  flyte (v1:metadata.namespace)
      ENVOY_POD_NAME:     flyte-deps-contour-envoy-xp6x2 (v1:metadata.name)
    Mounts:
      /admin from envoy-admin (rw)
      /certs from envoycert (rw)
      /config from envoy-config (rw)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  envoy-admin:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  envoy-config:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  envoycert:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  envoycert
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                 node.kubernetes.io/not-ready:NoExecute op=Exists
                 node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                 node.kubernetes.io/unreachable:NoExecute op=Exists
                 node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  56m   default-scheduler  0/4 nodes are available: 1 node(s) didn't have free ports for the requested pod ports, 3 node(s) didn't match Pod's node affinity/selector.
  Warning  FailedScheduling  54m   default-scheduler  0/4 nodes are available: 1 node(s) didn't have free ports for the requested pod ports, 3 node(s) didn't match Pod's node affinity/selector.
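Reading the FailedScheduling events together with the Host Ports field above: each DaemonSet pod is pinned to one node via node affinity, so the three "didn't match Pod's node affinity/selector" nodes are expected; the blocker is that the one targeted node has no free host ports 80/443. A rough way to confirm with plain kubectl (the node name below is a placeholder):
# Confirm the DaemonSet really requests hostPorts 80 and 443.
$ kubectl -n flyte get daemonset flyte-deps-contour-envoy -o yaml | grep -B2 -A2 hostPort

# See what is already running on the targeted node and may be holding those ports.
$ kubectl get pods -A -o wide --field-selector spec.nodeName=<node-name>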