# ask-the-community
j
Hi, I am trying out Flyte and I cannot get past one error. Maybe I am doing something wrong, so sorry if that is the case. I am following the Getting Started page and wanted to create a local cluster and run workflows on it. When I run the workflow locally, it passes without any issue:
(venv) jpeschel@kinnan:~/Workplace/flyte-demo$ pyflyte run flytedemo.py training_workflow --hyperparameters '{"C": 0.1}'
LogisticRegression(C=0.1, max_iter=3000)
But if I start a new local cluster:
(venv) jpeschel@kinnan:~/Workplace/flyte-demo$ ./bin/flytectl demo start
INFO[0000] [0] Couldn't find a config file []. Relying on env vars and pflags.
🧑‍🏭 Bootstrapping a brand new flyte cluster... 🔨 🔧
delete existing sandbox cluster [y/n]: 
y
🐋 Going to use Flyte v1.7.0 release with image cr.flyte.org/flyteorg/flyte-sandbox-bundled:sha-1ae254f8683699b68ecddc89d775fc5d39cc3d84
🐋 pulling docker image for release cr.flyte.org/flyteorg/flyte-sandbox-bundled:sha-1ae254f8683699b68ecddc89d775fc5d39cc3d84
🧑‍🏭 booting Flyte-sandbox container
Waiting for cluster to come up...
Waiting for cluster to come up...
Waiting for cluster to come up...
Waiting for cluster to come up...
Waiting for cluster to come up...
Waiting for cluster to come up...
context flyte-sandbox already exist. Overwriting it
context modified for "flyte-sandbox" and switched over to it.
+-----------------------------------+---------------+-----------+
|              SERVICE              |    STATUS     | NAMESPACE |
+-----------------------------------+---------------+-----------+
| k8s: This might take a little bit | Bootstrapping |           |
+-----------------------------------+---------------+-----------+
+-----------------------------------+---------------+-----------+
|              SERVICE              |    STATUS     | NAMESPACE |
+-----------------------------------+---------------+-----------+
I don't get the expected output shown on the Getting Started page, and when I try to run the workflow on the cluster I get this error:
(venv) jpeschel@kinnan:~/Workplace/flyte-demo$ pyflyte run --remote flytedemo.py training_workflow --hyperparameters '{"C": 0.1}'
Failed with Exception Code: SYSTEM:Unknown
RPC Failed, with Status: StatusCode.UNAVAILABLE
        details: failed to connect to all addresses; last error: UNAVAILABLE: ipv4:127.0.0.1:30080: Socket closed
        Debug string UNKNOWN:failed to connect to all addresses; last error: UNAVAILABLE: ipv4:127.0.0.1:30080: Socket closed {created_time:"2023-06-19T09:17:08.725217881+02:00", grpc_status:14}
I checked whether the problem is caused by closed ports, but netstat shows that the port is open:
(venv) jpeschel@kinnan:~/Workplace/flyte-demo$ sudo netstat -ntlp
[sudo] password for jpeschel: 
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:6443            0.0.0.0:*               LISTEN      101329/docker-proxy 
tcp        0      0 127.0.0.53:53           0.0.0.0:*               LISTEN      3430/systemd-resolv 
tcp        0      0 127.0.0.1:631           0.0.0.0:*               LISTEN      4241/cupsd          
tcp        0      0 0.0.0.0:30080           0.0.0.0:*               LISTEN      101277/docker-proxy 
tcp        0      0 0.0.0.0:30001           0.0.0.0:*               LISTEN      101302/docker-proxy 
tcp        0      0 0.0.0.0:30000           0.0.0.0:*               LISTEN      101316/docker-proxy 
tcp        0      0 0.0.0.0:30002           0.0.0.0:*               LISTEN      101290/docker-proxy 
tcp6       0      0 ::1:631                 :::*                    LISTEN      4241/cupsd          
tcp6       0      0 127.0.0.1:63342         :::*                    LISTEN      15418/java
A port scan with nmap also reports the port as open:
(venv) jpeschel@kinnan:~/Workplace/flyte-demo$ nmap -p 30080 127.0.0.1
Starting Nmap 7.80 ( https://nmap.org ) at 2023-06-19 10:06 CEST
Nmap scan report for localhost (127.0.0.1)
Host is up (0.000080s latency).

PORT      STATE SERVICE
30080/tcp open  unknown
I am on Ubuntu 22.04.2 LTS with an 11th Gen Intel® Core™ i7-11850H @ 2.50GHz × 16 and 32 GiB of memory, which should be more than sufficient for this demo. Is there a required step that I missed?
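Note: netstat and nmap only show that docker-proxy is listening on the host port; they do not show whether the service behind it accepts traffic. "Socket closed" is consistent with a listener that accepts the connection and then drops it. The following is a minimal sketch, not part of the original thread, that distinguishes the three cases. The function name `probe` and its return labels are hypothetical, and treating "peer sends no data but keeps the connection" as open is an assumption of this sketch.

```python
import socket


def probe(host: str, port: int, timeout: float = 2.0) -> str:
    """Classify a TCP endpoint as 'refused', 'closed', or 'open'.

    'refused': the connection could not be established at all.
    'closed':  the connection was accepted and then dropped by the peer.
    'open':    the peer sent data or kept the connection open until timeout.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "refused"  # nothing listening, or host unreachable
    with sock:
        try:
            data = sock.recv(1)
        except socket.timeout:
            return "open"  # connected; the peer is simply silent
        except OSError:
            return "closed"  # for example, connection reset by peer
        # An empty read means the peer closed the connection.
        return "open" if data else "closed"


# Example (assumes the sandbox port from this thread):
#   print(probe("127.0.0.1", 30080))
```

A result of "closed" on port 30080 would suggest that the port mapping exists but the Flyte services inside the sandbox are not serving yet.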
s
Have you exported the flytectl config?
export FLYTECTL_CONFIG=~/.flyte/config-sandbox.yaml
Please check that config-sandbox.yaml has the required configuration.
j
Yes, I did; sorry I did not mention that. I looked through the YAML and everything seems OK:
admin:
  # For GRPC endpoints you might want to use dns:///flyte.myexample.com
  endpoint: localhost:30080
  authType: Pkce
  insecure: true
console:
  endpoint: http://localhost:30080
logger:
  show-source: true
  level: 0
s
Can you tear down your demo cluster and bring it up again?
flytectl demo teardown
flytectl demo start
j
I called teardown, but got permission denied for k3s.yaml:
(venv) jpeschel@kinnan:~/Workplace/flyte-demo$ ./bin/flytectl demo teardown
Config cleanup failed. Which Failed due to unlinkat /home/jpeschel/.flyte/k3s/k3s.yaml: permission denied
context removed for "flyte-sandbox".
🧹 🧹 Sandbox cluster is removed successfully.
❇️ Run the following command to unset sandbox environment variables for accessing flytectl
        unset FLYTECTL_CONFIG
And start complained about a missing config file:
(venv) jpeschel@kinnan:~/Workplace/flyte-demo$ ./bin/flytectl demo start
INFO[0000] [0] Couldn't find a config file []. Relying on env vars and pflags.
🧑‍🏭 Bootstrapping a brand new flyte cluster... 🔨 🔧
🐋 Going to use Flyte v1.7.0 release with image cr.flyte.org/flyteorg/flyte-sandbox-bundled:sha-1ae254f8683699b68ecddc89d775fc5d39cc3d84
🐋 pulling docker image for release cr.flyte.org/flyteorg/flyte-sandbox-bundled:sha-1ae254f8683699b68ecddc89d775fc5d39cc3d84
🧑‍🏭 booting Flyte-sandbox container
Waiting for cluster to come up...
Waiting for cluster to come up...
Waiting for cluster to come up...
Waiting for cluster to come up...
Waiting for cluster to come up...
context modified for "flyte-sandbox" and switched over to it.
Is that the config-sandbox.yaml file?
s
I think that's okay. Are you still seeing the same error?
j
Yes, no change there.
s
Can you run kubectl get pods -n flyte and check whether all the pods are running?
j
For some reason all pods are pending:
(venv) jpeschel@kinnan:~/Workplace/flyte-demo$ kubectl get pods -n flyte
E0619 12:44:09.550062  280188 memcache.go:287] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0619 12:44:09.552732  280188 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0619 12:44:09.554045  280188 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0619 12:44:09.555097  280188 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
NAME                                                  READY   STATUS    RESTARTS   AGE
flyte-sandbox-postgresql-0                            0/1     Pending   0          29s
flyte-sandbox-proxy-d95874857-xv2xc                   0/1     Pending   0          29s
flyte-sandbox-76d484c4b9-p7s2q                        0/2     Pending   0          29s
flyte-sandbox-docker-registry-78fb6fd969-qfhq6        0/1     Pending   0          29s
flyte-sandbox-kubernetes-dashboard-6757db879c-slqst   0/1     Pending   0          29s
flyte-sandbox-minio-645c8ddf7c-x6zdn                  0/1     Pending   0          29s
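Note: when the listing is long or mixed with client error lines like the ones above, the pods that are not running can be filtered out of the text. This is a minimal sketch, not part of the original thread; the function name `pods_not_running` is hypothetical, and it assumes the default `kubectl get pods` column layout (NAME, READY, STATUS, ...).

```python
from typing import List, Tuple


def pods_not_running(kubectl_output: str) -> List[Tuple[str, str]]:
    """Return (pod name, status) for pods whose STATUS is not Running/Completed.

    Lines before the NAME header (such as kubectl client errors) are ignored.
    """
    rows = []
    seen_header = False
    for line in kubectl_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "NAME":
            seen_header = True
            continue
        # Column order assumed: NAME READY STATUS RESTARTS AGE
        if seen_header and len(fields) >= 3:
            name, status = fields[0], fields[2]
            if status not in ("Running", "Completed"):
                rows.append((name, status))
    return rows
```

Applied to the output above, it would list all six sandbox pods with status Pending.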
s
Can you check the pod logs?
kubectl logs <pod-name> -n flyte
j
I have a bit of an issue with the metrics server and I am not sure whether it is relevant, but I am not getting much from the logs:
(venv) jpeschel@kinnan:~/Workplace/flyte-demo$ kubectl logs flyte-sandbox-postgresql-0 -n flyte
E0619 13:09:30.664663  305986 memcache.go:287] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0619 13:09:30.670736  305986 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0619 13:09:30.672176  305986 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
Defaulted container "postgresql" out of: postgresql, init-chmod-data (init)
s
@jeev, any idea about the cause of this issue?
j
I am still trying to solve this, but I found out that there is some problem with the Kubernetes installation on my computer, as colleagues of mine managed to get it running.
Thanks for the help 🙂
j
If you are running on a Mac, it is probably related to the resources available to Docker. How many CPUs and how much memory have you assigned to your Docker daemon? You can confirm with “kubectl describe pod <podname> -n flyte”, where <podname> is any of the pending pods.
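Note: a Pending pod has no container logs yet, which is why kubectl logs shows nothing useful; the scheduler's reason is recorded in the pod's PodScheduled condition, which kubectl describe also surfaces. The following is a minimal sketch, not part of the original thread, that reads the same information from `kubectl get pods -o json`. The function names `fetch_pods` and `scheduling_problems` are hypothetical, and the sketch assumes kubectl is on the PATH and pointed at the sandbox context.

```python
import json
import subprocess
from typing import Dict


def fetch_pods(namespace: str = "flyte") -> dict:
    """Run kubectl and return the pod list as a parsed JSON document."""
    out = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return json.loads(out)


def scheduling_problems(pod_list: dict) -> Dict[str, str]:
    """Map pod name to 'reason: message' for pods the scheduler could not place."""
    problems = {}
    for pod in pod_list.get("items", []):
        name = pod["metadata"]["name"]
        for cond in pod.get("status", {}).get("conditions", []):
            # A PodScheduled condition with status "False" means the pod
            # has not been assigned to a node.
            if cond.get("type") == "PodScheduled" and cond.get("status") == "False":
                reason = cond.get("reason", "")
                message = cond.get("message", "")
                problems[name] = f"{reason}: {message}"
    return problems


# Example (requires a reachable cluster):
#   for name, why in scheduling_problems(fetch_pods()).items():
#       print(name, "->", why)
```

If the messages mention insufficient CPU or memory, that would support the Docker resource explanation; any other message points elsewhere.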