# flyte-deployment
b
hi guys, i’m having issues w flyte sandbox deployment. i’ve tried deploying the demo sandbox first on my mac, and when that failed, on a dev ec2 instance. both are exhibiting the same issues: on the flyte console page, i no longer get the standard page with the flytesnacks `development`, `staging`, `production` options (see screenshots). on clicking `login`, i get `Not Found`. i’ve been running a bunch of POCs over the past few weeks on a local cluster, and this hasn’t been an issue.
tail’ed logs in the parent container
E0317 05:57:50.613939      59 configmap.go:193] Couldn't get configMap kube-system/coredns: failed to sync configmap cache: timed out waiting for the condition
E0317 05:57:50.614012      59 nestedpendingoperations.go:335] Operation for "{volumeName:kubernetes.io/configmap/a03a0b9b-1dfe-48da-9e41-e34652633335-config-volume podName:a03a0b9b-1dfe-48da-9e41-e34652633335 nodeName:}" failed. No retries permitted until 2023-03-17 05:57:51.113991707 +0000 UTC m=+33.777193887 (durationBeforeRetry 500ms). Error: MountVolume.SetUp failed for volume "config-volume" (UniqueName: "kubernetes.io/configmap/a03a0b9b-1dfe-48da-9e41-e34652633335-config-volume") pod "coredns-b96499967-n7rw8" (UID: "a03a0b9b-1dfe-48da-9e41-e34652633335") : failed to sync configmap cache: timed out waiting for the condition
I0317 05:57:50.666398      59 request.go:601] Waited for 1.184759686s due to client-side throttling, not priority and fairness, request: PATCH:https://127.0.0.1:6443/api/v1/namespaces/kube-system/pods/local-path-provisioner-7b7dc8d6f5-4jzfv/status
I0317 05:57:50.969983      59 node_lifecycle_controller.go:1192] Controller detected that some Nodes are Ready. Exiting master disruption mode.
I0317 05:57:52.541998      59 reconciler.go:342] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kubernetes-dashboard-certs\" (UniqueName: \"kubernetes.io/secret/c20e7c75-ff30-48c7-9dc8-a778c7adb611-kubernetes-dashboard-certs\") pod \"flyte-sandbox-kubernetes-dashboard-6757db879c-cgj7d\" (UID: \"c20e7c75-ff30-48c7-9dc8-a778c7adb611\") " pod="flyte/flyte-sandbox-kubernetes-dashboard-6757db879c-cgj7d"
I0317 05:57:52.542062      59 reconciler.go:342] "operationExecutor.VerifyControllerAttachedVolume started for volume \"flyte-sandbox-minio-storage\" (UniqueName: \"kubernetes.io/host-path/e5685a57-1387-4e7e-b8fc-4a0aa4249465-flyte-sandbox-minio-storage\") pod \"flyte-sandbox-minio-645c8ddf7c-zzsnj\" (UID: \"e5685a57-1387-4e7e-b8fc-4a0aa4249465\") " pod="flyte/flyte-sandbox-minio-645c8ddf7c-zzsnj"
I0317 05:57:52.542147      59 reconciler.go:342] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kube-api-access-9fjw5\" (UniqueName: \"kubernetes.io/projected/72166ae0-0f86-4e1b-8cae-ad1079386c98-kube-api-access-9fjw5\") pod \"metrics-server-668d979685-9dvnv\" (UID: \"72166ae0-0f86-4e1b-8cae-ad1079386c98\") " pod="kube-system/metrics-server-668d979685-9dvnv"
I0317 05:57:52.542400      59 reconciler.go:342] "operationExecutor.VerifyControllerAttachedVolume started for volume \"tmp-volume\" (UniqueName: \"kubernetes.io/empty-dir/c20e7c75-ff30-48c7-9dc8-a778c7adb611-tmp-volume\") pod \"flyte-sandbox-kubernetes-dashboard-6757db879c-cgj7d\" (UID: \"c20e7c75-ff30-48c7-9dc8-a778c7adb611\") " pod="flyte/flyte-sandbox-kubernetes-dashboard-6757db879c-cgj7d"
I0317 05:57:52.542462      59 reconciler.go:342] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kube-api-access-n9xlg\" (UniqueName: \"kubernetes.io/projected/e5685a57-1387-4e7e-b8fc-4a0aa4249465-kube-api-access-n9xlg\") pod \"flyte-sandbox-minio-645c8ddf7c-zzsnj\" (UID: \"e5685a57-1387-4e7e-b8fc-4a0aa4249465\") " pod="flyte/flyte-sandbox-minio-645c8ddf7c-zzsnj"
I0317 05:57:52.542596      59 reconciler.go:342] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kube-api-access-n22j8\" (UniqueName: \"kubernetes.io/projected/c20e7c75-ff30-48c7-9dc8-a778c7adb611-kube-api-access-n22j8\") pod \"flyte-sandbox-kubernetes-dashboard-6757db879c-cgj7d\" (UID: \"c20e7c75-ff30-48c7-9dc8-a778c7adb611\") " pod="flyte/flyte-sandbox-kubernetes-dashboard-6757db879c-cgj7d"
I0317 05:57:52.542704      59 reconciler.go:342] "operationExecutor.VerifyControllerAttachedVolume started for volume \"tmp-dir\" (UniqueName: \"kubernetes.io/empty-dir/72166ae0-0f86-4e1b-8cae-ad1079386c98-tmp-dir\") pod \"metrics-server-668d979685-9dvnv\" (UID: \"72166ae0-0f86-4e1b-8cae-ad1079386c98\") " pod="kube-system/metrics-server-668d979685-9dvnv"
E0317 05:57:53.170755      59 event_broadcaster.go:253] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"flyte-sandbox-postgresql-0.174d1f5176bad15e", GenerateName:"", Namespace:"flyte", SelfLink:"", UID:"f77f1e93-6f61-4ae2-a3df-c5c8407fbc73", ResourceVersion:"564", Generation:0, CreationTimestamp:time.Date(2023, time.March, 17, 5, 57, 50, 0, time.Local), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ZZZ_DeprecatedClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry{v1.ManagedFieldsEntry{Manager:"k3s", Operation:"Update", APIVersion:"events.k8s.io/v1", Time:time.Date(2023, time.March, 17, 5, 57, 50, 0, time.Local), FieldsType:"FieldsV1", FieldsV1:(*v1.FieldsV1)(0xc010b6e0c0), Subresource:""}}}, EventTime:time.Date(2023, time.March, 17, 5, 57, 50, 823827754, time.Local), Series:(*v1.EventSeries)(0xc006365b20), ReportingController:"default-scheduler", ReportingInstance:"default-scheduler-63438637467c", Action:"Scheduling", Reason:"FailedScheduling", Regarding:v1.ObjectReference{Kind:"Pod", Namespace:"flyte", Name:"flyte-sandbox-postgresql-0", UID:"73ecc8eb-18ef-4ee3-985a-86f935dbe63a", APIVersion:"v1", ResourceVersion:"531", FieldPath:""}, Related:(*v1.ObjectReference)(nil), Note:"0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.
preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.", Type:"Warning", DeprecatedSource:v1.EventSource{Component:"", Host:""}, DeprecatedFirstTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeprecatedLastTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeprecatedCount:0}': 'Event "flyte-sandbox-postgresql-0.174d1f5176bad15e" is invalid: [series.count: Invalid value: "": should be at least 2, eventTime: Invalid value: 2023-03-17 05:57:50.823827 +0000 UTC: field is immutable]' (will not retry!)
E0317 05:58:06.079882      59 resource_quota_controller.go:413] unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request
W0317 05:58:06.527593      59 garbagecollector.go:747] failed to discover some groups: map[metrics.k8s.io/v1beta1:the server is currently unable to handle the request]
I0317 05:58:14.187196      59 topology_manager.go:200] "Topology Admit Handler"
I0317 05:58:14.381395      59 reconciler.go:342] "operationExecutor.VerifyControllerAttachedVolume started for volume \"flyte-sandbox-db-storage\" (UniqueName: \"kubernetes.io/host-path/73ecc8eb-18ef-4ee3-985a-86f935dbe63a-flyte-sandbox-db-storage\") pod \"flyte-sandbox-postgresql-0\" (UID: \"73ecc8eb-18ef-4ee3-985a-86f935dbe63a\") " pod="flyte/flyte-sandbox-postgresql-0"
I0317 05:58:14.381427      59 reconciler.go:342] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kube-api-access-rghms\" (UniqueName: \"kubernetes.io/projected/73ecc8eb-18ef-4ee3-985a-86f935dbe63a-kube-api-access-rghms\") pod \"flyte-sandbox-postgresql-0\" (UID: \"73ecc8eb-18ef-4ee3-985a-86f935dbe63a\") " pod="flyte/flyte-sandbox-postgresql-0"
I0317 05:58:29.845273      59 scope.go:110] "RemoveContainer" containerID="7faac34fa644246cdbba57411191fae1e37b9365a6212e65ec49e59ff6d4a36d"
I0317 05:58:34.334500      59 event.go:294] "Event occurred" object="flyte-sandbox-webhook" fieldPath="" kind="MutatingWebhookConfiguration" apiVersion="admissionregistration.k8s.io/v1" type="Warning" reason="OwnerRefInvalidNamespace" message="ownerRef [apps/v1/ReplicaSet, namespace: , name: flyte-sandbox-75c5d88454, uid: 185cb389-333f-4252-a82d-2a0300ee0c6a] does not exist in namespace \"\""
I0317 05:58:36.093738      59 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for flyteworkflows.flyte.lyft.com
I0317 05:58:36.093793      59 shared_informer.go:255] Waiting for caches to sync for resource quota
I0317 05:58:36.194168      59 shared_informer.go:262] Caches are synced for resource quota
I0317 05:58:36.547321      59 shared_informer.go:255] Waiting for caches to sync for garbage collector
I0317 05:58:36.547369      59 shared_informer.go:262] Caches are synced for garbage collector
I0317 05:58:41.788357      59 topology_manager.go:200] "Topology Admit Handler"
I0317 05:58:41.949892      59 reconciler.go:342] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kube-api-access-4mddt\" (UniqueName: \"kubernetes.io/projected/b8c597dc-2a26-427e-954c-59b097e2e433-kube-api-access-4mddt\") pod \"py39-cacher\" (UID: \"b8c597dc-2a26-427e-954c-59b097e2e433\") " pod="default/py39-cacher"
I0317 05:59:18.165852      59 reconciler.go:201] "operationExecutor.UnmountVolume started for volume \"kube-api-access-4mddt\" (UniqueName: \"kubernetes.io/projected/b8c597dc-2a26-427e-954c-59b097e2e433-kube-api-access-4mddt\") pod \"b8c597dc-2a26-427e-954c-59b097e2e433\" (UID: \"b8c597dc-2a26-427e-954c-59b097e2e433\") "
I0317 05:59:18.166966      59 operation_generator.go:863] UnmountVolume.TearDown succeeded for volume "kubernetes.io/projected/b8c597dc-2a26-427e-954c-59b097e2e433-kube-api-access-4mddt" (OuterVolumeSpecName: "kube-api-access-4mddt") pod "b8c597dc-2a26-427e-954c-59b097e2e433" (UID: "b8c597dc-2a26-427e-954c-59b097e2e433"). InnerVolumeSpecName "kube-api-access-4mddt". PluginName "kubernetes.io/projected", VolumeGidValue ""
I0317 05:59:18.266413      59 reconciler.go:384] "Volume detached for volume \"kube-api-access-4mddt\" (UniqueName: \"kubernetes.io/projected/b8c597dc-2a26-427e-954c-59b097e2e433-kube-api-access-4mddt\") on node \"63438637467c\" DevicePath \"\""
I0317 05:59:18.940441      59 pod_container_deletor.go:79] "Container not found in pod's containers" containerID="4b4ce53ce5be1b58135823754b6e3b93afaae44b8b683c7918ba27aae05d3a48"
W0317 06:02:38.683970      59 info.go:53] Couldn't collect info from any of the files in "/etc/machine-id,/var/lib/dbus/machine-id"
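For anyone reading a dump like the one above: the lines follow the klog header layout (severity letter `I`/`W`/`E`/`F`, month-day, time, thread id, `file:line]`), so the warnings and errors can be pulled out mechanically. Below is a minimal Python sketch, assuming that layout. `filter_klog` and `KLOG_LINE` are illustrative names, not part of any Flyte or Kubernetes tooling, and the sample lines are copied from the log above. Continuation lines of a multi-line entry do not carry a header, so this sketch simply skips them.

```python
import re

# klog header: severity letter, MMDD, HH:MM:SS.micros, thread id, "file:line]", message
KLOG_LINE = re.compile(
    r"^(?P<sev>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<tid>\d+) (?P<src>[^\]]+)\] (?P<msg>.*)$"
)


def filter_klog(lines, severities=("W", "E", "F")):
    """Yield (severity, source, message) for lines at the given severities."""
    for line in lines:
        match = KLOG_LINE.match(line)
        if match and match.group("sev") in severities:
            yield match.group("sev"), match.group("src"), match.group("msg")


# Three lines taken from the log above: one error, one info, one warning.
sample = [
    "E0317 05:57:50.613939      59 configmap.go:193] Couldn't get configMap kube-system/coredns: failed to sync configmap cache: timed out waiting for the condition",
    "I0317 05:58:36.194168      59 shared_informer.go:262] Caches are synced for resource quota",
    'W0317 06:02:38.683970      59 info.go:53] Couldn\'t collect info from any of the files in "/etc/machine-id,/var/lib/dbus/machine-id"',
]

for sev, src, msg in filter_klog(sample):
    print(sev, src, msg[:60])
```

To run it over a saved log instead of the sample, pass an open file object as `lines`; the function only iterates, so it does not load the whole file.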
flytectl version:
(flyte) ubuntu@ip-172-31-92-199:~$ flytectl version
{
  "App": "flytectl",
  "Build": "29da288",
  "Version": "0.6.34",
  "BuildTime": "2023-03-17 09:17:23.237655216 +0000 UTC m=+0.024939893"
}{
  "App": "controlPlane",
  "Build": "unknown",
  "Version": "unknown",
  "BuildTime": "2023-03-17 05:58:29.976870718 +0000 UTC m=+0.033316151"
}
sandbox cluster version: 1.4.1
replication: i had a fresh install of flytectl on a new server, and ran `flytectl demo start`
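Side note on the `flytectl version` output above: it is two JSON objects printed back to back (`}{`), which a plain `json.loads` rejects. Here is a minimal Python sketch for splitting such output, assuming each object is complete. `parse_concatenated_json` is a hypothetical helper name, and the sample string is a trimmed, well-formed version of the output above (with `BuildTime` dropped), not a verbatim capture.

```python
import json


def parse_concatenated_json(text):
    """Decode a string holding several JSON values written one after another."""
    decoder = json.JSONDecoder()
    objects = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        objects.append(obj)
    return objects


# Trimmed sample modelled on the output above (assumed well-formed).
sample_output = """{
  "App": "flytectl",
  "Build": "29da288",
  "Version": "0.6.34"
}{
  "App": "controlPlane",
  "Build": "unknown",
  "Version": "unknown"
}"""

for entry in parse_concatenated_json(sample_output):
    print(entry["App"], entry["Version"])
```

In the output above, the `controlPlane` entry reports `unknown` for both `Build` and `Version`, so for that deployment the sandbox version has to come from elsewhere.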
c
same or similar issue: i have the same screen appearing. I did a fresh install too. https://flyte-org.slack.com/archives/CP2HDHKE1/p1679045708684189?thread_ts=1678991494.797069&cid=CP2HDHKE1
s
I think this is an issue with flyteconsole `>=1.4.8`
I'm seeing the same issue after upgrading our staging cluster to the Flyte Release `1.4.1`. Downgrading flyteconsole to `1.4.7` helps.
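To check a deployed flyteconsole tag against the range discussed in this thread, here is a rough Python sketch. The lower bound (`1.4.8`) is the one reported just above. The upper bound (`1.5.1`) is taken from the flyteconsole release linked later in the thread and is an assumption about where the regression ends, not a confirmed changelog entry. `parse_version` and `is_affected` are hypothetical names, and the parser only handles plain `X.Y.Z` tags with an optional leading `v`.

```python
def parse_version(version):
    """Turn 'v1.4.8' or '1.4.8' into a comparable tuple like (1, 4, 8)."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


# Bounds as read from this thread (assumptions, see the note above the code).
FIRST_AFFECTED = (1, 4, 8)  # "flyteconsole >=1.4.8" shows the broken page
FIXED_IN = (1, 5, 1)        # release linked in the thread as carrying the fix


def is_affected(console_version):
    """Return True if the flyteconsole version falls in the reported range."""
    return FIRST_AFFECTED <= parse_version(console_version) < FIXED_IN


for tag in ("1.4.7", "1.4.8", "v1.5.1"):
    print(tag, is_affected(tag))
```

Tuple comparison is used instead of string comparison so that a tag such as `1.4.10` sorts after `1.4.8`.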
k
Cc @Eduardo Apolinario (eapolinario) / @Jason Porter
j
Hey @Sören Brunk - unfortunately yeah we had a regression regarding material-ui. But we have the fix out in https://github.com/flyteorg/flyteconsole/releases/tag/v1.5.1
cc: @Soham @Carina Ursu
s
Hey @Brian Tang, we've been trying to replicate the issue with 1.4.8 and we did see an error that caught our eye, but it wasn't this screen. Just curious, what is your stack configuration (flyte config) like?
e
@Brian Tang, @Sören Brunk, we just released Flyte 1.4.2 that contains a fix for this. Sorry for the regression, we're investing in testing+automation to ensure that this doesn't happen in the future.
b
1.4.2 looks good - thanks for the quick turnaround!