Alex Pozimenko
12/02/2022, 12:38 AM
fieldsV1:
  f:status:
    .:
    f:applicationState:
      .:
      f:state:
    f:driverInfo:
      .:
      f:podName:
      f:webUIAddress:
      f:webUIPort:
      f:webUIServiceName:
    f:executionAttempts:
    f:executorState:
      .:
      f:a7hhmd24d6hgw776vkfv-n0-0-exec-1:
    f:lastSubmissionAttemptTime:
    f:sparkApplicationId:
    f:submissionAttempts:
    f:submissionID:
    f:terminationTime:
Manager: spark-operator
Operation: Update
Time: 2022-12-01T02:11:05Z
Ketan (kumare3)
Alex Pozimenko
12/02/2022, 8:14 PM
FGXQ2R3ISZJ2FC_N3_0_N5_0_UI_SVC_PORT_4040_TCP_PROTO=tcp
FMFX6ESKONI6FO_N3_0_N4_0_UI_SVC_PORT_4040_TCP_PROTO=tcp
FTIBWPHPXO32XE_N3_0_N2_0_UI_SVC_PORT_4040_TCP_PROTO=tcp
F6K4ORSNOMFTLY_N3_0_N2_0_UI_SVC_PORT_4040_TCP_ADDR=172.20.2.240
FRUWTUSAANBRRA_N3_0_N3_0_UI_SVC_SERVICE_HOST=172.20.171.76
FE3RGZTYAPR14K_N3_0_N2_0_UI_SVC_PORT_4040_TCP=tcp://172.20.86.57:4040
FK5DWJ1IOU6QWC_N3_0_N4_0_UI_SVC_SERVICE_PORT=4040
FGX66OT2O1U4OW_N3_0_N3_0_UI_SVC_PORT_4040_TCP_PROTO=tcp
Hmm, the failed application should be cleared by the Spark operator, and Flyte should clear the workflow and everything after the GC interval.
For some reason we had 4.5K of both completed and failed applications just hanging there. I had to delete them using kubectl and then restart the Spark operator, Flyte admin, and propeller to clear the state. What is odd is that Spark wouldn't recover until after I restarted admin and propeller. Idk if both restarts were necessary, though, since I restarted them at the same time.
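A minimal sketch of the manual cleanup described above, not a confirmed procedure. It assumes the application state lives at status.applicationState.state (the path visible in the fieldsV1 dump earlier in the thread) and that the lingering states are reported as COMPLETED and FAILED. The function names are hypothetical. It only builds kubectl commands from a JSON listing; it does not run them.

```python
# States treated as terminal; an assumption based on the message above
# ("completed and failed"). Adjust if your operator reports other values.
TERMINAL_STATES = {"COMPLETED", "FAILED"}


def terminal_apps(listing: dict) -> list[tuple[str, str]]:
    """Return (namespace, name) for each SparkApplication in a terminal state."""
    found = []
    for item in listing.get("items", []):
        # Path taken from the managed-fields dump: status.applicationState.state
        state = (
            item.get("status", {})
            .get("applicationState", {})
            .get("state", "")
        )
        if state in TERMINAL_STATES:
            meta = item["metadata"]
            found.append((meta["namespace"], meta["name"]))
    return found


def delete_commands(listing: dict) -> list[str]:
    """Build one kubectl delete command per terminal application (not executed)."""
    return [
        f"kubectl delete sparkapplication {name} -n {namespace}"
        for namespace, name in terminal_apps(listing)
    ]
```

The input would be the parsed output of something like `kubectl get sparkapplications --all-namespaces -o json`. Printing the commands first, rather than deleting directly, leaves room to review a list of several thousand objects before removing anything.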
Ketan (kumare3)
David Espejo (he/him)
01/19/2023, 3:48 PM
Alex Pozimenko
01/19/2023, 7:10 PM