# ask-the-community

Rodrigo Baron

02/04/2022, 5:30 PM
I've deployed Flyte using kubeadm along with spark-on-k8s-operator, which runs this example successfully. I then tried to run the example from flytesnacks and hit some issues (logs in 🧵). Does anyone know about
Non-spark-on-k8s command provided
?
++ id -u
+ myuid=0
++ id -g
+ mygid=0
+ set +e
++ getent passwd 0
+ uidentry=root:x:0:0:root:/root:/bin/bash
+ set -e
+ '[' -z root:x:0:0:root:/root:/bin/bash ']'
+ SPARK_CLASSPATH=':/opt/spark/jars/*'
+ env
+ grep SPARK_JAVA_OPT_
+ sort -t_ -k4 -n
+ sed 's/[^=]*=\(.*\)/\1/g'
+ readarray -t SPARK_EXECUTOR_JAVA_OPTS
+ '[' -n '' ']'
+ '[' '' == 2 ']'
+ '[' '' == 3 ']'
+ '[' -n '' ']'
+ '[' -z ']'
+ case "$1" in
+ echo 'Non-spark-on-k8s command provided, proceeding in pass-through mode...'
+ CMD=("$@")
Non-spark-on-k8s command provided, proceeding in pass-through mode...
+ exec /usr/bin/tini -s -- pyflyte-execute --inputs s3://my-s3-bucket/metadata/propeller/flytesnacks-development-p5jty2p9e5/k8ssparkpysparkpihellospark/data/inputs.pb --output-prefix s3://my-s3-bucket/metadata/propeller/flytesnacks-development-p5jty2p9e5/k8ssparkpysparkpihellospark/data/0 --raw-output-data-prefix s3://my-s3-bucket/ko/p5jty2p9e5-f3cpp6wq-0 --resolver flytekit.core.python_auto_container.default_task_resolver -- task-module k8s_spark.pyspark_pi task-name hello_spark
Welcome to Flyte! Version: 0.22.2
Attempting to run with flytekit.core.python_auto_container.default_task_resolver...
WARNING:root:No config file provided or invalid flyte config_file_path flytekit.config specified.
Using user directory /tmp/flyte/20220204_165445/sandbox/local_flytekit/a504ea1e3e3c62771228ed2d83186827
{"asctime": "2022-02-04 16:54:57,721", "name": "flytekit", "levelname": "DEBUG", "message": "Task returns unnamed native tuple <class 'float'>"}
DEBUG:flytekit:Task returns unnamed native tuple <class 'float'>
{"asctime": "2022-02-04 16:54:57,722", "name": "flytekit", "levelname": "DEBUG", "message": "Task returns unnamed native tuple <class 'int'>"}
DEBUG:flytekit:Task returns unnamed native tuple <class 'int'>
{"asctime": "2022-02-04 16:54:57,821", "name": "flytekit", "levelname": "DEBUG", "message": "Task returns unnamed native tuple <class 'float'>"}
DEBUG:flytekit:Task returns unnamed native tuple <class 'float'>
No images specified, will use the default image
Running native-typed task
INFO:root:Entering timed context: Copying (s3://my-s3-bucket/metadata/propeller/flytesnacks-development-p5jty2p9e5/k8ssparkpysparkpihellospark/data/inputs.pb -> /tmp/flyte7c4gkp2a/local_flytekit/inputs.pb)
INFO:root:Output of command '['aws', '--endpoint-url', 'http://192.168.0.222:30084', 's3', 'cp', 's3://my-s3-bucket/metadata/propeller/flytesnacks-development-p5jty2p9e5/k8ssparkpysparkpihellospark/data/inputs.pb', '/tmp/flyte7c4gkp2a/local_flytekit/inputs.pb']':
b'Completed 22 Bytes/22 Bytes (248 Bytes/s) with 1 file(s) remaining\rdownload: s3://my-s3-bucket/metadata/propeller/flytesnacks-development-p5jty2p9e5/k8ssparkpysparkpihellospark/data/inputs.pb to ../tmp/flyte7c4gkp2a/local_flytekit/inputs.pb\n'

INFO:root:Exiting timed context: Copying (s3://my-s3-bucket/metadata/propeller/flytesnacks-development-p5jty2p9e5/k8ssparkpysparkpihellospark/data/inputs.pb -> /tmp/flyte7c4gkp2a/local_flytekit/inputs.pb) [Wall Time: 9.195532751000428s, Process Time: 0.0055688540000000675s]
22/02/04 16:56:07 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
INFO:py4j.java_gateway:Error while receiving.
Traceback (most recent call last):
  File "/opt/venv/lib/python3.8/site-packages/py4j/java_gateway.py", line 1207, in send_command
    raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty
ERROR:root:Exception while sending command.
Traceback (most recent call last):
  File "/opt/venv/lib/python3.8/site-packages/py4j/java_gateway.py", line 1207, in send_command
    raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/venv/lib/python3.8/site-packages/py4j/java_gateway.py", line 1033, in send_command
    response = connection.send_command(command)
  File "/opt/venv/lib/python3.8/site-packages/py4j/java_gateway.py", line 1211, in send_command
    raise Py4JNetworkError(
py4j.protocol.Py4JNetworkError: Error while receiving
ERROR:root:!! Begin System Error Captured by Flyte !!
ERROR:root:Traceback (most recent call last):

      File "/opt/venv/lib/python3.8/site-packages/flytekit/common/exceptions/scopes.py", line 165, in system_entry_point
        return wrapped(*args, **kwargs)
      File "/opt/venv/lib/python3.8/site-packages/flytekit/core/base_task.py", line 442, in dispatch_execute
        new_user_params = self.pre_execute(ctx.user_space_params)
      File "/opt/venv/lib/python3.8/site-packages/flytekitplugins/spark/task.py", line 122, in pre_execute
        self.sess = sess_builder.getOrCreate()
      File "/opt/venv/lib/python3.8/site-packages/pyspark/sql/session.py", line 228, in getOrCreate
        sc = SparkContext.getOrCreate(sparkConf)
      File "/opt/venv/lib/python3.8/site-packages/pyspark/context.py", line 384, in getOrCreate
        SparkContext(conf=conf or SparkConf())
      File "/opt/venv/lib/python3.8/site-packages/pyspark/context.py", line 146, in __init__
        self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,
      File "/opt/venv/lib/python3.8/site-packages/pyspark/context.py", line 209, in _do_init
        self._jsc = jsc or self._initialize_context(self._conf._jconf)
      File "/opt/venv/lib/python3.8/site-packages/pyspark/context.py", line 321, in _initialize_context
        return self._jvm.JavaSparkContext(jconf)
      File "/opt/venv/lib/python3.8/site-packages/py4j/java_gateway.py", line 1568, in __call__
        return_value = get_return_value(
      File "/opt/venv/lib/python3.8/site-packages/py4j/protocol.py", line 334, in get_return_value
        raise Py4JError(

Message:

    An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext

SYSTEM ERROR! Contact platform administrators.
ERROR:root:!! End Error Captured by Flyte !!
INFO:root:Entering timed context: Writing (/tmp/flyte7c4gkp2a/local_flytekit/engine_dir -> s3://my-s3-bucket/metadata/propeller/flytesnacks-development-p5jty2p9e5/k8ssparkpysparkpihellospark/data/0)
INFO:root:Output of command '['aws', '--endpoint-url', 'http://192.168.0.222:30084', 's3', 'cp', '--recursive', '--acl', 'bucket-owner-full-control', '/tmp/flyte7c4gkp2a/local_flytekit/engine_dir', 's3://my-s3-bucket/metadata/propeller/flytesnacks-development-p5jty2p9e5/k8ssparkpysparkpihellospark/data/0']':
b'Completed 1.7 KiB/1.7 KiB (291.7 KiB/s) with 1 file(s) remaining\rupload: ../tmp/flyte7c4gkp2a/local_flytekit/engine_dir/error.pb to s3://my-s3-bucket/metadata/propeller/flytesnacks-development-p5jty2p9e5/k8ssparkpysparkpihellospark/data/0/error.pb\n'

INFO:root:Exiting timed context: Writing (/tmp/flyte7c4gkp2a/local_flytekit/engine_dir -> s3://my-s3-bucket/metadata/propeller/flytesnacks-development-p5jty2p9e5/k8ssparkpysparkpihellospark/data/0) [Wall Time: 11.491473693000444s, Process Time: 0.009896259000000018s]
INFO:root:Engine folder written successfully to the output prefix s3://my-s3-bucket/metadata/propeller/flytesnacks-development-p5jty2p9e5/k8ssparkpysparkpihellospark/data/0
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server (127.0.0.1:37755)
Traceback (most recent call last):
  File "/opt/venv/lib/python3.8/site-packages/py4j/java_gateway.py", line 977, in _get_connection
    connection = self.deque.pop()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/venv/lib/python3.8/site-packages/py4j/java_gateway.py", line 1115, in start
    self.socket.connect((self.address, self.port))
ConnectionRefusedError: [Errno 111] Connection refused
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server (127.0.0.1:37755)
Traceback (most recent call last):
  File "/opt/venv/lib/python3.8/site-packages/py4j/java_gateway.py", line 977, in _get_connection
    connection = self.deque.pop()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/venv/lib/python3.8/site-packages/py4j/java_gateway.py", line 1115, in start
    self.socket.connect((self.address, self.port))
ConnectionRefusedError: [Errno 111] Connection refused
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server (127.0.0.1:37755)
Traceback (most recent call last):
  File "/opt/venv/lib/python3.8/site-packages/py4j/java_gateway.py", line 977, in _get_connection
    connection = self.deque.pop()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/venv/lib/python3.8/site-packages/py4j/java_gateway.py", line 1115, in start
    self.socket.connect((self.address, self.port))
ConnectionRefusedError: [Errno 111] Connection refused
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server (127.0.0.1:37755)
Traceback (most recent call last):
  File "/opt/venv/lib/python3.8/site-packages/py4j/java_gateway.py", line 977, in _get_connection
    connection = self.deque.pop()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/venv/lib/python3.8/site-packages/py4j/java_gateway.py", line 1115, in start
    self.socket.connect((self.address, self.port))
ConnectionRefusedError: [Errno 111] Connection refused
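
For context: the task being executed here, `k8s_spark.pyspark_pi.hello_spark` from flytesnacks, estimates π by Monte Carlo sampling. Stripped of Spark, the core computation looks roughly like this (a pure-Python sketch; the names are illustrative, and the real example distributes the same map/reduce across the cluster via `sparkContext.parallelize(...).map(...).reduce(add)`):

```python
import random
from functools import reduce
from operator import add


def inside_unit_circle(_):
    # Sample a point uniformly in the [-1, 1] x [-1, 1] square and
    # report whether it lands inside the unit circle.
    x = random.random() * 2 - 1
    y = random.random() * 2 - 1
    return 1 if x * x + y * y <= 1 else 0


def estimate_pi(n: int) -> float:
    # Fraction of points inside the circle approximates pi/4.
    count = reduce(add, map(inside_unit_circle, range(n)))
    return 4.0 * count / n
```

With enough samples the estimate converges on π, which is all the failing task is trying to do; the crash above happens before any of this runs, while the SparkContext is being created.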

Haytham Abuelfutuh

02/04/2022, 5:48 PM
@Yee @Eduardo Apolinario (eapolinario) would you be able to help @Rodrigo Baron?

Yee

02/04/2022, 8:46 PM
hey @Rodrigo Baron when you get a chance, could you send us the pod specs for the spark operator you have running and the task pod that failed?
i’ll try that on my end as well and see what the differences are.
from there hopefully we can try to narrow it down
i’m not familiar with the setup prescribed by the py-pi yaml files (not that i’m terribly familiar with the helm chart we use either, but at least that’s one difference we can start to dig into)

Rodrigo Baron

02/04/2022, 9:03 PM
this is the pod running the flyte task:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
    cni.projectcalico.org/containerID: 501668ab8335b3661174264ea733d6a0589a733172a8ae3a388d8e99d434abfa
    cni.projectcalico.org/podIP: 172.29.77.31/32
    cni.projectcalico.org/podIPs: 172.29.77.31/32
  creationTimestamp: "2022-02-04T21:01:39Z"
  labels:
    domain: development
    execution-id: byn0i5mak6
    interruptible: "false"
    node-id: k8ssparkpysparkpihellospark
    project: flytesnacks
    shard-key: "11"
    task-name: k8s-spark-pyspark-pi-hello-spark
    workflow-name: flytegen-k8s-spark-pyspark-pi-hello-spark
  name: byn0i5mak6-f3cpp6wq-0
  namespace: flytesnacks-development
  ownerReferences:
  - apiVersion: flyte.lyft.com/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: flyteworkflow
    name: byn0i5mak6
    uid: a0c7d530-8416-40de-9a70-11024bc7a7a6
  resourceVersion: "1081142"
  uid: 731da94e-2786-4d23-9366-ebe7a841f62e
spec:
  containers:
  - args:
    - pyflyte-execute
    - --inputs
    - s3://my-s3-bucket/metadata/propeller/flytesnacks-development-byn0i5mak6/k8ssparkpysparkpihellospark/data/inputs.pb
    - --output-prefix
    - s3://my-s3-bucket/metadata/propeller/flytesnacks-development-byn0i5mak6/k8ssparkpysparkpihellospark/data/0
    - --raw-output-data-prefix
    - s3://my-s3-bucket/e3/byn0i5mak6-f3cpp6wq-0
    - --resolver
    - flytekit.core.python_auto_container.default_task_resolver
    - --
    - task-module
    - k8s_spark.pyspark_pi
    - task-name
    - hello_spark
    env:
    - name: FLYTE_INTERNAL_IMAGE
      value: rodrigobaron/flyte:0.0.4
    - name: FLYTE_INTERNAL_EXECUTION_WORKFLOW
      value: flytesnacks:development:.flytegen.k8s_spark.pyspark_pi.hello_spark
    - name: FLYTE_INTERNAL_EXECUTION_ID
      value: byn0i5mak6
    - name: FLYTE_INTERNAL_EXECUTION_PROJECT
      value: flytesnacks
    - name: FLYTE_INTERNAL_EXECUTION_DOMAIN
      value: development
    - name: FLYTE_ATTEMPT_NUMBER
      value: "0"
    - name: FLYTE_INTERNAL_TASK_PROJECT
      value: flytesnacks
    - name: FLYTE_INTERNAL_TASK_DOMAIN
      value: development
    - name: FLYTE_INTERNAL_TASK_NAME
      value: k8s_spark.pyspark_pi.hello_spark
    - name: FLYTE_INTERNAL_TASK_VERSION
      value: v1
    - name: FLYTE_INTERNAL_PROJECT
      value: flytesnacks
    - name: FLYTE_INTERNAL_DOMAIN
      value: development
    - name: FLYTE_INTERNAL_NAME
      value: k8s_spark.pyspark_pi.hello_spark
    - name: FLYTE_INTERNAL_VERSION
      value: v1
    - name: FLYTE_AWS_ENDPOINT
      value: http://192.168.0.222:30084
    - name: FLYTE_AWS_ACCESS_KEY_ID
      value: minio
    - name: FLYTE_AWS_SECRET_ACCESS_KEY
      value: miniostorage
    image: rodrigobaron/flyte:0.0.4
    imagePullPolicy: IfNotPresent
    name: byn0i5mak6-f3cpp6wq-0
    resources:
      limits:
        cpu: 100m
        memory: 200Mi
      requests:
        cpu: 100m
        memory: 200Mi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: FallbackToLogsOnError
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-hg2dw
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: k8s
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Never
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: kube-api-access-hg2dw
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2022-02-04T21:01:39Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2022-02-04T21:01:42Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2022-02-04T21:01:42Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2022-02-04T21:01:39Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://184942b452022d150ab107adbe2134be7621907b834cb0c25cc86754bb710d04
    image: rodrigobaron/flyte:0.0.4
    imageID: docker-pullable://rodrigobaron/flyte@sha256:bf082fbb2bb7626956d2976cb418c8fe7344c82279e7b628e3d18697c988e0c3
    lastState: {}
    name: byn0i5mak6-f3cpp6wq-0
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2022-02-04T21:01:41Z"
  hostIP: 192.168.0.222
  phase: Running
  podIP: 172.29.77.31
  podIPs:
  - ip: 172.29.77.31
  qosClass: Guaranteed
  startTime: "2022-02-04T21:01:39Z"
this is the pod running the spark-operator:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    cni.projectcalico.org/containerID: f4e801034ed42fe8954ef4a40ae13261719fc332ef14cd349e6ab039433fa365
    cni.projectcalico.org/podIP: 172.29.77.32/32
    cni.projectcalico.org/podIPs: 172.29.77.32/32
    prometheus.io/path: /metrics
    prometheus.io/port: "10254"
    prometheus.io/scrape: "true"
  creationTimestamp: "2022-02-04T14:17:10Z"
  generateName: flyte-sparkoperator-86bc9b4dc9-
  labels:
    app.kubernetes.io/instance: flyte
    app.kubernetes.io/name: sparkoperator
    pod-template-hash: 86bc9b4dc9
  name: flyte-sparkoperator-86bc9b4dc9-npgxp
  namespace: default
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: flyte-sparkoperator-86bc9b4dc9
    uid: 71878b96-d5d4-473a-91bf-8b31488b3427
  resourceVersion: "997418"
  uid: 10d48af7-cb93-43d9-8be0-850e27ef167a
spec:
  containers:
  - args:
    - -v=2
    - -logtostderr
    - -namespace=
    - -ingress-url-format=
    - -controller-threads=10
    - -resync-interval=30
    - -enable-batch-scheduler=false
    - -enable-metrics=true
    - -metrics-labels=app_type
    - -metrics-port=10254
    - -metrics-endpoint=/metrics
    - -metrics-prefix=
    - -enable-resource-quota-enforcement=false
    image: gcr.io/spark-operator/spark-operator:v1beta2-1.2.0-3.0.0
    imagePullPolicy: IfNotPresent
    name: sparkoperator
    ports:
    - containerPort: 10254
      name: metrics
      protocol: TCP
    resources:
      limits:
        cpu: 100m
        memory: 300Mi
      requests:
        cpu: 100m
        memory: 300Mi
    securityContext: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-ppdjt
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: k8s
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: flyte-sparkoperator
  serviceAccountName: flyte-sparkoperator
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: kube-api-access-ppdjt
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2022-02-04T14:17:10Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2022-02-04T14:17:14Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2022-02-04T14:17:14Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2022-02-04T14:17:10Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://6a7e8c352ffdc324c549815e6327c58f87b80828b72db9fb0980945609c01524
    image: gcr.io/spark-operator/spark-operator:v1beta2-1.2.0-3.0.0
    imageID: docker-pullable://gcr.io/spark-operator/spark-operator@sha256:a8bb2e06fce6c3b140d952fd978a3044d55e34e7e8fb6f510e095549f90ee6d2
    lastState: {}
    name: sparkoperator
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2022-02-04T14:17:14Z"
  hostIP: 192.168.0.222
  phase: Running
  podIP: 172.29.77.32
  podIPs:
  - ip: 172.29.77.32
  qosClass: Guaranteed
  startTime: "2022-02-04T14:17:10Z"

Yee

02/04/2022, 10:15 PM
and remind me what version of flytekit you have?
oh and also, any obvious logs in the spark operator pod?
like anything erroring etc
just looking at the operator pod, not really seeing any differences. the resources you have seem a bit low but that’s it.
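
Worth noting: the failed task pod above caps its container at 200Mi, and the Spark driver JVM has to start inside that container. The cluster-wide fallback for tasks that don't request resources explicitly lives in FlytePropeller's k8s plugin config; a rough sketch of what raising it might look like (key names assume a typical `plugins.k8s` block, not this deployment's exact file):

```yaml
plugins:
  k8s:
    # Defaults applied when a task doesn't request resources itself.
    # 200Mi is tight for a JVM-backed Spark driver; something on the
    # order of 1Gi gives the SparkContext room to start.
    default-cpus: "500m"
    default-memory: 1Gi
```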

Rodrigo Baron

02/05/2022, 11:39 AM
if I increase the memory the task runs successfully, but there aren't any new logs about the spark job in the spark-operator pod
the most obvious thing is this log:
+ readarray -t SPARK_EXECUTOR_JAVA_OPTS
+ '[' -n '' ']'
+ '[' '' == 2 ']'
+ '[' '' == 3 ']'
+ '[' -n '' ']'
+ '[' -z ']'
+ case "$1" in
+ echo 'Non-spark-on-k8s command provided, proceeding in pass-through mode...'
i'm using the latest version of flytekit: 0.26.1
ooh,
spark
is missing in my conf
enabled-plugins:
this is the fix .. thanks for your support 🙂
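
For readers hitting the same error: the fix Rodrigo describes is enabling the Spark plugin in FlytePropeller's task-plugins configuration. The relevant block looks roughly like this (a sketch of a typical propeller config, not his exact file):

```yaml
tasks:
  task-plugins:
    enabled-plugins:
      - container
      - sidecar
      # Without this entry, Spark tasks fall back to the plain
      # container plugin: propeller launches an ordinary pod, the
      # Spark image's entrypoint sees a non-driver command, and you
      # get the "Non-spark-on-k8s command provided" pass-through log.
      - spark
    default-for-task-types:
      container: container
      sidecar: sidecar
      spark: spark
```

With `spark` enabled, propeller hands the task to the spark-on-k8s-operator as a SparkApplication instead of running `pyflyte-execute` in a bare pod.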

Yee

02/07/2022, 5:54 PM
sorry about this. glad you figured it out.
didn’t even consider this, definitely should have
yeah it was weird that you were able to get the pod spec… typically what happens is that the worker pods are killed immediately by the driver