# ask-the-community
k
Hello! I got a sandbox up and running locally and have been trying out the examples in the cookbook. I went through the Spark plugin setup here and registered the latest version of the plugin example (0.3.66), but the examples fail when executed as-is: the Spark context can't be initialized. I re-ran all the steps on the plugin install page, but I'm wondering whether something needs to be changed in the configuration or restarted, or whether the issue is in the container used by the registered workflows. Any guidance on how best to debug this would be very much appreciated!
Traceback (most recent call last):

      File "/opt/venv/lib/python3.8/site-packages/flytekit/exceptions/scopes.py", line 165, in system_entry_point
        return wrapped(*args, **kwargs)
      File "/opt/venv/lib/python3.8/site-packages/flytekit/core/base_task.py", line 464, in dispatch_execute
        new_user_params = self.pre_execute(ctx.user_space_params)
      File "/opt/venv/lib/python3.8/site-packages/flytekitplugins/spark/task.py", line 123, in pre_execute
        self.sess = sess_builder.getOrCreate()
      File "/opt/venv/lib/python3.8/site-packages/pyspark/sql/session.py", line 228, in getOrCreate
        sc = SparkContext.getOrCreate(sparkConf)
      File "/opt/venv/lib/python3.8/site-packages/pyspark/context.py", line 392, in getOrCreate
        SparkContext(conf=conf or SparkConf())
      File "/opt/venv/lib/python3.8/site-packages/pyspark/context.py", line 146, in __init__
        self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,
      File "/opt/venv/lib/python3.8/site-packages/pyspark/context.py", line 209, in _do_init
        self._jsc = jsc or self._initialize_context(self._conf._jconf)
      File "/opt/venv/lib/python3.8/site-packages/pyspark/context.py", line 329, in _initialize_context
        return self._jvm.JavaSparkContext(jconf)
      File "/opt/venv/lib/python3.8/site-packages/py4j/java_gateway.py", line 1585, in __call__
        return_value = get_return_value(
      File "/opt/venv/lib/python3.8/site-packages/py4j/protocol.py", line 334, in get_return_value
        raise Py4JError(

Message:

    An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext

SYSTEM ERROR! Contact platform administrators.
The full log from the pod:
++ id -u
+ myuid=0
++ id -g
+ mygid=0
+ set +e
++ getent passwd 0
+ uidentry=root:x:0:0:root:/root:/bin/bash
+ set -e
+ '[' -z root:x:0:0:root:/root:/bin/bash ']'
+ SPARK_CLASSPATH=':/opt/spark/jars/*'
+ env
+ grep SPARK_JAVA_OPT_
+ sed 's/[^=]*=\(.*\)/\1/g'
+ sort -t_ -k4 -n
+ readarray -t SPARK_EXECUTOR_JAVA_OPTS
+ '[' -n '' ']'
+ '[' '' == 2 ']'
+ '[' '' == 3 ']'
+ '[' -n '' ']'
+ '[' -z ']'
+ case "$1" in
+ echo 'Non-spark-on-k8s command provided, proceeding in pass-through mode...'
+ CMD=("$@")
+ exec /usr/bin/tini -s -- pyflyte-execute --inputs s3://my-s3-bucket/metadata/propeller/flytesnacks-development-nw0xm6ntwz/n0/data/inputs.pb --output-prefix s3://my-s3-bucket/metadata/propeller/flytesnacks-development-nw0xm6ntwz/n0/data/1 --raw-output-data-prefix s3://my-s3-bucket/kb/nw0xm6ntwz-n0-1 --checkpoint-path s3://my-s3-bucket/kb/nw0xm6ntwz-n0-1/_flytecheckpoints --prev-checkpoint s3://my-s3-bucket/vc/nw0xm6ntwz-n0-0/_flytecheckpoints --resolver flytekit.core.python_auto_container.default_task_resolver -- task-module k8s_spark.pyspark_pi task-name hello_spark
Non-spark-on-k8s command provided, proceeding in pass-through mode...
22/04/14 15:19:55 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
ERROR:root:Exception while sending command.
Traceback (most recent call last):
  File "/opt/venv/lib/python3.8/site-packages/py4j/clientserver.py", line 480, in send_command
    raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/opt/venv/lib/python3.8/site-packages/py4j/java_gateway.py", line 1038, in send_command
    response = connection.send_command(command)
  File "/opt/venv/lib/python3.8/site-packages/py4j/clientserver.py", line 503, in send_command
    raise Py4JNetworkError(
py4j.protocol.Py4JNetworkError: Error while sending or receiving
{"asctime": "2022-04-14 15:20:06,729", "name": "flytekit.entrypoint", "levelname": "ERROR", "message": "!! Begin System Error Captured by Flyte !!"}
{"asctime": "2022-04-14 15:20:06,729", "name": "flytekit.entrypoint", "levelname": "ERROR", "message": "Traceback (most recent call last):\n\n      File \"/opt/venv/lib/python3.8/site-packages/flytekit/exceptions/scopes.py\", line 165, in system_entry_point\n        return wrapped(*args, **kwargs)\n      File \"/opt/venv/lib/python3.8/site-packages/flytekit/core/base_task.py\", line 464, in dispatch_execute\n        new_user_params = self.pre_execute(ctx.user_space_params)\n      File \"/opt/venv/lib/python3.8/site-packages/flytekitplugins/spark/task.py\", line 123, in pre_execute\n        self.sess = sess_builder.getOrCreate()\n      File \"/opt/venv/lib/python3.8/site-packages/pyspark/sql/session.py\", line 228, in getOrCreate\n        sc = SparkContext.getOrCreate(sparkConf)\n      File \"/opt/venv/lib/python3.8/site-packages/pyspark/context.py\", line 392, in getOrCreate\n        SparkContext(conf=conf or SparkConf())\n      File \"/opt/venv/lib/python3.8/site-packages/pyspark/context.py\", line 146, in __init__\n        self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,\n      File \"/opt/venv/lib/python3.8/site-packages/pyspark/context.py\", line 209, in _do_init\n        self._jsc = jsc or self._initialize_context(self._conf._jconf)\n      File \"/opt/venv/lib/python3.8/site-packages/pyspark/context.py\", line 329, in _initialize_context\n        return self._jvm.JavaSparkContext(jconf)\n      File \"/opt/venv/lib/python3.8/site-packages/py4j/java_gateway.py\", line 1585, in __call__\n        return_value = get_return_value(\n      File \"/opt/venv/lib/python3.8/site-packages/py4j/protocol.py\", line 334, in get_return_value\n        raise Py4JError(\n\nMessage:\n\n    An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext\n\nSYSTEM ERROR! Contact platform administrators."}
{"asctime": "2022-04-14 15:20:06,730", "name": "flytekit.entrypoint", "levelname": "ERROR", "message": "!! End Error Captured by Flyte !!"}
s
Hello @Katrina P! What’s the pyspark version you’re using?
k
Hello! I'm following the k8s operator setup in the guide and using the packaged release referenced there. Looking at the code for that in https://github.com/flyteorg/flytesnacks/blob/v0.3.66/cookbook/integrations/kubernetes/k8s_spark/requirements.txt, it looks like the pyspark version would be 3.2.1.
s
Can you try with pyspark 3.0.1?
k
Sure, I'll try that. I can pull it into my workspace and change the version there. To reproduce what I was seeing, I was just using the packaged release via:
flytectl register files --config ~/.flyte/config.yaml https://github.com/flyteorg/flytesnacks/releases/download/v0.3.66/snacks-cookbook-integrations-kubernetes-k8s_spark.tar.gz --archive -p flytesnacks -d development --version latest
s
Gotcha. I’ve created a PR to update the spark cluster version: https://github.com/flyteorg/flytekit/pull/954. pyspark 3.0.1 should unblock you for now. Sorry for the confusion.
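In case it helps while debugging, here is a rough sketch for checking which pyspark version a task image actually has installed and whether it lines up with the version you expect. This assumes the failure comes from a pyspark/Spark version mismatch, which is what the 3.0.1 suggestion implies but isn't confirmed here. The helper names `installed_version` and `same_minor` are made up for illustration and are not part of flytekit or pyspark.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def same_minor(installed: str, expected: str) -> bool:
    """Compare only the major.minor parts, e.g. 3.0.1 and 3.0.2 count as matching."""
    return installed.split(".")[:2] == expected.split(".")[:2]


if __name__ == "__main__":
    # Assumption: 3.0.1 is the version suggested in this thread as a workaround.
    expected = "3.0.1"
    found = installed_version("pyspark")
    if found is None:
        print("pyspark is not installed in this environment")
    elif same_minor(found, expected):
        print(f"pyspark {found} matches the expected {expected} line")
    else:
        print(f"pyspark {found} differs from the expected {expected}")
```

Running this inside the task container (for example via a shell in the pod) should show whether the image picked up the pinned version from requirements.txt.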
k
No problem! Thanks so much, I'm trying to build it locally now.
btw, if you update the spark version, would the checksum validation on line 26 fail?
k
cc @Yee