Hello,
I was exploring Kubernetes Spark jobs and tried to implement one by following this Documentation.
This is done in an EKS setup.
I created a custom Docker image for Spark as specified in the documentation. The only change I made was commenting out the following lines in the Dockerfile:
# Copy the makefile targets to expose on the container. This makes it easier to register.
# Delete this after we update CI to not serialize inside the container
# COPY k8s_spark/sandbox.config /root
# Copy the actual code
# COPY k8s_spark/ /root/k8s_spark
# This tag is supplied by the build script and will be used to determine the version
# when registering tasks, workflows, and launch plans
# ARG tag
# ENV FLYTE_INTERNAL_IMAGE $tag
# Copy over the helper script that the SDK relies on
# RUN cp ${VENV}/bin/flytekit_venv /usr/local/bin/
# RUN chmod a+x /usr/local/bin/flytekit_venv
I registered the sample PySpark workflow with this image, and I am running into this issue:
failed
SYSTEM ERROR! Contact platform administrators.
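For context, the workflow I registered is essentially the pi-estimation example from the documentation. Here is a minimal sketch of it (assuming flytekitplugins-spark is installed; the spark_conf values are illustrative, not my exact settings):

```
# Minimal sketch of the sample PySpark workflow, based on the
# k8s_spark pi-estimation example; spark_conf values are illustrative.
import random

import flytekit
from flytekit import task, workflow
from flytekitplugins.spark import Spark


def inside(_: int) -> int:
    # Sample a random point and check whether it falls in the unit circle.
    x, y = random.random() * 2 - 1, random.random() * 2 - 1
    return 1 if x * x + y * y <= 1 else 0


@task(
    task_config=Spark(
        spark_conf={
            "spark.driver.memory": "1000M",
            "spark.executor.memory": "1000M",
            "spark.executor.instances": "2",
        }
    )
)
def hello_spark(partitions: int) -> float:
    # The Spark session is provided by the Flyte task context at run time.
    n = 100000 * partitions
    sess = flytekit.current_context().spark_session
    count = (
        sess.sparkContext.parallelize(range(1, n + 1), partitions)
        .map(inside)
        .reduce(lambda a, b: a + b)
    )
    return 4.0 * count / n


@workflow
def my_spark(partitions: int = 50) -> float:
    return hello_spark(partitions=partitions)
```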
When looking at the logs in AWS, I found a warning that it was unable to load the native-hadoop library. Could this be the cause of the issue? Any ideas?
{"log":"22/11/24 07:03:54 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
","stream":"stderr","docker":{"container_id":"XXX"},"kubernetes":{"container_name":"YYY","namespace_name":"flytesnacks-development","pod_name":"ZZZ","pod_id":"AAA","namespace_id":"BBB","namespace_labels":{"kubernetes_io/metadata_name":"flytesnacks-development"}}}