#ask-the-community

Frank Shen

11/07/2023, 5:21 PM
Hello, Does anyone know or hear that databricks_plugin has been used by any organizations in their production environment? https://docs.flyte.org/en/latest/deployment/plugins/webapi/databricks.html#databricks-plugin

Ketan (kumare3)

11/07/2023, 5:49 PM
@Frank Shen we are also migrating to Databricks agents, as they can be run both locally and remotely. That makes them much simpler to update (they are written in Python), and they offer similar power

Frank Shen

11/07/2023, 5:50 PM
@Ketan (kumare3), Thanks for the heads-up!

Ketan (kumare3)

11/07/2023, 5:50 PM
cc @L godlike (OSS contributor) has helped folks from Expedia also use the agent
we are working on making it all work locally etc
a little early, but eventually you should be able to migrate to the agent without many changes and super simplify your testing etc

Kevin Su

11/07/2023, 8:59 PM
HBO also uses it in production
cc @L godlike could you share the databricks doc

Frank Shen

11/07/2023, 9:48 PM
Hi @Kevin Su, I represent HBO. Unfortunately, Evan Sadler only tried it in Dev before he left. And I don’t know if it’s working in Dev either.
One is for agent, one is for plugin
Do you need help to set up the databricks plugin?

Frank Shen

11/08/2023, 2:54 AM
Hi @L godlike, Yes.

L godlike

11/08/2023, 2:55 AM
I recommend you use the databricks_plugin doc for now; the agent version hasn't been merged yet.
This PR can help you figure out more details about how to set it up.
If you need more help, please list your problems, and Kevin and I will try our best to help you.

Frank Shen

11/08/2023, 8:04 PM
Hi @L godlike, @Kevin Su. Before I decide to migrate my production workflow from the open source k8s Spark operator implementation to the Databricks agent or plugin, may I confirm a few things with you?
1. Is installing the k8s Spark operator still needed (I don’t think so, but just to double-check)? https://docs.flyte.org/en/latest/deployment/plugins/k8s/index.html#install-the-kubernetes-operator
2. Are the following configurations still needed? https://docs.flyte.org/en/latest/deployment/plugins/k8s/index.html#specify-plugin-configuration Specifically, this:
cluster_resource_manager:    <- Is this needed?
  enabled: true
  config:
    cluster_resources:
      refreshInterval: 5m
      templatePath: "/etc/flyte/clusterresource/templates"
      customData:
        - production:
            - projectQuotaCpu:
                value: "5"
            - projectQuotaMemory:
                value: "4000Mi"
        - staging:
            - projectQuotaCpu:
                value: "2"
            - projectQuotaMemory:
                value: "3000Mi"
        - development:
            - projectQuotaCpu:
                value: "4"
            - projectQuotaMemory:
                value: "3000Mi"
      refresh: 5m

  # -- Resource templates that should be applied
  templates:
    # -- Template for namespaces resources
    - key: aa_namespace
      value: |
        apiVersion: v1
        kind: Namespace
        metadata:
          name: {{ namespace }}
        spec:
          finalizers:
          - kubernetes

    - key: ab_project_resource_quota
      value: |
        apiVersion: v1
        kind: ResourceQuota
        metadata:
          name: project-quota
          namespace: {{ namespace }}
        spec:
          hard:
            limits.cpu: {{ projectQuotaCpu }}
            limits.memory: {{ projectQuotaMemory }}

    - key: ac_spark_role     <- Is this needed?
      value: |
        apiVersion: rbac.authorization.k8s.io/v1beta1
        kind: Role
        metadata:
          name: spark-role
          namespace: {{ namespace }}
        rules:
        - apiGroups: ["*"]
          resources:
          - pods
          verbs:
          - '*'
        - apiGroups: ["*"]
          resources:
          - services
          verbs:
          - '*'
        - apiGroups: ["*"]
          resources:
          - configmaps
          verbs:
          - '*'

    - key: ad_spark_service_account     <- Is this needed?
      value: |
        apiVersion: v1
        kind: ServiceAccount
        metadata:
          name: spark
          namespace: {{ namespace }}

    - key: ae_spark_role_binding     <- Is this needed?
      value: |
        apiVersion: rbac.authorization.k8s.io/v1beta1
        kind: RoleBinding
        metadata:
          name: spark-role-binding
          namespace: {{ namespace }}
        roleRef:
          apiGroup: rbac.authorization.k8s.io
          kind: Role
          name: spark-role
        subjects:
        - kind: ServiceAccount
          name: spark
          namespace: {{ namespace }}

sparkoperator:   <- Is this needed?
  enabled: true
  plugin_config:
    plugins:
      spark:
        # Edit the Spark configuration as you see fit
        spark-config-default:
          - spark.driver.cores: "1"
          - spark.hadoop.fs.s3a.aws.credentials.provider: "com.amazonaws.auth.DefaultAWSCredentialsProviderChain"
          - spark.kubernetes.allocation.batch.size: "50"
          - spark.hadoop.fs.s3a.acl.default: "BucketOwnerFullControl"
          - spark.hadoop.fs.s3n.impl: "org.apache.hadoop.fs.s3a.S3AFileSystem"
          - spark.hadoop.fs.AbstractFileSystem.s3n.impl: "org.apache.hadoop.fs.s3a.S3A"
          - spark.hadoop.fs.s3.impl: "org.apache.hadoop.fs.s3a.S3AFileSystem"
          - spark.hadoop.fs.AbstractFileSystem.s3.impl: "org.apache.hadoop.fs.s3a.S3A"
          - spark.hadoop.fs.s3a.impl: "org.apache.hadoop.fs.s3a.S3AFileSystem"
          - spark.hadoop.fs.AbstractFileSystem.s3a.impl: "org.apache.hadoop.fs.s3a.S3A"
          - spark.network.timeout: 600s
          - spark.executorEnv.KUBERNETES_REQUEST_TIMEOUT: 100000
          - spark.executor.heartbeatInterval: 60s
Also, what does the equivalent Databricks plugin configuration look like (in the Helm chart)? Could you point me to an example?
3. How is the Databricks Spark job logging integrated with the Flyte UI?
4. When running a Flyte Spark task on the server, is the --service-account spark option still required?