Hello, Does anyone know or hear that databricks_pl...
# ask-the-community
f
Hello, does anyone know of any organizations using the databricks_plugin in their production environment? https://docs.flyte.org/en/latest/deployment/plugins/webapi/databricks.html#databricks-plugin
k
@Frank Shen we are also migrating to Databricks agents, as they can run both locally and remotely. They are much simpler to update (being written in Python) and offer similar power
f
@Ketan (kumare3), Thanks for the heads-up!
k
cc @L godlike (OSS contributor), who has also helped folks from Expedia use the agent
we are working on making it all work locally etc
a little early, but eventually you should be able to migrate to the agent without many changes and super simplify your testing etc
k
HBO also uses it in production
cc @L godlike could you share the databricks doc
f
Hi @Kevin Su, I represent HBO. Unfortunately, Evan Sadler only tried it in Dev before he left. And I don’t know if it’s working in Dev either.
One is for agent, one is for plugin
Do you need help to set up the databricks plugin?
f
Hi @L godlike, Yes.
l
I recommend using the databricks_plugin doc for now; the agent version hasn't been merged yet.
This PR can help you figure out more details about how to set it up
If you need more help, please describe your problem; Kevin and I will do our best to help you.
f
Hi @L godlike, @Kevin Su. Before I decide whether to migrate my production workflow from the open source k8s spark operator implementation to the databricks agent or plugin, may I confirm a few things with you?
1. Is installing the k8s spark operator still needed (I don’t think so, but just to double check)? https://docs.flyte.org/en/latest/deployment/plugins/k8s/index.html#install-the-kubernetes-operator
2. Are the following configurations still needed? https://docs.flyte.org/en/latest/deployment/plugins/k8s/index.html#specify-plugin-configuration Specifically, this:
cluster_resource_manager:    <- Is this needed?
  enabled: true
  config:
    cluster_resources:
      refreshInterval: 5m
      templatePath: "/etc/flyte/clusterresource/templates"
      customData:
        - production:
            - projectQuotaCpu:
                value: "5"
            - projectQuotaMemory:
                value: "4000Mi"
        - staging:
            - projectQuotaCpu:
                value: "2"
            - projectQuotaMemory:
                value: "3000Mi"
        - development:
            - projectQuotaCpu:
                value: "4"
            - projectQuotaMemory:
                value: "3000Mi"
      refresh: 5m

  # -- Resource templates that should be applied
  templates:
    # -- Template for namespaces resources
    - key: aa_namespace
      value: |
        apiVersion: v1
        kind: Namespace
        metadata:
          name: {{ namespace }}
        spec:
          finalizers:
          - kubernetes

    - key: ab_project_resource_quota
      value: |
        apiVersion: v1
        kind: ResourceQuota
        metadata:
          name: project-quota
          namespace: {{ namespace }}
        spec:
          hard:
            limits.cpu: {{ projectQuotaCpu }}
            limits.memory: {{ projectQuotaMemory }}

    - key: ac_spark_role     <- Is this needed?
      value: |
        apiVersion: rbac.authorization.k8s.io/v1beta1
        kind: Role
        metadata:
          name: spark-role
          namespace: {{ namespace }}
        rules:
        - apiGroups: ["*"]
          resources:
          - pods
          verbs:
          - '*'
        - apiGroups: ["*"]
          resources:
          - services
          verbs:
          - '*'
        - apiGroups: ["*"]
          resources:
          - configmaps
          verbs:
          - '*'

    - key: ad_spark_service_account     <- Is this needed?
      value: |
        apiVersion: v1
        kind: ServiceAccount
        metadata:
          name: spark
          namespace: {{ namespace }}

    - key: ae_spark_role_binding     <- Is this needed?
      value: |
        apiVersion: rbac.authorization.k8s.io/v1beta1
        kind: RoleBinding
        metadata:
          name: spark-role-binding
          namespace: {{ namespace }}
        roleRef:
          apiGroup: rbac.authorization.k8s.io
          kind: Role
          name: spark-role
        subjects:
        - kind: ServiceAccount
          name: spark
          namespace: {{ namespace }}

sparkoperator:   <- Is this needed?
  enabled: true
  plugin_config:
    plugins:
      spark:
        # Edit the Spark configuration as you see fit
        spark-config-default:
          - spark.driver.cores: "1"
          - spark.hadoop.fs.s3a.aws.credentials.provider: "com.amazonaws.auth.DefaultAWSCredentialsProviderChain"
          - spark.kubernetes.allocation.batch.size: "50"
          - spark.hadoop.fs.s3a.acl.default: "BucketOwnerFullControl"
          - spark.hadoop.fs.s3n.impl: "org.apache.hadoop.fs.s3a.S3AFileSystem"
          - spark.hadoop.fs.AbstractFileSystem.s3n.impl: "org.apache.hadoop.fs.s3a.S3A"
          - spark.hadoop.fs.s3.impl: "org.apache.hadoop.fs.s3a.S3AFileSystem"
          - spark.hadoop.fs.AbstractFileSystem.s3.impl: "org.apache.hadoop.fs.s3a.S3A"
          - spark.hadoop.fs.s3a.impl: "org.apache.hadoop.fs.s3a.S3AFileSystem"
          - spark.hadoop.fs.AbstractFileSystem.s3a.impl: "org.apache.hadoop.fs.s3a.S3A"
          - spark.network.timeout: 600s
          - spark.executorEnv.KUBERNETES_REQUEST_TIMEOUT: 100000
          - spark.executor.heartbeatInterval: 60s
Also, what does the equivalent databricks plugin configuration look like (in the helm chart)? Could you point me to an example?
3. How is the databricks spark job logging integrated with the Flyte UI?
4. When running a flyte spark task on the server, is the --service-account spark option still required?
Hi @L godlike, @Kevin Su, Is the agent version of databricks available now?
k
yes, we just merged the databricks agent PR.
flytekit also supports submitting a databricks job in local execution
so you can easily test it in the local execution
f
Great news. @Kevin Su, which one should I use, agent or plugin?
k
Agent. we will deprecate the backend plugins. agents are well maintained right now.
check out the example in the PR description. https://github.com/flyteorg/flytekit/pull/1951
f
Nice. @Kevin Su, could you point me to the doc to install databricks through agent in the k8s backend?
k
yes, are you deploying flyte on EKS?
f
yes
through helm charts
k
okok
1. Enable agent here
2. Update plugin config. check out here
3. Update the agent secret
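For readers following along, the three steps might look roughly like this in flyte-core helm values. This is only a sketch, not a verified configuration: the key names (`flyteagent.enabled`, `agent-service`, `defaultAgent.endpoint`), the endpoint address, and the `spark: agent-service` mapping are assumptions that may differ between chart and Flyte versions, so check them against the docs for the release you deploy.

```yaml
# Step 1: deploy the agent service alongside Flyte (assumed chart key)
flyteagent:
  enabled: true
  # Step 2a: tell propeller where the agent lives (assumed endpoint address)
  plugin_config:
    plugins:
      agent-service:
        defaultAgent:
          endpoint: "dns:///flyteagent.flyte.svc.cluster.local:8000"
          insecure: true

# Step 2b: enable the agent plugin and route Spark tasks to it
configmap:
  enabled_plugins:
    tasks:
      task-plugins:
        enabled-plugins:
          - container
          - sidecar
          - k8s-array
          - agent-service
        default-for-task-types:
          container: container
          sidecar: sidecar
          container_array: k8s-array
          spark: agent-service   # assumed task-type mapping

# Step 3: the Databricks access token is stored in the agent's Kubernetes
# secret rather than in these values; the secret name and key depend on
# your deployment and are not shown here.
```

If this shape matches your chart version, the `sparkoperator` block and the Spark role, service account, and role binding templates from the earlier config would presumably no longer be exercised, since the job runs on Databricks rather than in the cluster; that is an inference, not something confirmed in this thread.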
f
Thanks
k