# ask-the-community
s
Team, I am running the below simple Snowflake query, however it takes more than an hour to execute. I could see Attempt 01 queued for a long time before it executed.
from flytekit import kwtypes, workflow
from flytekitplugins.snowflake import SnowflakeConfig, SnowflakeTask


snowflake_task_no_io = SnowflakeTask(
    name="sql.snowflake.no_io",
    inputs={},
    query_template="select * from TEST_DEV.IDENTITY.TEST_123;",
    output_schema_type=None,
    task_config=SnowflakeConfig(
        account="xxxxx.us-east-1",
        database="TEST_DEV",
        schema="IDENTITY",
        warehouse="DEMO_WH",
    ),
)


@workflow
def no_io_wf():
    return snowflake_task_no_io()
s
Hi Satish, thanks for your question. Could you run the following command to check whether your secrets have been successfully stored in the pod?
kubectl edit cm flyte-propeller-config
cc @Kevin Su
s
This is what I could see.
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
data:
  admin.yaml: |
    admin:
      clientId: 'flytepropeller'
      clientSecretLocation: /etc/secrets/client_secret
      endpoint: flyteadmin:81
      insecure: true
    event:
      capacity: 1000
      rate: 500
      type: admin
  cache.yaml: |
    cache:
      max_size_mbs: 1024
      target_gc_percent: 70
  catalog.yaml: |
    catalog-cache:
      endpoint: datacatalog:89
      insecure: true
      type: datacatalog
  copilot.yaml: |
    plugins:
      k8s:
        co-pilot:
          image: cr.flyte.org/flyteorg/flytecopilot:v0.0.24
          name: flyte-copilot-
          start-timeout: 30s
  core.yaml: |
    manager:
      pod-application: flytepropeller
      pod-template-container-name: flytepropeller
      pod-template-name: flytepropeller-template
    propeller:
      downstream-eval-duration: 30s
      enable-admin-launcher: true
      gc-interval: 12h
      kube-client-config:
        burst: 25
        qps: 100
        timeout: 30s
      leader-election:
        enabled: true
        lease-duration: 15s
        lock-config-map:
          name: propeller-leader
          namespace: flyte
        renew-deadline: 10s
        retry-period: 2s
      limit-namespace: all
      max-workflow-retries: 50
      metadata-prefix: metadata/propeller
      metrics-prefix: flyte
      prof-port: 10254
      queue:
        batch-size: -1
        batching-interval: 2s
        queue:
          base-delay: 5s
          capacity: 1000
          max-delay: 120s
          rate: 100
          type: maxof
        sub-queue:
          capacity: 1000
          rate: 100
          type: bucket
        type: batch
      rawoutput-prefix: s3://xxxxxxxxxx/
      workers: 40
      workflow-reeval-duration: 30s
    webhook:
      certDir: /etc/webhook/certs
      serviceName: flyte-pod-webhook
  enabled_plugins.yaml: |
    tasks:
      task-plugins:
        default-for-task-types:
          container: container
          container_array: k8s-array
          sidecar: sidecar
          spark: spark
        enabled-plugins:
        - container
        - sidecar
        - k8s-array
        - snowflake
        - spark
  k8s.yaml: |
    plugins:
      k8s:
        default-cpus: 100m
        default-env-vars: []
        default-memory: 100Mi
  resource_manager.yaml: |
    propeller:
      resourcemanager:
        type: noop
  spark.yaml: |
    plugins:
      spark:
        logs:
          all-user:
            cloudwatch-enabled: true
            cloudwatch-template-uri: https://console.aws.amazon.com/cloudwatch/home?region=us-east-1#logStream:group=/aws/vpcflowLogs/dev-max-ml-flyte;prefix=test;streamFilter=typeLogStreamPrefix
          mixed:
            cloudwatch-enabled: true
            cloudwatch-template-uri: https://console.aws.amazon.com/cloudwatch/home?region=us-east-1#logStream:group=/aws/vpcflowLogs/dev-max-ml-flyte;prefix=test;streamFilter=typeLogStreamPrefix
          system:
            cloudwatch-enabled: true
            cloudwatch-template-uri: https://console.aws.amazon.com/cloudwatch/home?region=us-east-1#logStream:group=/aws/vpcflowLogs/dev-max-ml-flyte;prefix=test;streamFilter=typeLogStreamPrefix
        spark-config-default:
        - spark.hadoop.fs.s3a.aws.credentials.provider: com.amazonaws.auth.DefaultAWSCredentialsProviderChain
        - spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version: "2"
        - spark.kubernetes.allocation.batch.size: "50"
        - spark.hadoop.fs.s3a.acl.default: BucketOwnerFullControl
        - spark.hadoop.fs.s3n.impl: org.apache.hadoop.fs.s3a.S3AFileSystem
        - spark.hadoop.fs.AbstractFileSystem.s3n.impl: org.apache.hadoop.fs.s3a.S3A
        - spark.hadoop.fs.s3.impl: org.apache.hadoop.fs.s3a.S3AFileSystem
        - spark.hadoop.fs.AbstractFileSystem.s3.impl: org.apache.hadoop.fs.s3a.S3A
        - spark.hadoop.fs.s3a.impl: org.apache.hadoop.fs.s3a.S3AFileSystem
        - spark.hadoop.fs.AbstractFileSystem.s3a.impl: org.apache.hadoop.fs.s3a.S3A
        - spark.hadoop.fs.s3a.multipart.threshold: "536870912"
        - spark.blacklist.enabled: "true"
        - spark.blacklist.timeout: 5m
        - spark.task.maxfailures: "8"
  storage.yaml: |
    storage:
      type: s3
      container: "xxxxxxxxxx"
      connection:
        auth-type: iam
        region: us-east-1
      enable-multicontainer: false
      limits:
        maxDownloadMBs: 10
  task_logs.yaml: |
    plugins:
      logs:
        cloudwatch-enabled: true
        cloudwatch-log-group: /aws/vpcflowLogs/dev-max-ml-flyte
        cloudwatch-region: us-east-1
        kubernetes-enabled: true
        templates:
        - displayName: logs_test
          templateUris:
          - https://console.aws.amazon.com/cloudwatch/home?region=us-east-1#logEventViewer:group=/aws/vpcflowLogs/dev-max-ml-flyte;stream=var.log.containers.__-.log
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: max-ml-flyte
    meta.helm.sh/release-namespace: flyte
  creationTimestamp: "2022-09-14T11:33:04Z"
  labels:
    app.kubernetes.io/instance: max-ml-flyte
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: flyteadmin
    helm.sh/chart: flyte-core-v0.1.10
  name: flyte-propeller-config
  namespace: flyte
  resourceVersion: "57800317"
  uid: xxxxxxxxxxxxxx
k
The secrets should not be in the config map; they should be k8s secrets.
s
@Ketan (kumare3) Yes, I followed the documentation mentioned in the above link.
k
@Sathish kumar Venkatesan let me look; I think the name of the secret might be a problem. cc @Kevin Su
@Sathish kumar Venkatesan can you share your flytepropeller pod YAML?
I want to make sure you mounted the secrets correctly.
k
@Sathish kumar Venkatesan your flytepropeller pod YAML should look like the following:
...
        volumeMounts:
        - mountPath: /etc/flyte/config
          name: config-volume
        - mountPath: /etc/secrets/
          name: snowflake-auth
...
      volumes:
      - configMap:
          defaultMode: 420
          name: flyte-propeller-config
        name: config-volume
      - name: snowflake-auth
        secret:
          defaultMode: 420
          secretName: mysecret
s
@Kevin Su sorry for the delay in my response. We are using the Helm chart shared by Flyte: https://github.com/flyteorg/flyte/tree/master/charts/flyte-core
I am able to mount the secret file after changing flyte-secret-auth to flyte-propeller-auth in the propeller deployment, but this time I'm getting a different error while executing the Snowflake query.
{"json":{"exec_id":"f3f19081e615545ee8bc","ns":"cloudops-max-flyte-demo-development","res_ver":"67677691","routine":"worker-22","wf":"cloudops-max-flyte-demo:development:flyte.workflows.sf_execution_try_v1.full_snowflake_wf"},"level":"error","msg":"Error when trying to reconcile workflow. Error [failed at Node[n0]. RuntimeExecutionError: failed during plugin execution, caused by: failed to execute handle for plugin [snowflake]: [SystemError] unknown execution phase [401].]. Error Type[*errors.NodeErrorWithCause]","ts":"2022-09-26T07:15:38Z"}
E0926 07:15:38.997857       1 workers.go:102] error syncing 'cloudops-max-flyte-demo-development/f3f19081e615545ee8bc': failed at Node[n0]. RuntimeExecutionError: failed during plugin execution, caused by: failed to execute handle for plugin [snowflake]: [SystemError] unknown execution phase [401]
/ $ ls -lrt /etc/secrets/*
lrwxrwxrwx 1 root nobody 20 Sep 26 06:43 /etc/secrets/client_secret -> ..data/client_secret
lrwxrwxrwx 1 root nobody 35 Sep 26 06:43 /etc/secrets/FLYTE_SNOWFLAKE_CLIENT_TOKEN -> ..data/FLYTE_SNOWFLAKE_CLIENT_TOKEN
k
It looks like an access issue. Perhaps the token expired? Try regenerating the JWT token and updating the secret again.
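Before regenerating anything, it can help to confirm whether the mounted token has actually expired. The sketch below (hypothetical helper names, stdlib only) decodes the JWT payload without verifying the signature, which is enough to read the `exp` claim:

```python
import base64
import json
import time


def jwt_expiry(token: str) -> int:
    """Return the 'exp' claim of a JWT. No signature verification is done."""
    payload_b64 = token.split(".")[1]
    # Restore the base64 padding that JWTs strip off.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return payload["exp"]


def token_expired(token: str) -> bool:
    """True if the token's expiry is in the past."""
    return jwt_expiry(token) < time.time()
```

You could run this against the token read from `/etc/secrets/FLYTE_SNOWFLAKE_CLIENT_TOKEN` inside the propeller pod.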
s
let me check on that
@Kevin Su tried with a new token, but now I am getting 404 😞
{"json":{"exec_id":"f2ac69ce6a7204aedbba","ns":"cloudops-max-flyte-demo-development","res_ver":"67767991","routine":"worker-37","wf":"cloudops-max-flyte-demo:development:flyte.workflows.sf_execution_try_v1.full_snowflake_wf"},"level":"error","msg":"Error when trying to reconcile workflow. Error [failed at Node[n0]. RuntimeExecutionError: failed during plugin execution, caused by: failed to execute handle for plugin [snowflake]: [SystemError] unknown execution phase [404].]. Error Type[*errors.NodeErrorWithCause]","ts":"2022-09-26T09:58:54Z"}
E0926 09:58:54.581265       1 workers.go:102] error syncing 'cloudops-max-flyte-demo-development/f2ac69ce6a7204aedbba': failed at Node[n0]. RuntimeExecutionError: failed during plugin execution, caused by: failed to execute handle for plugin [snowflake]: [SystemError] unknown execution phase [404]
k
@Sathish kumar Venkatesan I just updated the endpoint to v2 in propeller and built a new image. Mind helping me test it by updating the propeller image?
kubectl set image deployment flytepropeller flytepropeller=pingsutw/flytepropeller:2064b6a03361e5b3eb7d5a66340368fd13bde023 -n flyte
s
@Kevin Su thanks, Kevin. I will try and let you know.
I could still see the 404 after updating the image.
k
Oh, sorry. Maybe there are some other issues in the Snowflake plugin. Let me test it in my dev cluster, then I'll get back to you.
k
Hmm, this should break our end-to-end tests. cc @Eduardo Apolinario (eapolinario)
s
@Kevin Su would you be able to share the Dockerfile you used to build flytepropeller?
s
Any recommendation for the Snowflake 404 error code?
k
There is another endpoint I forgot to update to v2. I just updated all the endpoints to v2, and now it's working for me. Feel free to try this image: pingsutw/flytepropeller:fcfce5e490382297e88e8f8c5ca02084acf6a4d8
s
@Kevin Su no luck for me. This time I am getting a 400 error code.
E0928 11:34:01.227719       1 workers.go:102] error syncing 'cloudops-max-flyte-demo-development/fa396b5b859f64dc2a26': failed at Node[n0]. RuntimeExecutionError: failed during plugin execution, caused by: failed to execute handle for plugin [snowflake]: [SystemError] unknown execution phase [400]
k
Could you send a POST request to the Snowflake server? Just want to make sure your JWT token is valid.
curl -i -X POST \
    -H "Authorization: Bearer ${JWT_TOKEN}" \
    -H "Content-Type: application/json" \
    -H "Accept: application/json" \
    -H "X-Snowflake-Authorization-Token-Type: KEYPAIR_JWT" \
    -d "@request.json" \
    "https://<account identifier>.<http://snowflakecomputing.com/api/v2/statements?async=true|snowflakecomputing.com/api/v2/statements?async=true>"
request.json
{
  "statement": "SELECT * from CUSTOMER where C_NATIONKEY = 4 limit 100",
  "timeout": 60,
  "schema": "TPCH_SF1000",
  "database": "SNOWFLAKE_SAMPLE_DATA",
  "warehouse": "COMPUTE_WH"
}
The Snowflake plugin uses the same endpoint, so it should work if all the config is correct.
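The same check can be scripted in Python. This is a stdlib-only sketch of the curl call above; the account identifier, token, and helper name are placeholders, not part of the plugin:

```python
import json
import urllib.request


def build_statement_request(account: str, jwt_token: str) -> urllib.request.Request:
    """Build the same POST the curl example sends to the Snowflake SQL API v2."""
    body = {
        "statement": "SELECT * from CUSTOMER where C_NATIONKEY = 4 limit 100",
        "timeout": 60,
        "schema": "TPCH_SF1000",
        "database": "SNOWFLAKE_SAMPLE_DATA",
        "warehouse": "COMPUTE_WH",
    }
    return urllib.request.Request(
        f"https://{account}.snowflakecomputing.com/api/v2/statements?async=true",
        data=json.dumps(body).encode(),
        headers={
            "Authorization": f"Bearer {jwt_token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
            "X-Snowflake-Authorization-Token-Type": "KEYPAIR_JWT",
        },
        method="POST",
    )


# To actually send it:
# with urllib.request.urlopen(build_statement_request("myaccount", token)) as resp:
#     print(resp.status, resp.read())
```

A 200/202 response here with the same token would confirm the credentials, isolating the problem to the plugin configuration.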
s
@Kevin Su thanks, Kevin, I'll try. Can you clarify: are there any benefits of going with the Flyte Snowflake plugin (written in Python + Go) instead of executing the Snowflake query directly using the Python Snowflake connector module?
k
So backend plugins can decouple from user code: you can rev the version, update code, etc. without updating user libraries. Also, backend plugins do not run in a pod; they run in the engine, which is very efficient. Lastly, backend plugins can recover from pod failures, as their state is stored independently. That being said, you should use Python plugins to write new plugins quickly, and use backend plugins when efficiency etc. is a bottleneck. We are working to make it easier to write backend plugins, potentially in other languages. @Sathish kumar Venkatesan
s
@Kevin Su I suspect the 400 issue was due to the triple double quotes I added in query_template, like """ sql query """, to escape special characters. After removing them it is working fine; the job succeeded.
@Kevin Su are there any best practices for including a multi-line Snowflake statement in query_template while escaping special characters?
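One workaround (a sketch, not an official recommendation, and `flatten_query` is a hypothetical helper) is to keep the statement in an ordinary triple-quoted string and normalize it to a single line before handing it to query_template, so no raw newlines or stray quote characters reach the plugin:

```python
import textwrap


def flatten_query(sql: str) -> str:
    """Collapse a multi-line SQL statement into one whitespace-normalized line."""
    return " ".join(textwrap.dedent(sql).split())


QUERY = flatten_query(
    """
    select *
    from TEST_DEV.IDENTITY.TEST_123
    where LOAD_DATE = current_date
    """
)
# QUERY == "select * from TEST_DEV.IDENTITY.TEST_123 where LOAD_DATE = current_date"
```

The flattened string can then be passed as `query_template=QUERY` in the SnowflakeTask definition.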
k
@Sathish kumar Venkatesan sorry, Kevin has been out. He will be back and should be able to reply.
k
@Sathish kumar Venkatesan sorry, it seems like the Snowflake plugin doesn't handle those special characters. Trying to fix it.
@Sathish kumar Venkatesan I just fixed it in this PR. Feel free to try it.
pip install git+https://github.com/flyteorg/flytekit@snowflake-bug